
I have a list of lists like this:

[[12411.0, 31937.0, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.15, 0.1, 0.15, 0.2, 0.1, 0.15, 0.15, 0.15, 0.15], [12411.0, 31937.0, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1], etc.]

If the first and second elements of an inner list are the same as the first and second elements of another inner list (as in the example above), I want to write a function that adds the remaining values element by element and merges them into one list. The example output would look like this:

[12411.0, 31937, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.25, 0.2, 0.25, 0.3, 0.2, 0.25, 0.25, 0.25, 0.25]

I'm having trouble figuring out how to tell Python to compare the first two elements of the inner lists before merging them. Here is my best attempt so far:

def group(A):
    for i in range(len(A)):
        for j in range(len(A[i])):
            if A[i][0:1] == A[i: ][0:1]:
                return [A[i][0], A[i][1], sum(A[i][j+2], A[i: ][j+2])]

I get an index error, I believe because of the A[i: ] and A[i: ][j+2] parts of the code. I don't know how to phrase it in Python so that the function adds up any other inner lists that meet the criteria.

3
  • First, your slicing is off by one. A[i][0:1] returns only the first element of the list A[i]. In general, [i:j] gives you elements i through j-1. Commented Jul 18, 2014 at 1:26
  • Second, it's not clear what you're after. Will the number of internal lists always be 2? How many, and which, internal lists have to match in their first two elements for the sum to proceed? Commented Jul 18, 2014 at 1:27
  • What is the expected behavior if the first two elements of the inner lists are not the same? Commented Jul 18, 2014 at 1:36
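
The slicing point raised in the first comment can be checked directly. The following is only a small sketch on a made-up two-row list (shortened from the question's data, so the values are an assumption); it shows why A[i][0:1] compares just one element and why A[i:] yields rows rather than numbers:

```python
# Hypothetical two-row input, shortened from the question's data.
A = [[12411.0, 31937.0, 0.1, 0.15],
     [12411.0, 31937.0, 0.1, 0.1]]

# The stop index of a slice is exclusive, so [0:1] holds one element only.
first_only = A[0][0:1]

# [0:2] holds both leading elements, which is what the comparison needs.
key = A[0][0:2]

# A[i:] is a list of rows, so A[0:][0] is a whole row, not a number.
rows_from_zero = A[0:]

# Comparing the two-element keys of two different rows.
same_key = A[0][0:2] == A[1][0:2]
```

With this input, the key comparison on the last line succeeds because both rows start with the same two values.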

4 Answers

3

Here's a function that will merge all sublists where the first two entries match. It also handles cases where the sub-lists are not the same length:

from itertools import izip_longest

l = [[1,3,4,5,6], [1,3,2,2,2], [2,3,5,6,6], [1,1,1,1,1], [1,1,2,2,2], [1,3,6,2,1,1,2]]
l2 = [[12411.0, 31937.0, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.15, 0.1, 0.15, 0.2, 0.1,  0.15, 0.15, 0.15, 0.15], [12411.0, 31937.0, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]]

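def merge(l):
    d = {}  # maps (first, second) -> running merged sub-list
    for ent in l:
        key = tuple(ent[0:2])
        merged = d.get(key, None)
        if merged is None:
            # First time we see this key: store a copy so the input isn't mutated
            d[key] = list(ent)
        else:
            # Add element-wise from index 2 on, padding the shorter list with 0
            merged[2:] = [a+b for a,b in izip_longest(merged[2:], ent[2:], fillvalue=0)]
    return d.values()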

print merge(l)
print merge(l2)

Output:

[[1, 3, 12, 9, 9, 1, 2], [2, 3, 5, 6, 6], [1, 1, 3, 3, 3]]
[[12411.0, 31937.0, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.25, 0.2, 0.25, 0.30000000000000004, 0.2, 0.25, 0.25, 0.25, 0.25]]

It's implemented by maintaining a dict where the keys are the first two entries of a sub-list (stored as a tuple). As we iterate over the sublists, we check to see if there's an entry in the dict for that key. If there isn't, we store the current sublist in the dict. If there already is an entry, we add up the values from index 2 onward, element by element, and update the dict. Once we're done iterating, we just return all the values from the dict.
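
For readers on Python 3, the same dict-based idea might look roughly like the sketch below. It assumes `itertools.zip_longest` (the Python 3 name for `izip_longest`); the function name `merge_py3` and the variable names are made up for illustration and are not part of the answer above.

```python
from itertools import zip_longest  # Python 3 name for izip_longest


def merge_py3(rows):
    """Sum rows that share their first two entries (hypothetical helper)."""
    merged = {}
    for row in rows:
        key = tuple(row[:2])
        if key not in merged:
            # Copy the row so the caller's data is left untouched.
            merged[key] = list(row)
        else:
            acc = merged[key]
            # Pad the shorter tail with 0 so unequal lengths still add up.
            acc[2:] = [a + b for a, b in zip_longest(acc[2:], row[2:], fillvalue=0)]
    # Dicts keep insertion order in Python 3.7+, so keys appear as first seen.
    return list(merged.values())


rows = [[1, 3, 4, 5, 6], [1, 3, 2, 2, 2], [2, 3, 5, 6, 6],
        [1, 1, 1, 1, 1], [1, 1, 2, 2, 2], [1, 3, 6, 2, 1, 1, 2]]
result = merge_py3(rows)
```

Using a tuple as the key is what makes the lookup work, since lists are not hashable and cannot be dict keys.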


3

This is one way to do it:

>>> a_list = [[12411.0, 31937.0, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.15, 0.1, 0.15, 0.2, 0.1, 0.15, 0.15, 0.15, 0.15], [12411.0, 31937.0, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]]
>>> result = [a + b for a, b in zip(*a_list)]
>>> result[:2] = a_list[0][:2]
>>> result
[12411.0, 31937.0, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.25, 0.2, 0.25, 0.30000000000000004, 0.2, 0.25, 0.25, 0.25, 0.25]

This works by blindly adding up corresponding elements of the two sub-lists (the a, b unpacking assumes exactly two) by doing:

[a + b for a, b in zip(*a_list)]

And then overwriting the first two elements of the result, which according to the question should not change, by doing:

result[:2] = a_list[0][:2]

It is not evident from your question what the behavior should be if the first two elements of the sub-lists do not match. The following snippet will help you check whether they match. Let's assume a_list contains sub-lists whose first two elements do not match:

>>> a_list = [[12411.0, 31937.0, 0.1, 0.1], [12411.3, 31937.0, 0.1, 0.1]]

then, this condition:

all([True if list(a)[1:] == list(a)[:-1] else False for a in list(zip(*a_list))[:2]])

will evaluate to False, and to True when the first two elements of all the sub-lists match. The code extracts the first elements and the second elements of all the sub-lists and then checks whether the values in each group are equal.

You can include the above check in your code and modify your code accordingly for the expected behavior.
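
The same check can be phrased a little more directly by comparing every sub-list's leading slice with that of the first sub-list. This is just one possible sketch; the helper name `keys_match` and its `key_len` parameter are invented here for illustration:

```python
def keys_match(rows, key_len=2):
    """Return True when every row starts with the same key_len values."""
    if not rows:
        # Nothing to compare, so treat an empty input as matching.
        return True
    first_key = rows[0][:key_len]
    return all(row[:key_len] == first_key for row in rows)


matching = [[12411.0, 31937.0, 0.1, 0.1], [12411.0, 31937.0, 0.1, 0.15]]
not_matching = [[12411.0, 31937.0, 0.1, 0.1], [12411.3, 31937.0, 0.1, 0.1]]

matching_ok = keys_match(matching)
not_matching_ok = keys_match(not_matching)
```

Passing a generator expression to `all` lets it stop at the first mismatch instead of building the whole list of booleans first.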

To sum it up:

a_list = [[12411.0, 31937.0, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.15, 0.1, 0.15, 0.2, 0.1, 0.15, 0.15, 0.15, 0.15], [12411.0, 31937.0, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]]
check = all([True if list(a)[1:] == list(a)[:-1] else False for a in list(zip(*a_list))[:2]])
result = []
if check:
    result = [a + b for a, b in zip(*a_list)]
    result[:2] = a_list[0][:2]
else:
    pass  # whatever the behavior should be.

1 Comment

The OP is saying he wants to add up the sublists which have matching entries in index 0 and 1. Meaning there could be sublists which do not match. He just didn't provide an example list without matches.
1

This is a function that will take a list of lists A and check internal list i and j using your criteria. It will then either return the summed list you want or None if the first two elements don't match.

def check_internal_ij(A,i,j):
    """ checks internal list i against internal list j """ 
    if A[i][0:2] == A[j][0:2]:
        new = [x+y for x,y in zip( A[i], A[j] )]
        new[0:2] = A[i][0:2]
        return new
    else:
        return None

Then you can run the function over all combinations of internal lists you want to check.
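
One way to run it over every pair, sketched with `itertools.combinations`. The wrapper `sum_matching_pairs` and the integer sample data are assumptions made for this example. Note that it only sums lists pairwise, so three or more matching lists would produce several pair sums rather than one total.

```python
from itertools import combinations


def check_internal_ij(A, i, j):
    """Checks internal list i against internal list j (as defined above)."""
    if A[i][0:2] == A[j][0:2]:
        new = [x + y for x, y in zip(A[i], A[j])]
        new[0:2] = A[i][0:2]
        return new
    return None


def sum_matching_pairs(A):
    """Hypothetical wrapper: map each matching index pair to its summed list."""
    results = {}
    for i, j in combinations(range(len(A)), 2):
        summed = check_internal_ij(A, i, j)
        if summed is not None:
            results[(i, j)] = summed
    return results


A = [[1, 3, 4, 5], [1, 3, 2, 2], [2, 3, 5, 6]]
pairs = sum_matching_pairs(A)
```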


1

If you are fond of itertools, then with a little effort this can be solved by combining groupby, islice, izip, imap and chain.

And of course you should also remember to use operator.itemgetter.

Implementation

from itertools import groupby, islice, izip, imap, chain
from operator import itemgetter

# Create a group of lists where the key (the first two elements of the lists) matches
groups = groupby(sorted(l, key = itemgetter(0, 1)), key = itemgetter(0, 1))
# zip the lists and then chop off the first two elements. Sum the elements of the resultant list
# Remember to add the newly accumulated list with the first two elements
groups_sum = ([k, imap(sum, islice(izip(*g), 2, None))] for k, g in groups )
# Reformat the final list to match the output format
[list(chain.from_iterable(elem)) for elem in groups_sum]

Implementation (if you are a fan of one-liners)

[list(chain.from_iterable([k, imap(sum, islice(izip(*g), 2, None))]))
  for k, g in groupby(sorted(l, key = itemgetter(0, 1)), key = itemgetter(0, 1))]

Sample Input

l = [[10,20,0.1,0.2,0.3,0.4],
     [11,22,0.1,0.2,0.3,0.4],
     [10,20,0.1,0.2,0.3,0.4],
     [11,22,0.1,0.2,0.3,0.4],
     [20,30,0.1,0.2,0.3,0.4],
     [10,20,0.1,0.2,0.3,0.4]]

Sample Output

[[10, 20, 0.3, 0.6, 0.9, 1.2],
 [11, 22, 0.2, 0.4, 0.6, 0.8],
 [20, 30, 0.1, 0.2, 0.3, 0.4]]

Dissection

groups = groupby(sorted(l, key = itemgetter(0, 1)), key = itemgetter(0, 1))
# After grouping, lists with the same key get clustered together
[((10, 20),
  [[10, 20, 0.1, 0.2, 0.3, 0.4],
   [10, 20, 0.1, 0.2, 0.3, 0.4],
   [10, 20, 0.1, 0.2, 0.3, 0.4]]),
 ((11, 22), [[11, 22, 0.1, 0.2, 0.3, 0.4], [11, 22, 0.1, 0.2, 0.3, 0.4]]),
 ((20, 30), [[20, 30, 0.1, 0.2, 0.3, 0.4]])]

groups_sum = ([k, imap(sum, islice(izip(*g), 2, None))] for k, g in groups )
# Each group is accumulated from the third element (index 2) onwards
[[(10, 20), [0.3, 0.6, 0.9, 1.2]],
 [(11, 22), [0.2, 0.4, 0.6, 0.8]],
 [(20, 30), [0.1, 0.2, 0.3, 0.4]]]

[list(chain.from_iterable(elem)) for elem in groups_sum]
# Now it's just a matter of representing the result in the output format
[[10, 20, 0.3, 0.6, 0.9, 1.2],
 [11, 22, 0.2, 0.4, 0.6, 0.8],
 [20, 30, 0.1, 0.2, 0.3, 0.4]]
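
On Python 3, where `izip` and `imap` no longer exist and the built-in `zip` is already lazy, the same sort-then-group idea could be sketched as follows. The function name `merge_groups` and the integer sample rows are assumptions for this example, chosen to avoid floating-point rounding noise in the sums.

```python
from itertools import groupby
from operator import itemgetter


def merge_groups(rows):
    """Hypothetical helper: sum rows that share their first two elements."""
    key = itemgetter(0, 1)
    merged = []
    # groupby only clusters adjacent items, so the rows must be sorted first.
    for k, group in groupby(sorted(rows, key=key), key=key):
        # Transpose the group into columns and drop the two key columns.
        columns = list(zip(*group))[2:]
        merged.append(list(k) + [sum(col) for col in columns])
    return merged


rows = [[10, 20, 1, 2],
        [11, 22, 1, 2],
        [10, 20, 1, 2],
        [20, 30, 1, 2]]
result = merge_groups(rows)
```

Unlike the dict-based approach, this returns the groups in sorted key order rather than in order of first appearance, and it assumes all rows in a group have the same length because `zip` truncates to the shortest.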
