1

I have the following kind of list:

myList = [[500, 5], [500, 10], [500, 3], [504, 9], [505, 10], [505, 20]]

I don't want entries that share the same first element. If two or more elements have the same first value, I want to sum their second values and remove the duplicates, so in my example the new output would be:

myList = [[500, 18], [504, 9], [505, 30]]

How can I do this? I was thinking of using lambda functions, but I don't know how to write the function. The other solutions I have in mind require a large number of for loops, so I was wondering whether there is an easier way. Any kind of help is appreciated!

2
  • And what have you tried? This is fairly simple with a few loops. Commented Jun 7, 2020 at 13:39
  • Yes, I can do it with for loops; I just wanted to see if there was a shorter solution. Commented Jun 7, 2020 at 13:46

5 Answers

5

Use a defaultdict:

import collections

# by default, non-existing keys will be initialized to zero
myDict = collections.defaultdict(int)

for key, value in myList:
    myDict[key] += value

# transform back to list of lists
myResult = sorted(list(kv) for kv in myDict.items())

7 Comments

The normal way to use defaultdict is to pass a type, so defaultdict(int) would be the same as your lambda.
Yes, this is very clear! Thanks a lot, I will accept in 7 minutes!
@JanChristophTerasa you also don't really need the list comp - can just pass a gen-exp to sorted instead, eg: sorted(list(kv) for kv in myDict.items())
To be pedantic, new keys aren't initialized to zero in advance; rather, when a key that does not yet exist is accessed, its value gets initialized to zero at that moment.
@SethMMorton That's correct, I just wanted to have a short comment there. When in doubt, read the docs.
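
To illustrate the point made in the comments above, here is a minimal sketch of when a defaultdict actually creates an entry. The name `counts` is only an illustration and does not come from the answer.

```python
from collections import defaultdict

counts = defaultdict(int)

# Nothing is stored until a key is actually accessed.
assert len(counts) == 0

# Merely reading a missing key inserts it with the default value int() == 0.
value = counts[500]
assert value == 0
assert 500 in counts

# A membership test or .get() does not trigger the default factory.
assert 504 not in counts
assert counts.get(504) is None
assert len(counts) == 1
```

This is why `myDict[key] += value` works in the answer: the lookup on the right-hand side creates the entry with 0 before the addition happens.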
3

Using the pandas library:

import pandas as pd

[[k, v] for k, v in pd.DataFrame(myList).groupby(0)[1].sum().items()]

Breaking it down:

  • pd.DataFrame(myList) creates a DataFrame where each row is one of the short lists in myList:

        0   1
    0   500 5
    1   500 10
    2   500 3
    3   504 9
    4   505 10
    5   505 20
    
  • (...).groupby(0)[1].sum() groups by the first column, takes the values from the second one (to create a series instead of a dataframe) and sums them

  • [[k, v] for k, v in (...).items()] is a simple list comprehension (treating the series as a dictionary) that turns the result back into a list like you wanted.

Output:

[[500, 18], [504, 9], [505, 30]]

The list comprehension can be made even shorter by converting each pair from .items() to a list:

list(map(list, pd.DataFrame(myList).groupby(0)[1].sum().items()))

2 Comments

pandas is a bit overkill unless user already knows pandas (multiple) APIs
OP already said they know how to do it with for-loops and want to learn easier ways. Well, learning to work with collections or pandas makes such questions easier. The way that takes the least amount of learning is with for-loops - which OP already knows...
1

An easier-to-read implementation (less Pythonic, though :-) )

myList = [[500, 5], [500, 10], [500, 3], [504, 9], [505, 10], [505, 20]]


sums = dict()
for a,b in myList:
    if a in sums:
        sums[a] += b
    else:
        sums[a] = b

res = []
for key,val in sums.items():
    res.append([key,val])

print(sorted(res))

1 Comment

You can shorten a,b = item[0],item[1] to a,b = item. Or, shorter still: for a,b in myList
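
The if/else in the loop can also be folded into a single line with `dict.get`. A minimal sketch, reusing the names from the answer above; this is one possible shortening, not the answerer's own code:

```python
myList = [[500, 5], [500, 10], [500, 3], [504, 9], [505, 10], [505, 20]]

sums = {}
for a, b in myList:
    # .get(a, 0) returns the running total, or 0 the first time `a` is seen
    sums[a] = sums.get(a, 0) + b

# turn the dict back into a sorted list of [key, total] pairs
res = sorted([key, val] for key, val in sums.items())
print(res)
```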
1

You can use itertools.groupby to group the sublists by their first item, sum the last entries in each group, and create a new list of group keys paired with the sums:

from itertools import groupby

from operator import itemgetter

# groupby only merges consecutive items that share a key,
# so the data must be sorted by that key first
# (the sample data happens to be sorted already)

myList = sorted(myList, key=itemgetter(0))

Our grouper will be the first item in each sublist (500, 504, 505)

# iterate through the groups,
# sum the last items of each group,
# pair the sum with the grouper,
# and collect the pairs in a new list

result = [[key, sum(last for first, last in grp)] 
           for key, grp 
           in groupby(myList, itemgetter(0))]

print(result)

[[500, 18], [504, 9], [505, 30]]
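
To see why the sort matters, here is a small sketch comparing the same grouping with and without it. The helper `sum_by_first` and the sample `unsorted_pairs` are made up for this illustration:

```python
from itertools import groupby
from operator import itemgetter


def sum_by_first(pairs):
    """Sum the second items of consecutive pairs that share a first item."""
    return [[key, sum(last for _, last in grp)]
            for key, grp in groupby(pairs, itemgetter(0))]


unsorted_pairs = [[505, 10], [500, 5], [505, 20], [500, 10]]

# Without sorting, equal keys that are not adjacent end up in separate groups.
without_sort = sum_by_first(unsorted_pairs)
print(without_sort)  # [[505, 10], [500, 5], [505, 20], [500, 10]]

# After sorting by the key, each key forms exactly one group.
with_sort = sum_by_first(sorted(unsorted_pairs, key=itemgetter(0)))
print(with_sort)  # [[500, 15], [505, 30]]
```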


-1
myList = [[500, 5], [500, 10], [500, 3], [504, 9], [505, 10], [505, 20]]

temp = {}

for first, second in myList:
  if first in temp:
    temp[first] += second
  else:
    temp[first] = second

result = [[k, v] for k, v in temp.items()]
print(result)

1 Comment

This is very similar to Jean-Marc Volle's answer (except for the list comprehension at the end)
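
One difference worth noting: without a final sorted() call, the result follows the order in which each key first appears, because dicts preserve insertion order (guaranteed since Python 3.7). A short sketch with a deliberately unsorted input; `pairs` is a made-up sample, not the question's data:

```python
pairs = [[505, 10], [500, 5], [505, 20], [504, 9], [500, 13]]

temp = {}
for first, second in pairs:
    if first in temp:
        temp[first] += second
    else:
        temp[first] = second

# Keys come out in first-seen order, not numeric order.
result = [[k, v] for k, v in temp.items()]
print(result)          # [[505, 30], [500, 18], [504, 9]]

# Sorting afterwards gives the ordering shown in the question.
print(sorted(result))  # [[500, 18], [504, 9], [505, 30]]
```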
