How can I join the lists in a pandas column 'B' and keep only the unique values?

   A   B 
0  10  [x50, y-1, sss00]
1  20  [x20, MN100, x50, sss00]
2  ...

Expected output:

[x50, y-1, sss00, x20, MN100]

2 Answers

You can do this with sum(), which concatenates the lists, and set(), which removes the duplicates:

result = list(set(df['B'].sum()))

Printing result gives the unique values (a set does not preserve order, so the order may differ from the input):

['y-1', 'x20', 'sss00', 'x50', 'MN100']
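
Because sum() builds a new list at every step, its cost grows quadratically with the number of rows. A minimal sketch of a single-pass alternative, assuming the column holds real Python lists of strings: itertools.chain.from_iterable flattens the lists and dict.fromkeys drops duplicates while keeping the order of first appearance. The sample frame and the names flat and unique_values are illustrative, not part of the original answer.

```python
from itertools import chain

import pandas as pd

# Sample frame modeled on the question's data (items assumed to be strings).
df = pd.DataFrame({
    "A": [10, 20],
    "B": [["x50", "y-1", "sss00"], ["x20", "MN100", "x50", "sss00"]],
})

# Flatten every list lazily in one pass instead of repeated concatenation.
flat = chain.from_iterable(df["B"])

# dict.fromkeys keeps only the first occurrence of each key, in insertion order.
unique_values = list(dict.fromkeys(flat))

print(unique_values)
# ['x50', 'y-1', 'sss00', 'x20', 'MN100']
```

Unlike the set-based version, this returns the values in the order given in the expected output.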

2 Comments

Don't use sum to concatenate lists. It looks fancy but it's quadratic and should be considered bad practice.
ohh....ok...btw thnx @jezrael for telling this :)

If the input data are strings rather than lists, first convert them to lists:

df.B = df.B.str.strip('[]').str.split(',')

Or, if the strings are valid Python list literals (items quoted):

import ast
df.B = df.B.apply(ast.literal_eval)
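
Which conversion fits depends on how the strings look. The sketch below uses two hypothetical Series, unquoted and quoted, to show both cases. Splitting on ',' alone leaves a leading space on items when the separator is ', ', so the items are stripped afterwards; ast.literal_eval only works when the items are quoted.

```python
import ast

import pandas as pd

# Hypothetical inputs: the same data stored as strings in two styles.
unquoted = pd.Series(["[x50, y-1, sss00]", "[x20, MN100, x50, sss00]"])
quoted = pd.Series(["['x50', 'y-1', 'sss00']", "['x20', 'MN100', 'x50', 'sss00']"])

# Unquoted items: drop the brackets, split on commas, then trim whitespace.
from_unquoted = (
    unquoted.str.strip("[]")
    .str.split(",")
    .apply(lambda items: [item.strip() for item in items])
)

# Quoted items: each string is a valid Python literal, so parse it safely.
from_quoted = quoted.apply(ast.literal_eval)

# Both routes produce the same lists of strings.
print(from_unquoted.tolist() == from_quoted.tolist())
# True
```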

Use Series.explode to flatten the lists into a single Series, then Series.unique to remove duplicates. This keeps the order of first appearance, so use it if order is important:

L = df.B.explode().unique().tolist()
#alternative
#L = df.B.explode().drop_duplicates().tolist()

print(L)
['x50', 'y-1', 'sss00', 'x20', 'MN100']

Another idea, if order is not important, is to flatten the lists with a set comprehension:

L = list({y for x in df.B for y in x})
print(L)
['x50', 'MN100', 'x20', 'sss00', 'y-1']
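
To see the difference between the two approaches, here is a self-contained sketch on a small frame modeled on the question's data (the items are assumed to be strings, and the names ordered and unordered are illustrative). The explode route returns values in order of first appearance, while the set route returns the same values in arbitrary order.

```python
import pandas as pd

# Sample frame modeled on the question's data.
df = pd.DataFrame({
    "A": [10, 20],
    "B": [["x50", "y-1", "sss00"], ["x20", "MN100", "x50", "sss00"]],
})

# One row per list item, then unique values in order of first appearance.
ordered = df["B"].explode().unique().tolist()

# Set comprehension over the nested lists; order is not guaranteed.
unordered = list({item for row in df["B"] for item in row})

print(ordered)
# ['x50', 'y-1', 'sss00', 'x20', 'MN100']

# Same elements either way, only the ordering can differ.
print(sorted(ordered) == sorted(unordered))
# True
```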

2 Comments

Thanks jezrael, could you please briefly explain the logic?
@nilsinelabore - Added to answer.
