Pandas: compare list objects in Series

Question

In my dataframe a column is made up of lists, for example:

df = pd.DataFrame({'A':[[1,2],[2,4],[3,1]]})

I need to find out the location of list [1,2] in this dataframe. I tried:

df.loc[df['A'] == [1,2]]

and

df.loc[df['A'] == [[1,2]]]

but both attempts failed. The comparison looks simple, yet it doesn't work. Am I missing something here?

The only thing you're "missing" is that data frames aren't really great for storing lists. Any reason you don't want two separate columns?
– BallpointBen, Commented Nov 1, 2018 at 21:18
@BallpointBen Thanks for your attention. I've posted a new question to explain the whole problem: stackoverflow.com/questions/53115592/…
– Shiang Hoo, Commented Nov 2, 2018 at 9:11
@Luuklag This may be a duplicate, but I don't believe it's a duplicate of the target you suggest. That one seems to be trying to filter based on whether multiple columns are equal to particular values. This one is trying to check if the list is equal to a single column's value, which has a very different answer.
– jpmc26, Commented Nov 13, 2018 at 22:19
@Luuklag, I posted the two questions because I don't think they are the same. As jpmc26 described, they are connected but also very different. This post is actually a variant of that one: I tried a naive approach to solve that one, and this question grew out of that attempt. But this one still has its own distinct value. Can you please remove the duplicate target?
– Shiang Hoo, Commented Nov 19, 2018 at 2:56

BENY (edited by anothermh) · Accepted Answer · 2018-11-25 19:42:02Z

20

Do not store lists in cells; it creates a lot of problems for pandas. If you do need an object column, use tuples:

df.A.map(tuple).isin([(1,2)])
Out[293]: 
0     True
1    False
2    False
Name: A, dtype: bool
# To select the matching rows: df[df.A.map(tuple).isin([(1,2)])]
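For readers who want to run the tuple approach end to end, here is a minimal sketch built on the question's sample frame. The variable names `mask`, `matches`, and `locations` are illustrative, not from the original answer. One plausible reason tuples work with `isin` is that they are hashable while lists are not.

```python
import pandas as pd

df = pd.DataFrame({'A': [[1, 2], [2, 4], [3, 1]]})

# Convert each list to a tuple so the values are hashable, then test membership.
mask = df['A'].map(tuple).isin([(1, 2)])

# Boolean indexing keeps the matching rows; their index labels are the locations.
matches = df[mask]
locations = matches.index.tolist()
print(locations)  # [0]
```

Converting the column to tuples once and storing it back (for example `df['A'] = df['A'].map(tuple)`) avoids repeating the conversion on every lookup.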

edited Nov 25, 2018 at 19:42

anothermh


answered Nov 1, 2018 at 13:56

BENY


1 Comment

Frank_Coumans Over a year ago

Could you explain why a tuple is better? I've noticed that Pandas struggles when lists are in a cell, but does it handle tuples better because they are immutable? Would you expect a numpy array to work better than a list?

Space Impact · Answer · 2018-11-01 13:56:29Z

15

You can use apply to compare each element with the target list:

df['A'].apply(lambda x: x==[1,2])

0     True
1    False
2    False
Name: A, dtype: bool

print(df[df['A'].apply(lambda x: x==[1,2])])

        A
0  [1, 2]
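Because `apply` relies on plain Python list equality, it should also cope with lists of different lengths. The sketch below uses hypothetical data (an extra three-element row and a repeated `[1, 2]`) to show this and to pull out the matching index labels.

```python
import pandas as pd

# Hypothetical data: lists of different lengths, with [1, 2] appearing twice
df = pd.DataFrame({'A': [[1, 2], [2, 4], [1, 2, 3], [1, 2]]})
target = [1, 2]

# Python's list == compares length and contents, so [1, 2, 3] is simply not equal.
mask = df['A'].apply(lambda x: x == target)

# Index labels of the rows whose list equals the target
locations = df[mask].index.tolist()
print(locations)  # [0, 3]
```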

answered Nov 1, 2018 at 13:56

Space Impact


piRSquared · Answer · 2018-11-01 14:34:09Z

9

With NumPy arrays

df.assign(B=(np.array(df.A.tolist()) == [1, 2]).all(1))

        A      B
0  [1, 2]   True
1  [2, 4]  False
2  [3, 1]  False
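The one-liner above stacks the lists into a 2-D array, so it assumes every list has the same length. A rough sketch of one way to keep the vectorized comparison while tolerating ragged lists follows. `match_list` is a hypothetical helper, not part of pandas or NumPy, and it assumes flat lists of scalars.

```python
import numpy as np
import pandas as pd

def match_list(series, target):
    """Hypothetical helper: True where the list in `series` equals `target`."""
    target = np.asarray(target)
    # Only rows with the same length as the target can possibly match.
    same_len = series.map(len) == len(target)
    result = pd.Series(False, index=series.index)
    if same_len.any():
        # Stack the equal-length rows into a 2-D array and compare row-wise.
        stacked = np.array(series[same_len].tolist())
        result[same_len] = (stacked == target).all(axis=1)
    return result

# Hypothetical data with one list of a different length
df = pd.DataFrame({'A': [[1, 2], [2, 4], [1, 2, 3], [3, 1]]})
print(df.assign(B=match_list(df['A'], [1, 2])))
```

The length filter runs first so that `np.array` only ever receives rows of equal length.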

answered Nov 1, 2018 at 14:34

piRSquared


4 Comments

jpp Over a year ago

This should be the accepted solution! [Or, if possible, just expanding the series of lists to 2 series.]

ALollz Over a year ago

Won't this run into issues if the lists are differently sized? Though perhaps that's outside the scope of this example.

piRSquared Over a year ago

@ALollz yes and yes

Shiang Hoo Over a year ago

Nice! My only concern is that this solution converts the data type twice. If my dataframe is very big, will this conversion cost more time?
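Whether the conversions matter is easiest to settle by measuring on data shaped like your own. The following is only a sketch of such a measurement: the frame size, the random data, and the function names are assumptions, and timings will vary by machine and library version, so no results are quoted here.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical benchmark data: 50,000 two-element lists of small integers
rng = np.random.default_rng(0)
df = pd.DataFrame({'A': rng.integers(0, 5, size=(50_000, 2)).tolist()})

def with_apply():
    return df['A'].apply(lambda x: x == [1, 2])

def with_numpy():
    return (np.array(df['A'].tolist()) == [1, 2]).all(1)

def with_tuple_isin():
    return df['A'].map(tuple).isin([(1, 2)])

# Time each approach from this thread over a few runs.
for fn in (with_apply, with_numpy, with_tuple_isin):
    seconds = timeit.timeit(fn, number=5)
    print(f"{fn.__name__}: {seconds / 5:.4f} s per run")
```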

Vaishali · Answer · 2018-11-21 00:28:04Z

6

Using NumPy

df.A.apply(lambda x: (np.array(x) == np.array([1,2])).all())

0     True
1    False
2    False
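One caveat with element-wise `==` is broadcasting: a one-element list such as `[1]` broadcasts against `[1, 1]`, so `.all()` would report a match even though the lists differ. A possible alternative, sketched here with hypothetical data, is `np.array_equal`, which returns False when the shapes differ.

```python
import numpy as np
import pandas as pd

# Hypothetical data including a list of a different length
df = pd.DataFrame({'A': [[1, 2], [2, 4], [1, 2, 3]]})
target = np.array([1, 2])

# np.array_equal checks shape first, then contents, and returns a single bool.
mask = df['A'].apply(lambda x: np.array_equal(x, target))
print(mask.tolist())  # [True, False, False]
```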

edited Nov 21, 2018 at 0:28

answered Nov 1, 2018 at 14:32

Vaishali


U13-Forward · Answer · 2018-11-09 04:18:03Z

0

Or pass the list's bound `__eq__` method to apply:

df['A'].apply(([1,2]).__eq__)

Then, to select the matching rows:

df[df['A'].apply(([1,2]).__eq__)]
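A note of caution on calling `__eq__` directly: for a non-list operand such as a tuple, `[1, 2].__eq__` returns `NotImplemented` rather than False, which can leak into the mask. A small sketch of an alternative using `operator.eq`, which goes through the normal `==` machinery, is shown below. The mixed list/tuple Series is hypothetical.

```python
import operator
from functools import partial

import pandas as pd

# Hypothetical column that mixes a tuple in with the lists
s = pd.Series([[1, 2], (1, 2), [3, 1]])

# partial(operator.eq, [1, 2])(x) evaluates [1, 2] == x, so a tuple yields False.
is_target = partial(operator.eq, [1, 2])
mask = s.apply(is_target)
print(mask.tolist())  # [True, False, False]
```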

answered Nov 9, 2018 at 4:18

U13-Forward
