Find matching values in a list-type column in Python/Pandas [duplicate]

Question

So I have a function that sets a value in a column of a dataframe based on whether or not some string in the dataframe contains values from a list. I then want to get a count of how many rows in the dataframe have that value, but I am getting an error.

If certain conditions are met, the 'tag' column is set to a list, ['date','must','glucose']. Not all of the rows meet the condition for this to happen. I want to find the number of rows where this condition IS met by analyzing the dataframe.

I have tried this:

df = data[data['tag'] == ['date','must','glucose']]
print(df)

...but that yields:

ValueError: Lengths must match to compare

I also tried this but that yields the same error:

df = data.tag == ['date','must','glucose']

If I was just comparing values, that would work, but having a list in the cell instead of a value is blowing it up. Like if the value was just 'four' and I was doing this, it wouldn't give me an error:

df = data[data.tag=='four']

Is there a way to accomplish this? Thank you!

can you paste a sample of data?

– vb_rises, Commented Sep 10, 2019 at 17:40

vb_rises · Accepted Answer · 2019-09-10 17:52:03Z

2

You can use the apply function for this:

df = df[df['tag'].apply(lambda x : x == ['date','must','glucose'])]

You can also convert each list to a tuple and compare.

source: Pandas: compare list objects in Series
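To make the two approaches above concrete, here is a minimal sketch that also produces the row count the question asks for. The sample DataFrame, the `text` column, and the variable names are assumptions made up for illustration; only the `tag` column and the target list come from the question.

```python
import pandas as pd

# Hypothetical sample data: some cells hold the full list,
# others hold a different list or nothing at all.
data = pd.DataFrame({
    "text": ["a", "b", "c", "d"],
    "tag": [["date", "must", "glucose"], ["date"], None, ["date", "must", "glucose"]],
})

target = ["date", "must", "glucose"]

# Compare each cell against the whole list, one row at a time.
mask = data["tag"].apply(lambda x: x == target)

# Tuple variant: convert list cells to tuples, then compare per row.
as_tuples = data["tag"].apply(lambda x: tuple(x) if isinstance(x, list) else x)
mask_tuple = as_tuples.apply(lambda x: x == tuple(target))

matches = data[mask]     # rows whose tag equals the target list
count = int(mask.sum())  # number of matching rows
print(count)
```

Summing the boolean mask gives the count directly, so there is no need to build the filtered frame just to measure its length.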


1 Comment

Korzak Over a year ago

This worked perfectly, thank you! Nice and simple, and replicable for similar situations. Much appreciated.

Joe · Accepted Answer · 2019-09-11 10:42:16Z

0

EDITING ANSWER

You need to use isin() to accomplish that. Consider:

>>> data = pd.DataFrame({'sample col1': [1,2,3,4,5], 'sample col2': ['a','b','c','d','e'], 'tag': ['some text', 'some value','date','must','glucose']})

>>> data
   sample col1 sample col2         tag
0            1           a   some text
1            2           b  some value
2            3           c        date
3            4           d        must
4            5           e     glucose
>>> df = data[~data['tag'].isin(['date','must','glucose'])]
>>> df
   sample col1 sample col2         tag
0            1           a   some text
1            2           b  some value

In your case:

>>> df.reset_index(inplace = True, drop =True)
>>> df['map'] = 'True'

>>> df
   sample col1 sample col2         tag   map
0            1           a   some text  True
1            2           b  some value  True

>>> map_dict = dict(zip(df['tag'], df['map']))
>>> data['In your list?'] = data['tag'].map(map_dict).fillna(value = 'False')

>>> data
   sample col1 sample col2         tag Not in your list?
0            1           a   some text              True
1            2           b  some value              True
2            3           c        date             False
3            4           d        must             False
4            5           e     glucose             False
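As a possible shortcut, the same flag column can probably be built straight from the `isin()` mask without the intermediate dictionary. This is only a sketch under the assumption that each `tag` cell holds a single string, as in the sample data above; the names `wanted`, `in_list`, and `n_in_list` are made up for illustration, and the column holds real booleans rather than the strings 'True' and 'False'.

```python
import pandas as pd

data = pd.DataFrame({
    "sample col1": [1, 2, 3, 4, 5],
    "sample col2": ["a", "b", "c", "d", "e"],
    "tag": ["some text", "some value", "date", "must", "glucose"],
})

wanted = ["date", "must", "glucose"]

# Boolean mask: True where the tag is one of the wanted strings.
in_list = data["tag"].isin(wanted)

# Negate the mask with ~ to flag the rows that are NOT in the list.
data["Not in your list?"] = ~in_list

# Count of rows whose tag is in the list.
n_in_list = int(in_list.sum())
print(n_in_list)
```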

Hope this helps :D

edited Sep 11, 2019 at 10:42

answered Sep 10, 2019 at 18:13


2 Comments

Korzak Over a year ago

Thanks for this, but I think there is a misunderstanding. In the 'tag' column, for some rows, the value is ['date','must','glucose']. I am looking to find those rows. The answer provided by vb_rises worked for this purpose.

Joe Over a year ago

I see, you want "not in" logic. Simply place a ~ (tilde) in front of the condition. I'll update my answer.
