python problem with pandas dataframe of list

Question

I have a pandas DataFrame with a column in which every row is a list.

I want to search for a value, but I get an error, even though I know the value exists.

I checked this:

    df["text list"][1] == ['رهبری']

and got:

    True

Then I tried this:

    df[df["text list"] == ['رهبری']]

and got this error:

    ValueError                                Traceback (most recent call last)
    <ipython-input-42-f14f1b2306ec> in <module>
    ----> 1 df[df["text list"] == ['رهبری']]

    ~/.local/lib/python3.6/site-packages/pandas/core/ops/__init__.py in wrapper(self, other, axis)
       1205             # as it will broadcast
       1206             if other.ndim != 0 and len(self) != len(other):
    -> 1207                 raise ValueError("Lengths must match to compare")
       1208 
       1209             res_values = na_op(self.values, np.asarray(other))

    ValueError: Lengths must match to compare

Not sure, but maybe: df[df["text list"] == [['رهبری']]]
– Erfan, Commented Jan 7, 2020 at 12:14
Or df[df['text list'].apply(lambda x: x == ['رهبری'])]? This is all speculation, though; please provide a small example dataset with which we can reproduce your error.
– Erfan, Commented Jan 7, 2020 at 12:18
I reproduced the error with this minimal frame:
```
test_frame = pd.DataFrame(data={'test list': [['entry1'], ['e1', 'e2']], 'column2': [1, 2]})
test_frame['test list'][0] == ['entry1']
>>> True
test_frame[test_frame['test list'] == ['entry1']]
>>> error
```
– Robert King, Commented Jan 7, 2020 at 12:20

Erfan · Accepted Answer · 2020-01-07 12:39:25Z

2

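When you compare the column (a Series) directly with a list using ==, pandas treats the list as an array-like and attempts an element-wise comparison, so it expects the list to have the same length as the Series. Here the one-item list does not match the number of rows, hence the ValueError.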
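A minimal sketch of that behaviour, using made-up Series rather than the asker's data: when the list has as many items as the Series has rows, `==` compares position by position; when the lengths differ, pandas raises the `ValueError` from the question. The exact wording of the message may vary between pandas versions.

```python
import pandas as pd

# Same length: the list is compared element by element against the rows.
numbers = pd.Series([1, 2, 3])
elementwise = (numbers == [1, 5, 3]).tolist()
print(elementwise)

# Different length: pandas cannot line the one-item list up with three rows.
lists = pd.Series([['aaa'], ['bbb'], ['ccc']])
try:
    lists == ['bbb']
    raised = False
except ValueError as exc:
    raised = True
    print(exc)
```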

To avoid this, we can use apply to compare each row with the list individually:

    # example dataframe
    >>> import pandas as pd
    >>> df = pd.DataFrame({'text list': [['aaa'], ['bbb'], ['ccc']]})
    >>> df
      text list
    0     [aaa]
    1     [bbb]
    2     [ccc]

Use Series.apply to check for [bbb]:

    >>> m = df['text list'].apply(lambda x: x == ['bbb'])
    >>> df[m]
      text list
    1     [bbb]

Since apply is basically a loop in the background, we can avoid the pandas overhead and use a list comprehension instead:

    >>> m = [x == ['bbb'] for x in df['text list']]
    >>> df[m]
      text list
    1     [bbb]
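
Putting the two approaches together on the minimal frame from Robert King's comment, as a self-contained sketch. The last mask is an assumption about a related need that the question does not state: matching rows whose list merely contains a value, rather than rows whose list equals the target exactly.

```python
import pandas as pd

# Minimal frame from the comments: one column of lists with different lengths.
test_frame = pd.DataFrame({'test list': [['entry1'], ['e1', 'e2']],
                           'column2': [1, 2]})
target = ['entry1']

# Exact match, row by row, via Series.apply.
mask_apply = test_frame['test list'].apply(lambda x: x == target)

# The same exact match with a plain list comprehension.
mask_comp = [x == target for x in test_frame['test list']]

# Assumption (not in the original question): membership instead of equality.
mask_contains = ['e2' in x for x in test_frame['test list']]

exact_rows = test_frame[mask_comp]
contains_rows = test_frame[mask_contains]
print(exact_rows)
print(contains_rows)
```

Both exact-match masks select the same rows; which one to prefer is mostly a matter of taste and speed on the data at hand.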

edited Jan 7, 2020 at 12:39

answered Jan 7, 2020 at 12:29

Erfan
