2

I have a fairly simple pandas DataFrame, and I want to select the rows whose value in a given column contains a particular substring.

So if this is my DataFrame and I want the rows whose Loc column contains 'some', how can this be done?

             Loc 
0      'something'  
1      'nothing'  

I tried two things:

df['some' in df['Loc']]
df[df.Loc.contains('some')]

But neither solution works.

3 Answers

7

You need to use str.contains, one of the string accessor methods. A Series has no contains method of its own; it lives under the .str accessor.

>>> df['Loc'].str.contains('some')
0     True
1    False
Name: Loc, dtype: bool

You can then use boolean indexing with this result to select the matching rows of your DataFrame.
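Putting the two steps together, here is a minimal sketch. The None row is an assumption added for illustration (it is not in the question's data), as are the na=False and regex=False arguments: na=False treats missing values as non-matches so the mask stays purely boolean, and regex=False matches 'some' as a literal substring rather than a pattern.

```python
import pandas as pd

# Sample data modeled on the question; the None row is added for illustration.
df = pd.DataFrame({"Loc": ["something", "nothing", None]})

# Step 1: build a boolean mask with the .str accessor.
# na=False makes missing values count as "no match" instead of NaN.
# regex=False treats "some" as a plain substring, not a regular expression.
mask = df["Loc"].str.contains("some", na=False, regex=False)

# Step 2: boolean indexing keeps only the rows where the mask is True.
result = df[mask]
print(result)
```

Without na=False, a missing value in the column would leave a missing entry in the mask, and indexing with such a mask may raise an error.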


3
df[df['Loc'].str.contains('some')]

1

Using replace, a fun way:

df[df.Loc.replace({'some':np.nan},regex=True).isnull()]
Out[787]: 
           Loc
0  'something'

Update, since Alex mentioned it in the comments:

df[df.Loc.apply(lambda x : 'some' in x)]
Out[801]: 
           Loc
0  'something'

3 Comments

That captures a lot of false positives. Anything that does not convert to a string in the column would become NaN, which then gets captured by your .isnull() selection.
Still the same issue with df['Loc'] = ['something', 'nothing', 55, None]. Try lambda x: isinstance(x, str) and 'some' in x (on Python 2, check against (str, unicode) instead).
@Alexander: df[df.Loc.astype(str).apply(lambda x: 'some' in x)], if we are talking about mixed types.
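The mixed-type case raised in these comments can be sketched as follows, assuming Python 3 (where str covers all text). The helper name contains_some and its needle parameter are hypothetical, introduced only for this example; the data is the list quoted in the comment above.

```python
import pandas as pd

# Mixed-type column taken from the comment above.
df = pd.DataFrame({"Loc": ["something", "nothing", 55, None]})

def contains_some(value, needle="some"):
    """Return True only for string values that contain the needle."""
    # Non-strings (55, None) short-circuit to False instead of raising TypeError.
    return isinstance(value, str) and needle in value

# apply() calls the helper once per element and returns a boolean Series.
mask = df["Loc"].apply(contains_some)
result = df[mask]
print(result)
```

Checking the type first avoids the TypeError that 'some' in 55 would raise, and it does not rely on casting every value to a string beforehand.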
