String Indexing in dataframe subset - pandas

Question

I'm trying to create a subset of a pandas dataframe, based on values in a list. However, I need to include string indexing. I'll demonstrate with an example:

Here is my dataframe:

import pandas as pd

df = pd.DataFrame({'A': ['1-2', '2', '3', '3-8', '4']})

Here is what it looks like:

     A
0  1-2
1    2
2    3
3  3-8
4    4

I have a list of values I want to use to select rows from my dataframe.

list1 = ['2', '3']

I can use the .isin() method to select the rows whose values exactly match one of my list items.

subset = df[df['A'].isin(list1)]
print(subset)

   A
1  2
2  3

However, I want any value that includes '2' or '3', such as '1-2' and '3-8'. This is my desired output:

     A
0  1-2
1    2
2    3
3  3-8

Can I use string indexing in my .isin() function? I am struggling to come up with another workaround.

BENY · Accepted Answer · 2019-10-29 19:07:48Z

3

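Use str.split with isin and any:

# any(axis=1) is the keyword form of the original any(1); recent pandas versions no longer accept the positional argument.
Newdf=df[df.A.str.split('-',expand=True).isin(['2','3']).any(axis=1)].copy()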
Out[189]: 
     A
0  1-2
1    2
2    3
3  3-8


2 Comments

Erich Purpur Over a year ago

what does .any() do? More specifically, the argument (1) in .any(1).

BENY Over a year ago

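It returns True for each row that contains at least one True value; the 1 is the axis argument (axis=1, i.e. checking across the columns of each row). @ErichPurpur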
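
To make the comment above concrete, here is a small sketch that breaks the accepted one-liner into its intermediate steps, using the DataFrame from the question. The names parts, hits, and mask are illustrative only and do not appear in the answer.

```python
import pandas as pd

df = pd.DataFrame({'A': ['1-2', '2', '3', '3-8', '4']})

# Split each value on '-' into separate columns; rows with fewer pieces are padded with missing values.
parts = df['A'].str.split('-', expand=True)

# Element-wise membership test: True where a cell is exactly '2' or '3'.
hits = parts.isin(['2', '3'])

# axis=1 reduces across the columns, giving one boolean per row.
mask = hits.any(axis=1)

print(df[mask])
```

Printing parts and hits before the final step shows which piece of each value triggered the match.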

Georgina Skibinski · Accepted Answer · 2019-10-29 19:18:56Z

1

You can try a regular expression:

import re

# Builds ".*((2)|(3))", which matches any value containing one of the list items.
pattern=re.compile(".*(("+(")|(").join(list1)+"))")

print(df.loc[df['A'].apply(lambda x: bool(pattern.match(x)))])

Output:

     A
0  1-2
1    2
2    3
3  3-8
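
Note that the pattern above matches a list item anywhere in the string, so a value such as '12' would also be selected. If only whole '-'-separated pieces should count (an assumption; the question does not say), a possible variant is sketched below. The extra row '12' and the names alternatives, token_pattern, and token_mask are added purely for illustration.

```python
import re
import pandas as pd

df = pd.DataFrame({'A': ['1-2', '2', '3', '3-8', '4', '12']})  # '12' added for illustration
list1 = ['2', '3']

# Escape each item so any regex metacharacters in the list are treated literally.
alternatives = '|'.join(re.escape(item) for item in list1)

# Require the item to be delimited by '-' or by the start/end of the string.
token_pattern = rf'(?:^|-)(?:{alternatives})(?:-|$)'

# str.contains performs a regex search on each value and returns a boolean Series.
token_mask = df['A'].str.contains(token_pattern, regex=True)

print(df[token_mask])
```

With this pattern '1-2' and '3-8' are kept while '12' is not, which mirrors the behaviour of the split-based answer.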

