Filtering data frame rows with regex

Question

Here is my data frame:

import pandas as pd


data = {'Period':['Group 1 vs Group 2:Change at 3 mo', 'Group 1:Change at 3 mo', 'Group 1 vs Group 2:Change at 3 mo', 'Group 2:Change at 3 mo'], 'estimate':[20, 21, 19, 18]}

df = pd.DataFrame(data)

Now I need to keep only the rows whose Period value does not contain Group 1 vs Group 2 anywhere. I tried this code:

df = df.loc[df['Period'].str.contains(pat = '(?!Group 1 vs Group 2)', regex = True)].reset_index(drop=True)

But it does not filter any rows, and I get the original df back. How can I fix my code so that I get only the rows whose Period value does not contain Group 1 vs Group 2 anywhere?

df[~df['Period'].str.contains(r'Group 1 vs Group 2')]

– Wiktor Stribiżew, Commented Jul 16, 2020 at 16:02
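
As a rough illustration of why the original attempt returns every row: str.contains searches for the pattern at any position, and an unanchored negative lookahead such as (?!Group 1 vs Group 2) succeeds at some position in every string (for example at index 1, where the text no longer starts with the phrase). The sketch below rebuilds the question's data frame and compares that mask with the inverted substring test from the comment. The names lookahead_mask, keep and filtered are only illustrative, and regex=False is an optional choice that treats the phrase as a literal string.

```python
import pandas as pd

df = pd.DataFrame({
    'Period': ['Group 1 vs Group 2:Change at 3 mo', 'Group 1:Change at 3 mo',
               'Group 1 vs Group 2:Change at 3 mo', 'Group 2:Change at 3 mo'],
    'estimate': [20, 21, 19, 18],
})

# Unanchored lookahead: an empty match is found somewhere in every string,
# so this mask is True for all rows and nothing gets filtered out.
lookahead_mask = df['Period'].str.contains('(?!Group 1 vs Group 2)', regex=True)

# Test for the phrase itself, then invert the boolean mask with ~.
keep = ~df['Period'].str.contains('Group 1 vs Group 2', regex=False)
filtered = df.loc[keep].reset_index(drop=True)

print(lookahead_mask.all())
print(filtered)
```

If a regex-only solution is preferred, anchoring the lookahead at the start of the string, as in ^(?!.*Group 1 vs Group 2), should also work, but inverting the mask is usually easier to read.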

BENY · Accepted Answer · 2020-07-16 15:57:20Z


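You can try str.match. Note that it only matches at the start of each string, which is enough here because every unwanted Period value begins with Group 1 vs Group 2: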

df[~df.Period.str.match('Group 1 vs Group 2')]
Out[85]: 
                   Period  estimate
1  Group 1:Change at 3 mo        21
3  Group 2:Change at 3 mo        18
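
Since the question asks about the phrase appearing anywhere, it may help to see where str.match and str.contains differ. The following is a small sketch on hypothetical values (the second string is invented for illustration and is not in the question's data), showing that str.match only looks at the beginning of each string while str.contains searches the whole string.

```python
import pandas as pd

# Hypothetical values: the second one has the phrase in the middle, not at the start.
s = pd.Series(['Group 1 vs Group 2:Change at 3 mo',
               'Change at 3 mo:Group 1 vs Group 2',
               'Group 1:Change at 3 mo'])

# str.match behaves like re.match: the pattern must match at position 0.
starts_with = s.str.match('Group 1 vs Group 2')

# str.contains looks for the phrase at any position; regex=False makes it a literal test.
anywhere = s.str.contains('Group 1 vs Group 2', regex=False)

kept_by_match = s[~starts_with].tolist()
kept_by_contains = s[~anywhere].tolist()

print(kept_by_match)
print(kept_by_contains)
```

With the question's data both approaches give the same two rows, because the phrase only ever appears at the start of Period.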
