Using a regex pattern to filter rows from a pandas DataFrame

Question

Suppose I have a pandas dataframe like this:

         Word       Ratings
   0     TLYSFFPK   1
   1     SVLENFVGR  2
   2     SVFNHAIRK  3
   3     KAGEVFIHK  4

How can I use a regex in pandas to keep only the rows whose Word matches the following pattern, while preserving the dataframe's format (index and columns)? The regex pattern is: \b.[VIFY][MLFYIA]\w+[LIYVF].[KR]\b

Expected output:

         Word       Ratings
   1     SVLENFVGR  2
   2     SVFNHAIRK  3

MaxU - stand with Ukraine · Accepted Answer · 2017-08-03 18:47:53Z

12

Demo:

In [2]: df
Out[2]:
        Word  Ratings
0   TLYSFFPK        1
1  SVLENFVGR        2
2  SVFNHAIRH        3
3  KAGEVFIHK        4

In [3]: pat = r'\b.[VIFY][MLFYIA]\w+[LIYVF].[KR]\b'

In [4]: df.Word.str.contains(pat)
Out[4]:
0    False
1     True
2    False
3    False
Name: Word, dtype: bool

In [5]: df[df.Word.str.contains(pat)]
Out[5]:
        Word  Ratings
1  SVLENFVGR        2
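
For anyone who wants to run this outside an interactive session, here is a minimal self-contained sketch of the same approach. It assumes the words exactly as listed in the question (row 2 is SVFNHAIRK there). The `na=False` argument and the inverted mask are additions for illustration, not part of the demo above.

```python
import pandas as pd

# Data as listed in the question (assumption: row 2 ends in "K").
df = pd.DataFrame({
    "Word": ["TLYSFFPK", "SVLENFVGR", "SVFNHAIRK", "KAGEVFIHK"],
    "Ratings": [1, 2, 3, 4],
})

pat = r"\b.[VIFY][MLFYIA]\w+[LIYVF].[KR]\b"

# Boolean mask: True where the pattern is found somewhere in the string.
# na=False makes missing values count as non-matches instead of yielding NaN.
mask = df["Word"].str.contains(pat, na=False)

matched = df[mask]      # rows whose Word matches the pattern
unmatched = df[~mask]   # rows whose Word does not match

print(matched)
print(unmatched)
```

With this data, `matched` holds rows 1 and 2 and keeps their original index labels and columns; `unmatched` holds rows 0 and 3. If the goal is to drop the matching rows rather than keep them, `df[~mask]` is the selection to use.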


1 Comment

Tharunkumar Reddy Over a year ago

You are always a time saver for me :)
