I have the following dataframe:

id  pattern1    pattern2    pattern3
 1  a-b-c       a-b--       a-b-c
 2  a-a--       a-b--       a-c--
 3  a-v--       a-m--       a-k--
 4  a-b--       a-n--       a-n-c

I want to keep only the rows where every column ends with the pattern --. In this case the output would be

 2  a-a--       a-b--       a-c--
 3  a-v--       a-m--       a-k--

So far I can only think of doing something like the following

df[(len(df['pattern1'].str.split('--')[1])==0) & \
   (len(df['pattern2'].str.split('--')[1])==0) & \
   (len(df['pattern3'].str.split('--')[1])==0)]

This doesn't work. Also, I can't write out the names of all the columns, as there are 20 of them. How can I filter rows where all the columns in that row match a certain pattern/condition?

1 Answer

Start by setting "id" as the index, if you haven't already:

df = df.set_index('id')

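One option is to check each string element-wise with applymap, calling str.endswith on every cell (on pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map, which is used the same way):

df[df.applymap(lambda x: x.endswith('--')).all(axis=1)]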

   pattern1 pattern2 pattern3
id                           
2     a-a--    a-b--    a-c--
3     a-v--    a-m--    a-k--
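
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the frame from the question and applies the element-wise check. The name `elementwise` is only an illustrative helper, not part of pandas; it picks `DataFrame.map` when the installed pandas has it and falls back to `applymap` otherwise.

```python
import pandas as pd

# Rebuild the example frame from the question, with "id" as the index.
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "pattern1": ["a-b-c", "a-a--", "a-v--", "a-b--"],
    "pattern2": ["a-b--", "a-b--", "a-m--", "a-n--"],
    "pattern3": ["a-b-c", "a-c--", "a-k--", "a-n-c"],
}).set_index("id")

# Illustrative helper: use DataFrame.map if available (pandas >= 2.1),
# otherwise fall back to the older applymap.
elementwise = df.map if hasattr(df, "map") else df.applymap

# True for a row only if every cell in that row ends with "--".
mask = elementwise(lambda x: x.endswith("--")).all(axis=1)
result = df[mask]
print(result)
```

This assumes every cell is a string; a missing value would make `x.endswith` raise an error.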

Another option is apply calling pd.Series.str.endswith for each column:

df[df.apply(lambda x: x.str.endswith('--')).all(axis=1)]

   pattern1 pattern2 pattern3
id                           
2     a-a--    a-b--    a-c--
3     a-v--    a-m--    a-k--
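
One practical difference with the column-wise version is that `Series.str.endswith` accepts an `na` argument, so missing values can be treated as non-matches instead of raising. The sketch below is only an illustration: the frame with a `None` entry is made up for the example and is not the data from the question.

```python
import pandas as pd

# Hypothetical data containing a missing value (not from the question).
df = pd.DataFrame({
    "id": [1, 2, 3],
    "pattern1": ["a-a--", "a-v--", None],
    "pattern2": ["a-b--", "a-m-c", "a-n--"],
}).set_index("id")

# na=False makes a missing cell count as "does not end with --".
mask = df.apply(lambda col: col.str.endswith("--", na=False)).all(axis=1)
result = df[mask]
print(result)
```

Whether a missing value should disqualify a row depends on your data; passing `na=True` instead would let such rows through.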

Lastly, for performance, you can build one boolean mask per column in a list comprehension and AND them together with np.logical_and.reduce:

import numpy as np

# Equivalent, using the vectorized string method on each column:
# m = np.logical_and.reduce([df[c].str.endswith('--') for c in df.columns])
m = np.logical_and.reduce([
    [x.endswith('--') for x in df[c]] for c in df.columns])
m
# array([False,  True,  True, False])

df[m]
   pattern1 pattern2 pattern3
id                           
2     a-a--    a-b--    a-c--
3     a-v--    a-m--    a-k--
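
As a quick sanity check, the two ways of building the mask (the plain Python loop and the commented-out `str.endswith` variant) can be compared on the example data. This is a sketch under the assumption that all cells are strings; `loop_mask` and `str_mask` are names chosen here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "pattern1": ["a-b-c", "a-a--", "a-v--", "a-b--"],
    "pattern2": ["a-b--", "a-b--", "a-m--", "a-n--"],
    "pattern3": ["a-b-c", "a-c--", "a-k--", "a-n-c"],
}).set_index("id")

# One list of booleans per column, built with plain Python string methods,
# then combined with a logical AND across columns.
loop_mask = np.logical_and.reduce(
    [[x.endswith("--") for x in df[c]] for c in df.columns]
)

# Same idea, but each per-column mask comes from the pandas .str accessor.
str_mask = np.logical_and.reduce(
    [df[c].str.endswith("--").to_numpy() for c in df.columns]
)

print(loop_mask)
print(df[loop_mask])
```

Both masks are plain NumPy boolean arrays with one entry per row, so either can be passed straight to `df[...]`.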

If there are other columns, but you only want to consider those named "pattern*", you can use filter on the DataFrame:

u = df.filter(like='pattern')

Now repeat any of the options above, building the mask from u and indexing df with it. For example, the first option becomes

df[u.applymap(lambda x: x.endswith('--')).all(axis=1)]

...and so on.
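
To make the `filter` step concrete, here is a small sketch with an extra column that should not take part in the check. The `note` column and its values are invented for illustration; the mask is computed on `u` only, but it is applied to the full `df`, so the extra column is kept in the result.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "pattern1": ["a-b-c", "a-a--", "a-v--", "a-b--"],
    "pattern2": ["a-b--", "a-b--", "a-m--", "a-n--"],
    "pattern3": ["a-b-c", "a-c--", "a-k--", "a-n-c"],
    "note": ["w", "x", "y", "z"],  # hypothetical extra column
}).set_index("id")

# Keep only the columns whose name contains "pattern".
u = df.filter(like="pattern")

# Build the row mask from the pattern columns alone...
mask = u.apply(lambda col: col.str.endswith("--")).all(axis=1)

# ...and use it to select rows from the original frame.
result = df[mask]
print(result)
```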

1 Comment

Why would I suggest loops here? If you're interested, read my writeup, "For loops with pandas - When should I care?"
