I understand how to filter a dataframe in pandas using a single or two partial strings:

final_df = df[df['Answers'].str.contains("not in","not on")]

I got the help from this link: Select by partial string from a pandas DataFrame

However, I am unable to extend the filtering to more than two partial strings:

final_df = df[df['Answers'].str.contains("not in","not on","not have")]

If I try, I get the following error:

TypeError: unsupported operand type(s) for &: 'str' and 'int'

How do I tweak this to filter on more than two partial strings? Thank you.

Use df['Answers'].str.contains("not in|not on|not have") (commented Jul 31, 2019 at 5:48)

1 Answer

Use str.contains with a single regex pattern in which the search strings are separated by | (regex "or"):

mask = df['Answers'].str.contains(regex_pattern)
final_df = df[mask]

To build the regex pattern from a list of search strings, join them with |:

strings_to_find = ["not in","not on","not have"]
regex_pattern = '|'.join(strings_to_find)
regex_pattern 
'not in|not on|not have'
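
For reference, here is a minimal end-to-end sketch of the approach above. The sample data is made up for illustration; only the column name 'Answers' and the three search strings come from the question. Two things in it are additions, not part of the original answer: re.escape, so that search strings containing regex metacharacters are matched literally, and na=False, so that missing values count as non-matches. It also helps to remember that str.contains takes one pattern as its first argument. The next positional parameters are case and flags, so extra strings passed positionally are not treated as additional patterns, which appears to be where the TypeError in the question comes from.

```python
import re

import pandas as pd

# Hypothetical sample data; only the column name 'Answers' is from the question.
df = pd.DataFrame({
    "Answers": [
        "It is not in the box",
        "It is on the table",
        "They do not have it",
        None,
        "Not on the list",
    ]
})

strings_to_find = ["not in", "not on", "not have"]

# Escape each string so regex metacharacters (e.g. '.' or '(') match literally,
# then join the alternatives with '|' (regex OR).
regex_pattern = "|".join(re.escape(s) for s in strings_to_find)

# na=False treats missing values as non-matches, so the mask is purely boolean.
mask = df["Answers"].str.contains(regex_pattern, na=False)
final_df = df[mask]

print(final_df)
```

The match is case-sensitive by default, so "Not on the list" is not selected here; passing case=False to str.contains would include it.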
1 Comment

It would be great if you added how to build the regex with | using join.
