0

Suppose I have the following dataframe:

import pandas as pd

data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
        'Height of Person': [5.1, 6.2, 5.1, 5.2],
        'Qualification': ['Msc', 'MA', 'Msc', 'Msc'],
        'Country is': ['US', 'UK', 'GE', 'ET']     
       }
df = pd.DataFrame(data)
display(df)

I would like to keep only the columns whose names contain any of a given set of strings.

For example, keeping the columns whose names contain "Name" or "Country" should result in:

data2 = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
        'Country is': ['US', 'UK', 'GE', 'ET']   
       }
df2 = pd.DataFrame(data2)
display(df2)

I tried using

df = df.filter(like=["Name"])

but I am not sure how to apply multiple expressions (strings) at once.

4 Answers

2

If you want to filter by column name, you can use filter with a regex, where | separates the alternatives:

df.filter(regex='Name|Country')
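When the substrings come from a list, the pattern can be built programmatically instead of typed by hand. This is a minimal sketch, assuming the sample frame from the question. The `keywords` list is a hypothetical name, and `re.escape` is used so that substrings containing regex metacharacters are matched literally:

```python
import re
import pandas as pd

df = pd.DataFrame({
    'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
    'Height of Person': [5.1, 6.2, 5.1, 5.2],
    'Qualification': ['Msc', 'MA', 'Msc', 'Msc'],
    'Country is': ['US', 'UK', 'GE', 'ET'],
})

keywords = ['Name', 'Country']  # hypothetical list of substrings to keep

# Join the escaped substrings into a single alternation pattern
pattern = '|'.join(re.escape(k) for k in keywords)

# filter(regex=...) keeps the columns whose label matches the pattern anywhere
result = df.filter(regex=pattern)
print(list(result.columns))
```

Because the match is a search rather than a full match, 'Country' is enough to select the 'Country is' column.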

0

If you're trying to select columns by their exact names, you can do:

df = df[[x for x in df.columns if x in ['Name', 'Country is']]]
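Note that this approach compares whole column names, not substrings. A roughly equivalent sketch using `Index.isin`, assuming the sample frame from the question (the `wanted` list and its 'Missing column' entry are hypothetical, added to show that absent labels are ignored rather than raising):

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
    'Height of Person': [5.1, 6.2, 5.1, 5.2],
    'Qualification': ['Msc', 'MA', 'Msc', 'Msc'],
    'Country is': ['US', 'UK', 'GE', 'ET'],
})

# Hypothetical list of exact labels; the last one does not exist in df
wanted = ['Name', 'Country is', 'Missing column']

# Boolean mask: True where the column label is one of the wanted names
mask = df.columns.isin(wanted)
result = df.loc[:, mask]
print(list(result.columns))
```

The columns keep their original order in the frame, regardless of the order in `wanted`.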

0

This should work:

col_filter = df.columns.str.contains('Name') | df.columns.str.contains('Country')
df.loc[:,col_filter]

Result:

     Name Country is
0     Jai         US
1  Princi         UK
2  Gaurav         GE
3    Anuj         ET
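The same mask can be built for an arbitrary number of substrings. This is a sketch under the assumption that the substrings should be matched literally (hence `regex=False`); the `keywords` name is hypothetical, and `np.logical_or.reduce` is one of several ways to combine the per-substring masks:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
    'Height of Person': [5.1, 6.2, 5.1, 5.2],
    'Qualification': ['Msc', 'MA', 'Msc', 'Msc'],
    'Country is': ['US', 'UK', 'GE', 'ET'],
})

keywords = ['Name', 'Country']  # hypothetical list of substrings to keep

# One boolean array per substring, each with one entry per column
masks = [df.columns.str.contains(k, regex=False) for k in keywords]

# Element-wise OR across all masks: keep a column if any substring matches
col_filter = np.logical_or.reduce(masks)
result = df.loc[:, col_filter]
print(list(result.columns))
```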

0

I usually use .loc and find it clearer to read.

df = df.loc[:, df.columns.str.contains('Name|Country', regex=True)]
