Keep only columns in Pandas Dataframe based on multiple conditions

Question

Suppose the following dataframe:

import pandas as pd

data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
        'Height of Person': [5.1, 6.2, 5.1, 5.2],
        'Qualification': ['Msc', 'MA', 'Msc', 'Msc'],
        'Country is': ['US', 'UK', 'GE', 'ET']     
       }
df = pd.DataFrame(data)
display(df)

I would like to keep only the columns whose names contain any of several given strings.

E.g., keeping the columns whose names contain "Name" or "Country" should result in:

data2 = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
        'Country is': ['US', 'UK', 'GE', 'ET']   
       }
df2 = pd.DataFrame(data2)
display(df2)

I tried using

df = df.filter(like=["Name"])

but I am not sure how to apply multiple expressions (strings) at once.

mozway · Accepted Answer · 2021-09-01 14:52:35Z

2

If you want to filter by name, you can use filter with a regex:

df.filter(regex='Name|Country')
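
When the substrings come from a list, the pattern can be built programmatically. A minimal sketch, assuming a hypothetical list `keep` of literal substrings; `re.escape` is used so that characters with a special meaning in a regex are matched literally:

```python
import re
import pandas as pd

df = pd.DataFrame({'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
                   'Height of Person': [5.1, 6.2, 5.1, 5.2],
                   'Qualification': ['Msc', 'MA', 'Msc', 'Msc'],
                   'Country is': ['US', 'UK', 'GE', 'ET']})

# Hypothetical list of substrings to look for in the column names
keep = ['Name', 'Country']

# Join the escaped substrings with '|' so the regex matches any of them
pattern = '|'.join(re.escape(s) for s in keep)

# For a DataFrame, filter() matches against the column labels by default
result = df.filter(regex=pattern)
print(result.columns.tolist())
```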

Alex F · Accepted Answer · 2021-09-01 14:49:56Z

0

If you're trying to filter just on columns you can do:

df = df[[x for x in df.columns if x in ['Name', 'Country is']]]
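
Note that `x in [...]` tests for exact column names. If substring matching is wanted, as in the question, the comprehension can check each substring instead. A sketch, where `keep` is an assumed name for the list of substrings:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
                   'Height of Person': [5.1, 6.2, 5.1, 5.2],
                   'Qualification': ['Msc', 'MA', 'Msc', 'Msc'],
                   'Country is': ['US', 'UK', 'GE', 'ET']})

# Hypothetical list of substrings to look for in the column names
keep = ['Name', 'Country']

# Keep a column if any of the substrings occurs in its name
cols = [c for c in df.columns if any(s in c for s in keep)]
result = df[cols]
print(cols)
```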

René · Accepted Answer · 2021-09-01 15:05:13Z

0

This should work:

col_filter = df.columns.str.contains('Name') | df.columns.str.contains('Country')
df.loc[:,col_filter]

Result:

     Name Country is
0     Jai         US
1  Princi         UK
2  Gaurav         GE
3    Anuj         ET
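
With more than two substrings, combining the masks by hand becomes tedious. One possible generalization, assuming a hypothetical list `keep`, is to build one mask per substring and reduce them with a logical OR:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
                   'Height of Person': [5.1, 6.2, 5.1, 5.2],
                   'Qualification': ['Msc', 'MA', 'Msc', 'Msc'],
                   'Country is': ['US', 'UK', 'GE', 'ET']})

# Hypothetical list of substrings to look for in the column names
keep = ['Name', 'Country']

# One boolean mask per substring; regex=False treats each one literally
masks = [df.columns.str.contains(s, regex=False) for s in keep]

# A column is kept if at least one mask is True for it
col_filter = np.logical_or.reduce(masks)
result = df.loc[:, col_filter]
print(result.columns.tolist())
```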

Metro (edited by mozway) · Accepted Answer · 2021-09-01 15:15:03Z

0

I usually use .loc and find it clearer to read.

df = df.loc[:, df.columns.str.contains('Name|Country', regex=True)]
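
For the example data, the `.loc` mask and the `filter` approach select the same columns. A short check, offered only as a sketch on the question's sample frame:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
                   'Height of Person': [5.1, 6.2, 5.1, 5.2],
                   'Qualification': ['Msc', 'MA', 'Msc', 'Msc'],
                   'Country is': ['US', 'UK', 'GE', 'ET']})

# Boolean mask over the column labels, then positional selection with .loc
via_loc = df.loc[:, df.columns.str.contains('Name|Country', regex=True)]

# Same pattern passed directly to filter()
via_filter = df.filter(regex='Name|Country')

# DataFrame.equals compares shape, labels, and values
same = via_loc.equals(via_filter)
print(same)
```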

answered Sep 1, 2021 at 14:56 by Metro; edited Sep 1, 2021 at 15:15 by mozway
