Dropping Pandas Dataframe Columns based on column name

Question

I have a Pandas dataframe df with extraneous information. The extraneous information is stored in the columns whose names contain "PM". I would like to remove these columns, but I'm not sure how. Below is my attempt. However, I received this error message: AttributeError: 'numpy.float64' object has no attribute 'PM'. I'm not sure how to interpret this error message. I also don't understand why numpy is mentioned in the message, since the dataframe df is a pandas object.

for j in range(0,len(df.columns)-1):
 df.iloc[0,j].str.contains("PM"):
   df.drop(j, axis=1)

AttributeError: 'numpy.float64' object has no attribute 'PM'

Can you add some small data sample, 3 columns, 3 rows?

– jezrael, Commented Oct 19, 2021 at 5:53

EBDS · Accepted Answer · 2021-10-19 06:04:18Z

1

Using an empty dataframe as an example:

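import pandas as pd

# Empty dataframe: only the column names matter for this example
df = pd.DataFrame(columns=['a','b','ABCPMYXZ','QWEPMQWE','c','d'])
df
# Keep only the columns whose name does not contain 'PM'
df = df[[col for col in df.columns if 'PM' not in col]]
df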
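A variant of the same idea, sketched on a small dataframe with made-up data (the column names and values below are illustrative and not taken from the question). It builds a boolean mask with `df.columns.str.contains`, which works because the `.str` accessor exists on the column index. A single cell such as `df.iloc[0, j]` is a scalar (here a `numpy.float64`) and has no such accessor, which is probably why the loop in the question raised an AttributeError.

```python
import pandas as pd

# Illustrative data only; the real df is not shown in the question.
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0],
    "ABCPMXYZ": [0.1, 0.2, 0.3],
    "b": [4.0, 5.0, 6.0],
})

# True for every column whose name contains the literal text "PM".
has_pm = df.columns.str.contains("PM", regex=False)

# Keep all rows, and only the columns where the mask is False.
cleaned = df.loc[:, ~has_pm]
print(list(cleaned.columns))
```

The result is assigned to a new name (`cleaned`, a hypothetical variable) so the original `df` stays untouched.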

answered Oct 19, 2021 at 6:04

EBDS

Linus Unnebäck · Accepted Answer · 2021-10-19 15:09:12Z

1

As I understand it, you want to delete columns, so first store all the column names in a list. Next, remove from the list every name that does not contain PM, so that only the columns to be dropped remain. Finally, drop those columns.

# Collect the names of the columns that contain 'PM'
columns = [col for col in df.columns if 'PM' in col]
# Drop them in place; columns= already implies axis=1
df.drop(columns=columns, inplace=True)

edited Oct 19, 2021 at 15:09

Linus Unnebäck

answered Oct 19, 2021 at 6:05

Himanshu Pingulkar

2 Comments

joanis Over a year ago

Be careful when removing elements from a list while iterating over it: this often leads to incorrect results, in particular when two consecutive elements should be removed.

Himanshu Pingulkar Over a year ago

Yes, thanks for pointing it out.
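The pitfall mentioned in the comment above can be reproduced with a plain Python list. This is a minimal sketch with made-up names (`names`, `buggy`, and `safe` are hypothetical). It shows how removing items during iteration skips the second of two consecutive matches, and how building a new list avoids the problem.

```python
names = ["a", "PM1", "PM2", "b"]

# Buggy: removing "PM1" shifts "PM2" into the slot the loop has
# already passed, so "PM2" is never examined.
buggy = list(names)
for name in buggy:
    if "PM" in name:
        buggy.remove(name)

# Safe: build a new list instead of mutating the one being iterated.
safe = [name for name in names if "PM" not in name]

print(buggy)
print(safe)
```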

U13-Forward · Accepted Answer · 2021-10-19 06:26:49Z

0

Use regex with filter:

df.filter(regex='^((?!PM).)*$')

This is the shortest solution here.
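To see what the pattern does, here is a small sketch that reuses the column names from the first answer. The negative lookahead `(?!PM)` is checked before every character, so a name matches only if "PM" does not start at any position in it. Note that `filter` returns a new dataframe, so the result has to be assigned (`kept` is a hypothetical name).

```python
import pandas as pd

df = pd.DataFrame(columns=["a", "b", "ABCPMYXZ", "QWEPMQWE", "c", "d"])

# For a DataFrame, filter() applies the regex to the column labels.
# ^((?!PM).)*$ matches a name only if "PM" appears nowhere in it.
kept = df.filter(regex=r"^((?!PM).)*$")
print(list(kept.columns))
```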

edited Oct 19, 2021 at 6:26

answered Oct 19, 2021 at 6:06

U13-Forward

1 Comment

U13-Forward Over a year ago

@jezrael No time. Edited it out
