2

I have a pandas DataFrame df. I would like to select the columns whose standard deviation is greater than 1. Here is what I tried:

df2 = df[df.std() >1]
df2 = df.loc[df.std() >1] 

Both raised an error. What am I doing wrong?

  • You're trying to select from the row index, not the columns. Use instead: df.loc[:, df.std() > 1] Commented Aug 17, 2015 at 17:20
  • @ajcr Thank you very much, you answered my question. Commented Aug 17, 2015 at 17:30

2 Answers

1

We need to get the list of columns whose values have standard deviation greater than 1.

That list of columns can then be passed to the dataframe to select the relevant data.

Remove any columns of type "object" before building the list, since a standard deviation is only meaningful for numeric columns. The line below gets the list of columns:

df.columns[(df.std() > 1).to_list()]

The line below returns the dataframe with only the selected columns:

df[df.columns[(df.std() > 1).to_list()]]
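As a rough end-to-end sketch of the steps above: the data and column names here are made up for illustration, and select_dtypes(include="number") is just one possible way to drop the "object" columns first, not something the original answer prescribes.

```python
import pandas as pd

# Hypothetical example data; the column names are invented for illustration.
df = pd.DataFrame({
    "a": [1.0, 1.1, 0.9, 1.0],      # small spread, std well below 1
    "b": [0.0, 5.0, 10.0, 15.0],    # large spread, std well above 1
    "label": ["w", "x", "y", "z"],  # object dtype, no meaningful std
})

# Keep only the numeric columns so std() is defined for every column.
numeric = df.select_dtypes(include="number")

# Names of the columns whose standard deviation exceeds 1.
selected = numeric.columns[(numeric.std() > 1).to_list()]

# Sub-frame containing just those columns.
df2 = df[selected]
```

With this data, only column "b" should survive the filter.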

1

Use df.loc[:, df.std() > 1] instead.

The first part, the colon, selects all rows; the second part, df.std() > 1, is a boolean mask that selects the columns.
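A small sketch of the difference, using invented data: df.std() returns a Series indexed by column name, so it lines up with the column axis of .loc but not with the row index that df[...] expects for a boolean mask. The variable names below are only for illustration.

```python
import pandas as pd

# Hypothetical numeric data; the column names are invented for illustration.
df = pd.DataFrame({
    "a": [1.0, 1.1, 0.9, 1.0],
    "b": [0.0, 5.0, 10.0, 15.0],
    "c": [2.0, 2.0, 2.0, 2.0],
})

# Boolean Series indexed by column name: a=False, b=True, c=False.
mask = df.std() > 1

# All rows, only the columns where the mask is True.
df2 = df.loc[:, mask]

# Passing the same mask to df[...] treats it as a row filter; its labels
# ("a", "b", "c") do not match the row index, so the lookup fails.
try:
    df[mask]
    row_selection_failed = False
except Exception:
    row_selection_failed = True
```

The exact exception raised by the failing form may vary between pandas versions, which is why the sketch catches a generic Exception.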
