I've been trying to take a pandas.DataFrame and drop its rows and columns with missing values simultaneously. When I tried calling dropna on both axes at once, I found out that this is no longer supported. I then tried using dropna to drop the columns first and then the rows, and vice versa, but the results differ because the second call no longer sees the initial state of the frame. To give an example, I receive:

pandas.DataFrame({"name": ['Alfred', 'Batman', 'Catwoman'],
                   "toy": [numpy.nan, 'Batmobile', 'Bullwhip'],
                   "weapon": [numpy.nan, 'Boomerang', 'Gun']})

and want to return:

pandas.DataFrame({"name": ['Batman', 'Catwoman']})

Any help will be appreciated.

1 Answer

Build a boolean mask of non-missing values with DataFrame.notna, then use DataFrame.all to test whether every value in each row and in each column is present, and select with DataFrame.loc:

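# df is the DataFrame from the question
m = df.notna()
# keep rows with no missing values and columns with no missing values,
# both evaluated against the original frame
df0 = df.loc[m.all(axis=1), m.all(axis=0)]
print(df0)
       name
1    Batman
2  Catwoman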
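
To see why chaining two dropna calls behaves differently from the mask approach, here is a minimal sketch on the frame from the question. The variable names `both`, `rows_first` and `cols_first` are only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"name": ['Alfred', 'Batman', 'Catwoman'],
                   "toy": [np.nan, 'Batmobile', 'Bullwhip'],
                   "weapon": [np.nan, 'Boomerang', 'Gun']})

# Both masks are computed from the original frame, so neither axis
# influences the other.
m = df.notna()
both = df.loc[m.all(axis=1), m.all(axis=0)]

# Chained calls: the second dropna only sees what the first one left.
rows_first = df.dropna(axis=0).dropna(axis=1)   # keeps toy and weapon
cols_first = df.dropna(axis=1).dropna(axis=0)   # keeps the Alfred row

print(both.shape, rows_first.shape, cols_first.shape)
```

Dropping rows first removes the only row with missing values, so no column is dropped afterwards. Dropping columns first removes toy and weapon, so no row is dropped afterwards. Only the mask version drops along both axes.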

2 Comments

How would you apply this solution if it's needed to drop columns and rows only where all their values are missing?
@EliranZeitouni - simplest is df = df.dropna(how='all', axis=0) for rows or df = df.dropna(how='all', axis=1) for columns
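
If both rows and columns should be dropped only where all their values are missing, the same mask idea works with DataFrame.any in place of DataFrame.all. A minimal sketch, assuming a small made-up frame (the data and the names `kept` and `chained` are hypothetical, not from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 1 and column "b" are entirely missing.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0],
                   "b": [np.nan, np.nan, np.nan],
                   "c": [4.0, np.nan, np.nan]})

# Keep rows and columns that have at least one non-missing value.
m = df.notna()
kept = df.loc[m.any(axis=1), m.any(axis=0)]

# The same selection with the two dropna calls chained.
chained = df.dropna(how='all', axis=0).dropna(how='all', axis=1)

print(kept)
```

On this frame both versions keep rows 0 and 2 and columns a and c.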