
I have this DataFrame and want to keep only the records whose "Total" column is not NaN, and to drop the records where columns A to E contain more than two NaN:

A  B  C  D      E  Total
1  1  3  5      5    8
1  4  3  5      5   NaN
3  6  NaN NaN  NaN   6
2  2  5  9     NaN   8

i.e. something like df.dropna(....) to get this resulting DataFrame:

A  B  C  D      E  Total
1  1  3  5      5    8
2  2  5  9     NaN   8

Here's my code

import pandas as pd

dfInputData = pd.read_csv(path)
dfInputData = dfInputData.dropna(axis=1,how = 'any')
RowCnt = dfInputData.shape[0]

But it looks like no modification has been made, and there isn't even an error.

Please help!! Thanks

1 Answer

Use boolean indexing: count the missing values per row across all columns except Total, and combine that with a check for non-missing values in Total:

df = df[df.drop('Total', axis=1).isna().sum(axis=1).le(2) & df['Total'].notna()]
print (df)
   A  B    C    D    E  Total
0  1  1  3.0  5.0  5.0    8.0
3  2  2  5.0  9.0  NaN    8.0
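To make the one-liner easier to follow, here is a minimal sketch that builds the mask step by step. The DataFrame is reconstructed by hand from the sample in the question (the real data comes from a CSV that is not shown), and the names nan_count, few_missing, has_total and result are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question (assumption: the real CSV looks like this)
df = pd.DataFrame({
    "A": [1, 1, 3, 2],
    "B": [1, 4, 6, 2],
    "C": [3, 3, np.nan, 5],
    "D": [5, 5, np.nan, 9],
    "E": [5, 5, np.nan, np.nan],
    "Total": [8, np.nan, 6, 8],
})

# Number of missing values per row, counted over every column except Total
nan_count = df.drop("Total", axis=1).isna().sum(axis=1)

few_missing = nan_count.le(2)    # True where A..E hold at most two NaN
has_total = df["Total"].notna()  # True where Total is present

# Keep only the rows that satisfy both conditions
result = df[few_missing & has_total]
print(result)
```

Printing nan_count, few_missing and has_total separately is a quick way to check which condition removes a given row.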

Or select the columns from A to E with a label slice:

df = df[df.loc[:, 'A':'E'].isna().sum(axis=1).le(2) & df['Total'].notna()]
print (df)
   A  B    C    D    E  Total
0  1  1  3.0  5.0  5.0    8.0
3  2  2  5.0  9.0  NaN    8.0
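Since the question asks for something like df.dropna(....), the same filter can probably also be written with two dropna calls. This is a sketch on hand-built sample data, not the asker's real CSV; thresh=3 assumes exactly five columns A to E, so "at most two NaN" becomes "at least three non-NaN values". The last lines show what axis=1 does on this sample, which may explain why the code in the question removes columns instead of rows.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question (assumption: the real CSV looks like this)
df = pd.DataFrame({
    "A": [1, 1, 3, 2],
    "B": [1, 4, 6, 2],
    "C": [3, 3, np.nan, 5],
    "D": [5, 5, np.nan, 9],
    "E": [5, 5, np.nan, np.nan],
    "Total": [8, np.nan, 6, 8],
})

result = (
    df.dropna(subset=["Total"])  # drop rows where Total is missing
      .dropna(subset=["A", "B", "C", "D", "E"], thresh=3)  # need >= 3 non-NaN in A..E
)
print(result)

# For comparison: axis=1 works on columns, so every column containing a NaN is dropped
by_column = df.dropna(axis=1, how="any")
print(by_column.columns.tolist())
```

Note that dropna returns a new DataFrame by default, so the result has to be assigned back as done here.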

2 Comments

But I can't output the DataFrame; it only displays the columns before the NaN.
@tongtong - can you be more specific? Can you change the sample data so the problem is visible?
