I have a DataFrame whose columns are supposed to be dummy columns, so each row should have exactly one column populated. However, the data is noisy: some rows have more than one column populated. I want to drop those rows.
Suppose the DataFrame looks like this:
     a    b    c    d
0  NaN    1  NaN  NaN
1    1    2    3    4
2    1    1  NaN  NaN
3  NaN  NaN    1  NaN
4    1  NaN    1  NaN
My expected result is that rows [1, 2, 4] get dropped. In other words, I only want to keep rows where the number of NaN values equals number_of_columns - 1.
Is there any way to do this in pandas?
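For reference, here is a minimal sketch of the criterion described above, assuming "populated" means "not NaN". It rebuilds the example frame and keeps only rows with exactly one non-NaN value. The names `populated` and `clean` are just illustrative, and this may not be the most idiomatic way to do it:

```python
import numpy as np
import pandas as pd

# Rebuild the example DataFrame from the question.
df = pd.DataFrame(
    {
        "a": [np.nan, 1, 1, np.nan, 1],
        "b": [1, 2, 1, np.nan, np.nan],
        "c": [np.nan, 3, np.nan, 1, 1],
        "d": [np.nan, 4, np.nan, np.nan, np.nan],
    }
)

# Count the non-NaN entries in each row (assumption: populated == not NaN).
populated = df.notna().sum(axis=1)

# Keep only the rows where exactly one column is populated.
clean = df[populated == 1]
print(clean)
```

The same mask can be written in terms of NaN counts, as `df.isna().sum(axis=1) == df.shape[1] - 1`, which matches the wording of the condition above.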