I know how to drop a row from a DataFrame when it contains all nulls or a single null, but can you drop a row based on the nulls in a specified set of columns?
For example, say I am working with data containing geographical info (city, latitude, and longitude) in addition to numerous other fields. I want to keep the rows that at a minimum contain a value for city OR for lat and long but drop rows that have null values for all three.
I am having trouble finding functionality for this in the pandas documentation. Any guidance would be appreciated.
dropna() will work incorrectly in this case. Check the row with index 4 in my example. df.dropna(subset=['city','latitude','longitude'], how='all') will drop it...
df.dropna(axis=0, subset=['city', 'longitude', 'latitude'], thresh=2) but in general, you're right, explicit logical statements for what is desired are superior to the dropna solution.
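For anyone comparing the options discussed here, below is a minimal sketch of the explicit-mask approach next to the two dropna variants. The sample DataFrame, its values, and the variable names (has_city, has_coords, kept, and so on) are made up for illustration and are not the example referred to in the comments. It assumes the rule from the question: keep a row if it has a city, or if it has both latitude and longitude.

```python
import pandas as pd

# Hypothetical sample data, invented for this sketch.
df = pd.DataFrame({
    "city": ["Austin", None, None, "Boston", None],
    "latitude": [30.27, 40.71, None, None, 51.51],
    "longitude": [-97.74, -74.01, None, None, None],
    "other": [1, 2, 3, 4, 5],
})

# Explicit rule: a city is present, OR both coordinates are present.
has_city = df["city"].notna()
has_coords = df["latitude"].notna() & df["longitude"].notna()
kept = df[has_city | has_coords]

# dropna variants on the same three columns, for comparison.
cols = ["city", "latitude", "longitude"]
dropna_all = df.dropna(subset=cols, how="all")   # drops only rows where all three are null
dropna_thresh = df.dropna(subset=cols, thresh=2)  # keeps rows with at least 2 non-null values

print(list(kept.index))
print(list(dropna_all.index))
print(list(dropna_thresh.index))
```

In this made-up data, the row with only a latitude survives how='all' even though it satisfies neither condition, and the row with only a city is removed by thresh=2 even though it should be kept. The boolean mask is the only one of the three that encodes the "city OR (lat AND long)" rule directly.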