I have a DataFrame and need to select the rows whose value in the first column appears with both null and non-null values in the second column.
["1"] ["2"]
"A" "Smthng"
"B" "sometext"
"C" NULL
"A" NULL
For this example I should get the rows for A:
["1"] ["2"]
"A" "Smthng"
"A" NULL
I did this, and it works. But maybe you know how to do it faster, ideally in one line.
Here is what I did:
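# Names that have at least one null in column '2'
NamesWithMissing = df[df['2'].isna()]['1'].tolist()
# Of those, keep only the names that also have a non-null value in column '2'
NamesWithMissing = df[(df['1'].isin(NamesWithMissing)) & (df['2'].notnull())]['1'].tolist()
# Select every row for the remaining names
df[df['1'].isin(NamesWithMissing)].sort_values(by="1")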
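For comparison, here is a minimal sketch of the same selection done with a boolean mask instead of intermediate lists. It assumes the NULLs are real missing values (NaN/None) and rebuilds the example data from above; the names is_null, by_name and mask are only illustrative.

```python
import numpy as np
import pandas as pd

# Example data from the question (assumes NULL means a real missing value)
df = pd.DataFrame({
    "1": ["A", "B", "C", "A"],
    "2": ["Smthng", "sometext", np.nan, np.nan],
})

# True where column '2' is null, grouped by the name in column '1'
is_null = df["2"].isna()
by_name = is_null.groupby(df["1"])

# Keep a name if it has some nulls (any) but not only nulls (not all)
mask = by_name.transform("any") & ~by_name.transform("all")

result = df[mask].sort_values(by="1")
print(result)
```

The two transform calls broadcast the per-name answer back to every row, so the mask lines up with df and no isin lookups are needed.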
UPD
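Found an interesting solution. Note that nunique ignores NaN by default, so dropna=False is needed for the null to count as a distinct value:
df.groupby('1').filter(lambda g: (g.nunique(dropna=False) > 1).any())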
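A groupby filter can also test the stated condition directly: the group has at least one null and at least one non-null value in column '2'. Counting distinct values is a looser test, since a group with two different non-null values and no nulls would pass it as well. The sketch below assumes real missing values (NaN/None) and uses the example data from above; has_null_and_value is a hypothetical helper name.

```python
import numpy as np
import pandas as pd

# Example data from the question (assumes NULL means a real missing value)
df = pd.DataFrame({
    "1": ["A", "B", "C", "A"],
    "2": ["Smthng", "sometext", np.nan, np.nan],
})

def has_null_and_value(group):
    """Return True if column '2' of the group holds both a null and a non-null."""
    col = group["2"]
    return bool(col.isna().any() and col.notna().any())

# filter keeps every row of the groups for which the function returns True
result = df.groupby("1").filter(has_null_and_value)
print(result)
```

This is still a single filter call, but it does not depend on how many distinct non-null values a group has.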