I have a DataFrame similar to the one below:
import pandas as pd
df = pd.DataFrame.from_dict({'cat1': ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'C', 'D', 'D', 'D'], 'cat2': [['X', 'Y'], ['F'], ['X', 'Y'], ['Y'], ['Y'], ['Y'], ['Z'], ['P', 'W'], ['L', 'K'], ['L', 'K'], ['L', 'K']]})
The output is:
    cat1    cat2
0      A  [X, Y]
1      A     [F]
2      A  [X, Y]
3      B     [Y]
4      B     [Y]
5      C     [Y]
6      C     [Z]
7      C  [P, W]
8      D  [L, K]
9      D  [L, K]
10     D  [L, K]
I would like to filter out groups B and D, because every row in each of those groups has the same cat2 value (['Y'] for B and ['L', 'K'] for D).
Desired output:
   cat1    cat2
0     A  [X, Y]
1     A     [F]
2     A  [X, Y]
3     C     [Y]
4     C     [Z]
5     C  [P, W]
I have tried df.groupby(['cat1'])['cat2'].unique(), but it does not work because cat2 is a list column and lists are not hashable.
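For reference, here is a minimal sketch of one possible workaround. It assumes it is acceptable to convert each list to a tuple so the values become hashable, and that "same value" means the same elements in the same order (so ['X', 'Y'] and ['Y', 'X'] would count as different). The names as_tuples, n_distinct and result are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'cat1': ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'C', 'D', 'D', 'D'],
    'cat2': [['X', 'Y'], ['F'], ['X', 'Y'], ['Y'], ['Y'], ['Y'],
             ['Z'], ['P', 'W'], ['L', 'K'], ['L', 'K'], ['L', 'K']],
})

# Lists are unhashable, so convert each one to a tuple before counting.
as_tuples = df['cat2'].apply(tuple)

# Number of distinct cat2 values in each cat1 group, aligned to df's index.
n_distinct = as_tuples.groupby(df['cat1']).transform('nunique')

# Keep only the groups that contain more than one distinct cat2 value.
result = df[n_distinct > 1].reset_index(drop=True)
print(result)
```

The original list column is left untouched; the tuples are only used to build the boolean mask.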
Thank you in advance