I have the following dataframe:
import pandas as pd
import numpy as np
ds = pd.DataFrame({'z': np.random.binomial(n=1, p=0.5, size=10),
                   'x': np.random.binomial(n=1, p=0.5, size=10),
                   'u': np.random.binomial(n=1, p=0.5, size=10),
                   'y': np.random.binomial(n=1, p=0.5, size=10)})
ds
   z  x  u  y
0  0  1  0  0
1  0  1  1  1
2  1  1  1  1
3  0  0  1  1
4  0  0  1  1
5  0  0  0  0
6  1  0  1  1
7  0  1  1  1
8  1  1  0  0
9  0  1  1  1
How do I select the rows where the columns named in a list take the values (0,1), i.e. z == 0 and x == 1 for the list ['z','x']?
This is what I have thus far:
zs = ['z','x']
tf = ds[ds[zs].values == (0,1)]
tf
That prints:
   z  x  u  y
0  0  1  0  0
0  0  1  0  0
1  0  1  1  1
1  0  1  1  1
2  1  1  1  1
3  0  0  1  1
4  0  0  1  1
5  0  0  0  0
7  0  1  1  1
7  0  1  1  1
8  1  1  0  0
9  0  1  1  1
9  0  1  1  1
This output contains duplicate rows and also includes rows that do not match, such as row 2 (1,1,1,1). Any thoughts or ideas? I assume there is a Pythonic way to do this without nested loops or brute force.
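A possible direction, offered as a sketch rather than a definitive answer: the comparison ds[zs].values == (0,1) is element-wise, so it yields a 10x2 boolean array (one flag per cell) instead of one flag per row. That would explain why a row appears once for each matching cell. Reducing the array with .all(axis=1) keeps only the rows where every listed column matches. The names target and mask below are illustrative, and the data is hard-coded from the table above so the result is reproducible.

```python
import numpy as np
import pandas as pd

# Fixed data copied from the table in the question (instead of random draws)
ds = pd.DataFrame({
    'z': [0, 0, 1, 0, 0, 0, 1, 0, 1, 0],
    'x': [1, 1, 1, 0, 0, 0, 0, 1, 1, 1],
    'u': [0, 1, 1, 1, 1, 0, 1, 1, 0, 1],
    'y': [0, 1, 1, 1, 1, 0, 1, 1, 0, 1],
})

zs = ['z', 'x']
target = (0, 1)  # required values, in the same order as zs

# Element-wise comparison gives a (10, 2) boolean array;
# .all(axis=1) collapses it to a single boolean per row.
mask = (ds[zs].values == target).all(axis=1)

tf = ds[mask]
print(tf)
```

With this data, only the rows where z is 0 and x is 1 are kept, each exactly once, and the original index labels are preserved.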