I have two arrays that look like this:
x1 = np.array([['a','b','c'],['d','a','b'],['c','a,c','c']])
x2 = np.array(['d','c','d'])
I want to check whether each element of x2 exists in the corresponding column of x1. So I tried:
print((x1==x2).any(axis=0))
#array([ True, False, False])
Note that x2[1] is contained in x1[2,1] (the string 'c' is part of 'a,c'). The problem is that the element we're looking for is sometimes inside an element of x1, where it can only be identified after splitting by comma. So my desired output is:
array([ True, True, False])
Is there a way to do it using a numpy (or pandas) native method?
== work? "Finding entries containing a substring in a numpy array?". Like (np.core.defchararray.find(x1, x2) != -1).any(axis=0). Or does the comma-separated entry need to be split into separate elements that are then tested separately?
'a,c': is that a typo, or do you really want to consider that as two different characters? Because I would say neither 'a' nor 'c' exists in that column, and you should try to clean your data up first. Also, why is your desired result for the third column False? It contains 'c', which is in x2.
'a,c' in an array to represent the two separate characters 'a' and 'c'. I would suggest having them as separate items in the array. If you run into array shape issues, you could pad the smaller arrays with NaNs.
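To make the two options raised in the comments concrete, here is a minimal sketch using the arrays from the question. It assumes np.char.find, the public alias of the np.core.defchararray.find call mentioned above, and it assumes the comma is the only separator used inside x1. The names by_substring, token_hits and by_token are illustrative only. The first approach is a plain substring search; the second splits each cell on commas and tests for an exact token match, which is my own loop-based variant rather than a NumPy-native method.

```python
import numpy as np

x1 = np.array([['a', 'b', 'c'], ['d', 'a', 'b'], ['c', 'a,c', 'c']])
x2 = np.array(['d', 'c', 'd'])

# Approach 1: substring search, with x2 broadcast across the columns of x1.
# np.char.find returns -1 wherever the substring is absent.
by_substring = (np.char.find(x1, x2) != -1).any(axis=0)

# Approach 2: exact match against the comma-separated tokens of each cell.
# This is a plain Python loop (slower), but 'a' will not match inside 'ab'.
token_hits = np.array([
    [target in cell.split(',') for cell, target in zip(row, x2)]
    for row in x1
])
by_token = token_hits.any(axis=0)

print(by_substring)  # [ True  True False]
print(by_token)      # [ True  True False]
```

Both give the desired result here because every token is a single character. If tokens can be longer than one character, the substring version may report false positives, so the split-based version is probably the safer choice.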