I am trying to filter an array so that every row whose first-column value is unique is removed. My array looks like this,

[['Aaple' 'Red']
 ['Aaple' '0.0']
 ['Banana' 'Yellow']
 ['Banana' '0.0']
 ['Orange' 'Orange']
 ['Pear' 'Yellow']
 ['Pear' '0.0']
 ['Strawberry' 'Red']]

I want it to look like this,

[['Aaple' 'Red']
 ['Aaple' '0.0']
 ['Banana' 'Yellow']
 ['Banana' '0.0']
 ['Pear' 'Yellow']
 ['Pear' '0.0']]

That is, any row whose first-column value occurs only once should be removed. My current code looks like this,

import numpy as np

arr = np.array(["Aaple", "Pear", "Banana"])

arr2 = np.array([["Strawberry", "Red"], ["Aaple", "Red"], ["Orange", "Orange"], ["Pear", "Yellow"], ["Banana", "Yellow"]])

# Pair each name in arr with a zero, stack the result under arr2, then sort by the first column
arr = arr.reshape(-1,1)
zero_arr = np.zeros((len(arr), 1))
arr = np.column_stack((arr, zero_arr))
combine = np.vstack((arr2, arr))
sort = combine[combine[:,0].argsort()]
# sort is the first array printed above

I was able to get ['Aaple' 'Banana' 'Pear'], the first-column values of the rows I want to keep, by adding x = sort[:-1][sort[1:] == sort[:-1]]. What would be the next steps?

1 Answer

It may be easier to use pandas:

import pandas as pd

# Keep only the rows whose value in column 'a' occurs more than once
df = pd.DataFrame(sort, columns=list('ab'))
df[df.groupby('a').a.transform('count')>1].values

Result:

array([['Aaple', 'Red'],
       ['Aaple', '0.0'],
       ['Banana', 'Yellow'],
       ['Banana', '0.0'],
       ['Pear', 'Yellow'],
       ['Pear', '0.0']], dtype=object)
2 Comments

I am trying to do this all in NumPy. I have a large dataset and using pandas takes too long. Is it not possible using NumPy? @stef
@AndrewHorowitz it's probably possible in pure NumPy, since pandas uses NumPy under the hood, but I don't think it'll be much faster than pandas, because the operations involved here are already carried out on NumPy arrays (maybe I'm wrong)
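
For reference, the pure-NumPy route raised in the comments could look something like the sketch below. It is not taken from the answer and no timing against pandas is claimed. It assumes the array is the sorted `sort` array from the question (rebuilt here by hand so the snippet runs on its own) and uses `np.unique` with `return_inverse` and `return_counts` to count how often each first-column value occurs; the names `mask` and `result` are illustrative.

```python
import numpy as np

# The sorted array from the question, written out by hand so this runs standalone.
sort = np.array([
    ["Aaple", "Red"], ["Aaple", "0.0"],
    ["Banana", "Yellow"], ["Banana", "0.0"],
    ["Orange", "Orange"],
    ["Pear", "Yellow"], ["Pear", "0.0"],
    ["Strawberry", "Red"],
])

# inverse[i] is the index of row i's key among the distinct keys;
# counts[j] is how many rows share the j-th distinct key.
_, inverse, counts = np.unique(sort[:, 0], return_inverse=True, return_counts=True)

# A row is kept when its first-column value occurs more than once.
mask = counts[inverse] > 1
result = sort[mask]
print(result)
```

Because the mask is built from counts rather than from neighbouring rows, this approach should not depend on the array being sorted first; the original row order is preserved in `result`.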
