How do I remove rows from a NumPy array when one column (field) contains duplicate values?

For example, I have an array like this:

vals = numpy.array([[1,2,3],[1,5,6],[1,8,7],[0,4,5],[2,2,1],[0,0,0],[5,4,3]])

array([[1, 2, 3],
       [1, 5, 6],
       [1, 8, 7],
       [0, 4, 5],
       [2, 2, 1],
       [0, 0, 0],
       [5, 4, 3]])

I need to remove the rows with duplicate values in field [0], so that I get a result like:

array([[1, 2, 3],
       [0, 4, 5],
       [2, 2, 1],
       [0, 0, 0],
       [5, 4, 3]])
  • Does this have to be done as an array operation, or would you accept it being performed using lists? The latter is much easier to implement, and for small arrays the performance difference will be negligible. Commented Mar 8, 2016 at 17:58
  • Should the row [0, 0, 0] be removed too? Commented Mar 8, 2016 at 18:13

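1 Answer

You can use numpy.unique with return_index=True, which also returns the index of the first occurrence of each unique value (np below is the usual alias from import numpy as np):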

In [11]: vals
Out[11]: 
array([[1, 2, 3],
       [1, 5, 6],
       [1, 8, 7],
       [0, 4, 5],
       [2, 2, 1],
       [0, 0, 0],
       [5, 4, 3]])

In [12]: unique_keys, indices = np.unique(vals[:,0], return_index=True)

In [13]: vals[indices]
Out[13]: 
array([[0, 4, 5],
       [1, 2, 3],
       [2, 2, 1],
       [5, 4, 3]])

The rows above come back sorted by key. To maintain the original row order, sort the indices before indexing:

In [17]: vals[np.sort(indices)]
Out[17]: 
array([[1, 2, 3],
       [0, 4, 5],
       [2, 2, 1],
       [5, 4, 3]])
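
For reuse outside an interactive session, the two steps above can be wrapped in a small function. This is only a sketch: the name drop_duplicate_keys and its col parameter are made up for illustration and are not part of NumPy, and it assumes a regular 2-D array.

```python
import numpy as np

def drop_duplicate_keys(arr, col=0):
    """Keep the first row for each distinct value in column `col`.

    Hypothetical helper (not a NumPy function); assumes `arr` is 2-D.
    """
    # return_index=True gives the index of the first occurrence of each key
    _, first_idx = np.unique(arr[:, col], return_index=True)
    # Sorting the indices restores the original row order
    return arr[np.sort(first_idx)]

vals = np.array([[1, 2, 3], [1, 5, 6], [1, 8, 7], [0, 4, 5],
                 [2, 2, 1], [0, 0, 0], [5, 4, 3]])
result = drop_duplicate_keys(vals)
print(result)
```

Note that the row [0, 0, 0] is dropped as well, because its key 0 already appears in [0, 4, 5].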
3 Comments

What if two columns have duplicates? How should the code be changed then? For example, based on the first and second columns: if both the first-column and second-column values are duplicated, remove one of the duplicates.
I actually just tried it on my NumPy array. It gave me an error saying "too many indices for array". My array has about a million records.
Re: two columns... Please edit the question to include all the relevant details. Re: error... Edit the question to include a simple example that demonstrates the error. Include a copy of the complete traceback (i.e. the full error message) in the question.
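
On the two-column case raised above, one possible approach (a sketch, not something given in the answer) is to pass the key columns to numpy.unique with axis=0, so that whole (first, second) pairs are compared. The axis argument needs NumPy 1.13 or later; the function name drop_duplicate_key_pairs, its cols parameter, and the sample data are hypothetical.

```python
import numpy as np

def drop_duplicate_key_pairs(arr, cols=(0, 1)):
    """Keep the first row for each distinct combination of values in `cols`.

    Hypothetical helper (not a NumPy function); assumes `arr` is 2-D.
    """
    keys = arr[:, list(cols)]
    # axis=0 treats each row of `keys` as one item to compare
    _, first_idx = np.unique(keys, axis=0, return_index=True)
    # Sorting the indices keeps the surviving rows in their original order
    return arr[np.sort(first_idx)]

# Made-up sample data: rows 0/1 and rows 3/4 share the same first two columns
pairs = np.array([[1, 2, 3], [1, 2, 9], [1, 5, 6],
                  [0, 4, 5], [0, 4, 7], [2, 2, 1]])
deduped = drop_duplicate_key_pairs(pairs)
print(deduped)
```

Indexing such as arr[:, 0] only works on an array with at least two dimensions; on a 1-D array it raises an IndexError about too many indices, so checking arr.ndim first may help narrow down the error mentioned above.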