3

In Python, numpy.unique can remove all duplicates from a 1D array very efficiently.

1) How can duplicate rows or columns be removed from a 2D array?

2) What about nD arrays?

2
  • Can you illustrate what you are trying to achieve with a simple example? Commented Dec 30, 2012 at 8:49
  • @root One use case is removing duplicate points (2D or 3D) from a point cloud. Commented Dec 30, 2012 at 9:07

3 Answers

5

If possible, I would use pandas.

In [1]: import numpy as np

In [2]: import pandas as pd

In [3]: a = np.array([[1, 1], [2, 3], [1, 1], [5, 4], [2, 3]])

In [4]: pd.DataFrame(a).drop_duplicates().values
Out[4]: 
array([[1, 1],
       [2, 3],
       [5, 4]], dtype=int64)
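
As a rough sketch, the same call can be applied to floating-point 3D points, such as a small point cloud. The array `points` and its values below are made up for illustration. Rows are compared for exact equality, so points that differ only by rounding error are not treated as duplicates.

```python
import numpy as np
import pandas as pd

# Hypothetical 3D point cloud with two repeated points (illustrative values).
points = np.array([
    [0.0, 0.0, 0.0],
    [1.5, 2.0, 3.0],
    [0.0, 0.0, 0.0],
    [1.5, 2.0, 3.0],
    [4.0, 5.0, 6.0],
])

# drop_duplicates keeps the first occurrence of each row, in original order.
unique_points = pd.DataFrame(points).drop_duplicates().to_numpy()
print(unique_points)
```

Because the first occurrence of each row is kept, the original ordering of the points is preserved.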

2 Comments

pandas is not installed here yet. Can you give some benchmarks? BTW, the input array will contain floats, not integers. Try it with over 10k points.
Well, with pandas installed now, its performance is outstanding: for 30k points (3D) plus 10k duplicates (40k total), it takes only 0.2s. Wow!
1

The following is another approach, which performs much better than a for loop: 2s for 10k points plus 100 duplicates.

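def tuples(A):
    # Recursively convert a nested array into hashable nested tuples.
    try:
        return tuple(tuples(a) for a in A)
    except TypeError:
        # Scalars are not iterable; return them unchanged.
        return A

# A set keeps only one copy of each row (order is not preserved).
b = set(tuples(a))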

The idea was inspired by the first part of Waleed Khan's answer. It needs no additional package and may have further applications. It is also quite Pythonic, I guess.
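
A minimal sketch of the full round trip, assuming the result is wanted back as a NumPy array: since a set has no defined order, the rows are sorted before conversion. The sample array `a` is reused from the pandas answer, and the name `unique_rows` is only illustrative.

```python
import numpy as np

def tuples(A):
    """Recursively convert a nested array into hashable nested tuples."""
    try:
        return tuple(tuples(a) for a in A)
    except TypeError:
        # Scalars are not iterable; return them unchanged.
        return A

a = np.array([[1, 1], [2, 3], [1, 1], [5, 4], [2, 3]])

# The set drops duplicate rows; sorting gives a deterministic order.
unique_rows = np.array(sorted(set(tuples(a))))
print(unique_rows)
```

Note that sorting changes the order of the rows relative to the input, unlike the pandas approach, which keeps first occurrences in place.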

1

The numpy_indexed package solves this problem for the n-dimensional case (disclaimer: I am its author). In fact, solving this problem was the motivation for starting this package, but it has grown to include a lot of related functionality.

import numpy as np
import numpy_indexed as npi

a = np.random.randint(0, 2, (3, 3, 3))
print(npi.unique(a))
print(npi.unique(a, axis=1))
print(npi.unique(a, axis=2))
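
For comparison, here is a sketch using plain NumPy rather than numpy_indexed. It assumes NumPy 1.13 or later, where numpy.unique accepts an axis argument. The arrays below are illustrative only.

```python
import numpy as np

a = np.array([[1, 1], [2, 3], [1, 1], [5, 4], [2, 3]])

# Remove duplicate rows (the result is sorted lexicographically).
unique_rows = np.unique(a, axis=0)
print(unique_rows)  # [[1 1] [2 3] [5 4]]

# Remove duplicate columns of the transposed array.
unique_cols = np.unique(a.T, axis=1)
print(unique_cols)

# For an nD array, axis=0 removes duplicate sub-arrays along the first axis.
b = np.stack([a, a + 1, a])  # shape (3, 5, 2); first and last blocks are equal
unique_blocks = np.unique(b, axis=0)
print(unique_blocks.shape)
```

As with the set-based approach, the output is sorted, so the original order of the rows is not preserved.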
