Eliminate/Remove duplicates from array Matlab

Question

How can I remove every number that has a duplicate from an array?

For example:

b =[ 1 1 2 3 3 5 6]

becomes

b =[ 2 5 6]

rahnema1 · Accepted Answer · 2016-11-03 15:36:06Z


Use the unique function to extract the unique values, then compute a histogram of the data over those values and keep only the values whose count is 1.

a = [1 1 2 3 3 5 6];
u = unique(a);            % sorted unique values
idx = hist(a, u) == 1;    % count each unique value; flag counts of 1
b = u(idx)

Result:

  2 5 6

For multi-column input, the same idea can be applied to whole rows:

a = [1 2; 1 2; 1 3; 2 1; 1 3; 3 5; 3 6; 5 9; 6 10];
[u, ~, uid] = unique(a, 'rows');      % uid maps each row of a to its row in u
idx = hist(uid, 1:size(u,1)) == 1;    % flag unique rows that occur exactly once
b = u(idx, :)
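The approach above returns the surviving values in sorted order. If the original order matters, one possible variant is sketched below. It is an assumption-laden alternative rather than part of the original answer: it swaps hist for accumarray and relies on the 'stable' option of unique, and the sample vector is made up for illustration.

```matlab
% Hypothetical input in non-sorted order (not from the question).
a = [5 1 1 6 3 3 2];

% 'stable' keeps unique values in order of first appearance;
% uid maps each element of a to its position in u.
[u, ~, uid] = unique(a, 'stable');

% Count how often each unique value occurs.
counts = accumarray(uid(:), 1);

% Keep the values that occur exactly once, in their original order.
b = u(counts.' == 1)
```

For this input, b should come out as 5 6 2, i.e. the non-duplicated values in the order they appear in a.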

edited Nov 3, 2016 at 15:36

answered Nov 3, 2016 at 3:45


5 Comments

hgaarous Over a year ago

Thanks rahnema1. What if I have two or more columns? For A = [1 2; 1 2; 1 3; 2 1; 1 3; 3 5; 3 6; 5 9; 6 10], the result should be [2 1; 3 5; 3 6; 5 9; 6 10].

Sardar Usama Over a year ago

If a is in random order and the same order is to be preserved, we are out of luck here since it gives the sorted answer. :(

rahnema1 Over a year ago

@SardarUsama OK, you are right. You may post your solution as an answer.

Sardar Usama Over a year ago

I haven't thought of a solution yet. I found your answer while searching for a dupe target for this question. One of the answers there is valid for the problem I described.

rahnema1 Over a year ago

Thanks! Good suggestion!

m7913d · Accepted Answer · 2017-10-10 13:40:39Z

You can first sort the elements and then remove every element that has the same value as one of its neighbors, as follows:

A_sorted = sort(A); % sort elements (A is assumed to be a row vector)
A_diff = diff(A_sorted)~=0; % check if each element differs from the next one
A_unique = [A_diff true] & [true A_diff]; % check if each element differs from both the previous and the next one
A = A_sorted(A_unique); % keep only the elements that occur exactly once
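To make the neighbor comparison concrete, here is a small trace of the same steps on the vector from the question. The intermediate variable names (differs_from_next, differs_from_prev, keep, result) are introduced only for illustration.

```matlab
A = [1 1 2 3 3 5 6];

A_sorted = sort(A);                  % [1 1 2 3 3 5 6]
A_diff = diff(A_sorted) ~= 0;        % [0 1 1 0 1 1]

% Pad with true so the last element has a "next" and the first a "previous".
differs_from_next = [A_diff true];   % [0 1 1 0 1 1 1]
differs_from_prev = [true A_diff];   % [1 0 1 1 0 1 1]

% An element survives only if it differs from both neighbors.
keep = differs_from_next & differs_from_prev;   % [0 0 1 0 0 1 1]
result = A_sorted(keep)              % 2 5 6
```

Because the row is sorted, equal values are always adjacent, so comparing each element with its two neighbors is enough to detect every duplicate.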

Benchmark

I will benchmark my solution with the other provided solutions, i.e.:

using diff (my solution)
using hist (rahnema1)
using sum (Jean Logeart)
using unique (my alternative solution)

I will use two cases:

small problem (yours): A = [1 1 2 3 3 5 6];

larger problem

rng('default');
A= round(rand(1, 1000) * 300);

Result:

                 | Small (s)  | Large (s)  | Comments
-----------------|------------|------------|----------------
 using `diff`    | 6.4080e-06 | 6.2228e-05 | Fastest method for large problems
 using `unique`  | 6.1228e-05 | 2.1923e-04 | Good performance
 using `sum`     | 5.4352e-06 | 0.0020     | Only fast for small problems, preserves the original order
 using `hist`    | 8.4408e-05 | 1.5691e-04 | Good performance

My solution (using diff) is the fastest method for somewhat larger problems. Jean Logeart's solution using sum is slightly faster on the small problem but by far the slowest on the large one, while mine is almost as fast on the small problem.

Conclusion: In general, my proposed solution using diff is the fastest method.

timeit(@() usingDiff(A))
timeit(@() usingUnique(A))
timeit(@() usingSum(A))
timeit(@() usingHist(A))

function A = usingDiff (A)
    A_sorted = sort(A);
    A_unique = [diff(A_sorted)~=0 true] & [true diff(A_sorted)~=0];
    A = A_sorted(A_unique);
end

function A = usingUnique (A)
    [~, ia1] = unique(A, 'first');
    [~, ia2] = unique(A, 'last');
    A = A(ia1(ia1 == ia2));
end

function A = usingSum (A)
    % A==A' relies on implicit expansion (R2016b or later); A must be a row vector
    A = A(sum(A==A') == 1);
end

function A = usingHist (A)
    u = unique(A);
    A = u(hist(A, u) ==1);
end
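Before comparing timings, it can be worth confirming that the methods return the same values. The following is a minimal sketch of such a check, written inline so that it runs on its own; the variable names (viaDiff, viaHist, viaUnique, viaSum) are illustrative, and the sum variant assumes R2016b or later for implicit expansion. Since the sum method preserves the original order while the others return sorted output, its result is sorted before comparison.

```matlab
rng('default');
A = round(rand(1, 1000) * 300);

% diff-based method
s = sort(A);
d = diff(s) ~= 0;
viaDiff = s([d true] & [true d]);

% hist-based method
u = unique(A);
viaHist = u(hist(A, u) == 1);

% unique-based method (first and last occurrence coincide)
[~, i1] = unique(A, 'first');
[~, i2] = unique(A, 'last');
viaUnique = A(i1(i1 == i2));

% sum-based method (keeps the original order)
viaSum = A(sum(A == A') == 1);

% All methods should agree once ordering is accounted for.
assert(isequal(viaDiff, viaHist));
assert(isequal(viaDiff, viaUnique));
assert(isequal(viaDiff, sort(viaSum)));
```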
