Given a NumPy array, I want to remove adjacent duplicate non-zero values and all zero values. For instance, given the array [0,0,1,1,1,2,2,0,1,3,3,3], I'd like to get [1,2,1,3]. Do you know how to do this? I only know np.unique(arr), but that removes all duplicate values and keeps the zero. Thank you in advance!
Possible duplicate of "Remove following duplicates in a numpy array" (comment by Georgy, Jul 18, 2019 at 12:52)
4 Answers
You can use the groupby function from itertools combined with a list comprehension for this problem:
from itertools import groupby
[k for k,g in groupby(a) if k!=0]
# [1,2,1,3]
Data:
a = [0,0,1,1,1,2,2,0,1,3,3,3]
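If the input is a NumPy array and an array is wanted back, the same groupby idea can be wrapped roughly as follows. This is only a sketch: the function name dedupe_runs_skip_zeros is made up for illustration, and it assumes a 1-D input. Note that with this approach a zero breaks a run, so [1, 0, 1] gives [1, 1]; an approach that strips the zeros before comparing neighbours would give [1] instead.

```python
from itertools import groupby

import numpy as np


def dedupe_runs_skip_zeros(arr):
    """Collapse runs of equal values, then drop the zero runs (hypothetical helper)."""
    arr = np.asarray(arr)
    # groupby yields one key per run of consecutive equal values
    keys = [k for k, _ in groupby(arr.tolist()) if k != 0]
    # Convert back to an array with the input's dtype
    return np.array(keys, dtype=arr.dtype)


a = np.array([0, 0, 1, 1, 1, 2, 2, 0, 1, 3, 3, 3])
print(dedupe_runs_skip_zeros(a))                    # [1 2 1 3]
print(dedupe_runs_skip_zeros(np.array([1, 0, 1])))  # [1 1]
```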
import numpy as np
a = np.array([0,0,1,1,1,2,2,0,1,3,3,3])
Use integer indexing to choose the non-zero elements
b = a[a.nonzero()]
>>> b
array([1, 1, 1, 2, 2, 1, 3, 3, 3])
Shift the array to the left and add an element to the end to compare each element with its neighbor. Use zero since you know there aren't any in b.
b1 = np.append(b[1:], 0)
>>> b1
array([1, 1, 2, 2, 1, 3, 3, 3, 0])
Use boolean indexing to get the values you want.
c = b[b != b1]
>>> c
array([1, 2, 1, 3])
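The three steps above can be collected into one function. This is a sketch under the assumption of a 1-D input, and the name drop_zeros_then_dedupe is made up for illustration. Because the zeros are removed before neighbours are compared, equal values separated only by zeros are merged, so [1, 0, 1] becomes [1].

```python
import numpy as np


def drop_zeros_then_dedupe(arr):
    """Remove zeros, then collapse adjacent duplicates (hypothetical helper)."""
    arr = np.asarray(arr)
    # Step 1: keep only the non-zero elements
    b = arr[arr.nonzero()]
    # Step 2: shift left by one and pad with 0, which cannot occur in b
    b1 = np.append(b[1:], 0)
    # Step 3: keep each element that differs from its right-hand neighbour
    return b[b != b1]


a = np.array([0, 0, 1, 1, 1, 2, 2, 0, 1, 3, 3, 3])
print(drop_zeros_then_dedupe(a))                    # [1 2 1 3]
print(drop_zeros_then_dedupe(np.array([1, 0, 1])))  # [1]
```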
>>> import numpy as NP
>>> a = NP.array([0,0,1,1,1,2,2,0,1,3,3,3])
First, remove the zeros:
>>> idx = a == 0
>>> a = a[~idx]
>>> a
array([1, 1, 1, 2, 2, 1, 3, 3, 3])
Now remove the consecutive duplicates.
Note that ediff1d(a) and a have different shapes (ediff1d(a) is one element shorter), so a1 is not yet the result; the leading value of a has to be prepended to it, as done in the last three lines below.
>>> idx = NP.array(NP.ediff1d(a), dtype=bool)
>>> a1 = a[1:][idx]
>>> a1
array([2, 1, 3])
Create an empty array to store the result (pass dtype=a.dtype, since NP.empty defaults to float):
>>> a0 = NP.empty(shape=(a1.shape[0]+1,), dtype=a.dtype)
>>> a0[0] = a[0]
>>> a0[1:] = a1
>>> a0
array([1, 2, 1, 3])
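The prepend step can possibly be avoided with the to_begin argument of ediff1d, which puts a value in front of the differences so that the mask has the same length as the array. The following is a sketch rather than the answer's own method; the function name dedupe_with_ediff1d is made up for illustration, and it assumes a 1-D numeric input.

```python
import numpy as np


def dedupe_with_ediff1d(arr):
    """Remove zeros, then keep the first element of each run (hypothetical helper)."""
    arr = np.asarray(arr)
    nz = arr[arr != 0]
    if nz.size == 0:
        # Nothing left after removing zeros; the mask below would not line up
        return nz
    # to_begin=1 marks the first element as "changed", so it is always kept
    changed = np.ediff1d(nz, to_begin=1) != 0
    return nz[changed]


a = np.array([0, 0, 1, 1, 1, 2, 2, 0, 1, 3, 3, 3])
print(dedupe_with_ediff1d(a))  # [1 2 1 3]
```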