4

I want to count how many of the inner arrays (rows) of a 2D numpy array differ from the corresponding rows of another array. The solution should not use a list comprehension. Something along these lines (note that a and b differ only in the last row):

import numpy as np

a = np.array([[1,1,1],[2,2,2],[3,3,3],[4,4,4],[5,5,5]])
b = np.array([[1,1,1],[2,2,2],[3,3,3],[4,4,4],[5,0,0]])
y = diff_count(a, b)  # diff_count is the function I am looking for
print(y)

>> 1
1 Comment

Why 1? Two elements are different. Commented Apr 12, 2018 at 8:04

4 Answers

2

Approach #1

Compare the arrays element-wise for inequality, reduce with ANY along the last axis to get one boolean per row, and finally count the True values -

(a!=b).any(-1).sum()
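To make the three steps of this one-liner visible, here is a minimal sketch that splits it up using the arrays from the question. The intermediate names (elementwise, row_differs, count) are only illustrative and do not appear in the original answer.

```python
import numpy as np

a = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4], [5, 5, 5]])
b = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4], [5, 0, 0]])

# Step 1: element-wise inequality, boolean array of shape (5, 3)
elementwise = a != b

# Step 2: ANY reduction along the last axis, one boolean per row, shape (5,)
row_differs = elementwise.any(axis=-1)

# Step 3: count the rows flagged as different
count = int(row_differs.sum())
print(count)  # 1
```

Only the last row differs, so the count is 1 even though two individual elements changed.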

Approach #2

A probably faster variant that uses np.count_nonzero to count the booleans -

np.count_nonzero((a!=b).any(-1))

Approach #3

A faster one with views, where each row is viewed as a single void-dtype element so that whole rows are compared at once -

# https://stackoverflow.com/a/45313353/ @Divakar
def view1D(a, b): # a, b are arrays
    a = np.ascontiguousarray(a)
    b = np.ascontiguousarray(b)
    void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
    return a.view(void_dt).ravel(),  b.view(void_dt).ravel()

a1D,b1D = view1D(a,b)
out = np.count_nonzero(a1D!=b1D)
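As a rough illustration of what the view does, the sketch below repeats the steps of view1D on a small array and inspects the result. The example array and the int64 dtype are assumptions chosen for the illustration, not part of the original answer.

```python
import numpy as np

# Small example array; int64 is chosen explicitly so the item sizes are predictable
a = np.array([[1, 1, 1], [2, 2, 2], [5, 5, 5]], dtype=np.int64)
a = np.ascontiguousarray(a)

# One void element spans all the bytes of a row: 3 columns * 8 bytes each
void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
a1D = a.view(void_dt).ravel()

print(a.shape, a.dtype.itemsize)      # (3, 3) 8
print(a1D.shape, a1D.dtype.itemsize)  # (3,) 24
```

Each row collapses into one 24-byte element, so a comparison between two such 1D views works on whole rows. Because the comparison is byte-wise, both arrays need the same dtype and the same number of columns.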

Benchmarking

In [32]: np.random.seed(0)
    ...: m,n = 10000,100
    ...: a = np.random.randint(0,9,(m,n))
    ...: b = a.copy()
    ...: 
    ...: # Let's set 10% of rows as different ones
    ...: b[np.random.choice(len(a), len(a)//10, replace=False)] = 0

In [33]: %timeit (a!=b).any(-1).sum() # app#1 from this soln
    ...: %timeit np.count_nonzero((a!=b).any(-1)) # app#2
    ...: %timeit np.any(a - b, axis=1).sum() # @Graipher's soln
1000 loops, best of 3: 1.14 ms per loop
1000 loops, best of 3: 1.08 ms per loop
100 loops, best of 3: 2.33 ms per loop

In [34]: %%timeit  # app#3
    ...: a1D,b1D = view1D(a,b)
    ...: out = np.count_nonzero((a1D!=b1D).any(-1))
1000 loops, best of 3: 797 µs per loop

1

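If you want an element-wise comparison, you can flatten both arrays with np.ravel():

(a.ravel()!=b.ravel()).sum()

This gives 2 for the example. The following line also gives 2 here, but it counts the columns that contain at least one difference rather than the individual differing elements:

(a-b).any(axis=0).sum()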

If you want a row-wise comparison, you can use:

(a-b).any(axis=1).sum()

This gives 1 as output.
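The difference between the possible counts can be checked side by side. This is a small sketch on the arrays from the question; the variable names are only illustrative, and it uses != instead of subtraction, which gives the same result for this data.

```python
import numpy as np

a = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4], [5, 5, 5]])
b = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4], [5, 0, 0]])

diff = a != b

n_elements = int(diff.sum())          # individual elements that differ
n_rows = int(diff.any(axis=1).sum())  # rows with at least one difference
n_cols = int(diff.any(axis=0).sum())  # columns with at least one difference

print(n_elements, n_rows, n_cols)  # 2 1 2
```

The element count and the column count agree here only because both differing elements sit in the same row.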

1 Comment

Outputs 2. I think the OP wants to compare on a per-row basis and not element-wise.
0

You can use numpy.any for this:

y = np.any(a - b, axis=1).sum()

0

Would this work?

y = sum((a[i] != b[i]).any() for i in range(len(a)))

Sorry that I can’t test this myself right now.
