Python: Comparing all elements of two arrays and modifying 2nd array

Question

New to Python, and have been learning about arrays. I am stuck with a simple enough problem and need a solution. I have two arrays:

a = [2.0, 5.1, 6.2, 7.9, 23.0]     # always increasing
b = [5.1, 5.5, 5.7, 6.2, 00.0]     # also always increasing

and I want the resultant array to be:

c = [0.0, 5.1, 6.2, 0.0, 0.0]      # 5.5, 5.7, 00.0 from 'b' were dropped and rearranged such that position of equivalent elements as in 'a' are maintained

I have compared both 'a' & 'b' using Numpy as in:

y = np.isclose(a, b)
print(y)
# [False False False False False]

(Alternately,) I also tried something like this, which isn't the right way (I think):

c = np.zeros(len(a))
for i in range (len(a)):
    for j in range (len(a)):
        err = abs(a[i]-b[j])
        if err == 0.0 or err < abs(1):
            print (err, a[i], b[j], i, j)
        else:
            print (err, a[i], b[j], i, j)

How do I proceed from here towards obtaining 'c'?

It isn't affecting the result. Putting atol=0.5 gives [False True True False False] which is c bool-wise.
– rNov, Commented Feb 28, 2016 at 12:47
Do I simply copy element values from a at position of True values? Or is there a better way of doing it?
– rNov, Commented Feb 28, 2016 at 12:49
Do equivalent values have to be at equal indices? Or would a=[5,6,7]; b=[0,0,5] give c=[5,0,0]? (Your comment in your 2nd code snippet is not clear to me.)
– Norman, Commented Feb 28, 2016 at 13:22
Yes equal values at equal indices, but since the values are in ascending order (always increasing), sorting will be easier. 2nd snippet is an alternate way.
– rNov, Commented Feb 28, 2016 at 13:30

hruske · Accepted Answer · 2016-02-28 21:25:03Z

4

These solutions also work when the arrays have different sizes.

Simple version

c = []

for i in a:
    if any(np.isclose(i, b)):
        c.append(i)
    else:
        c.append(0.0)

Numpy version

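import numpy as np

a = np.asarray(a)               # .shape below needs arrays, not lists
b = np.asarray(b)
aa = np.tile(a, (len(b), 1))    # shape (len(b), len(a))
bb = np.tile(b, (len(a), 1))    # shape (len(a), len(b))
cc = np.isclose(aa, bb.T)       # cc[j, i] is True where a[i] is close to b[j]
c = np.zeros(shape=a.shape)
result = np.where(np.any(cc, 0), a, c)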

Explained:

This is a matrix comparison. First, expand both arrays into matrices. Each array is tiled as many times as the other array is long, so aa has shape (len(b), len(a)) and bb has shape (len(a), len(b)); once bb is transposed, the two matrices have the same shape:

aa = np.tile(a, (len(b), 1))
bb = np.tile(b, (len(a), 1))

They look like this:

# aa
array([[  2. ,   5.1,   6.2,   7.9,  23. ],
       [  2. ,   5.1,   6.2,   7.9,  23. ],
       [  2. ,   5.1,   6.2,   7.9,  23. ],
       [  2. ,   5.1,   6.2,   7.9,  23. ],
       [  2. ,   5.1,   6.2,   7.9,  23. ]])

# bb
array([[ 5.1,  5.5,  5.7,  6.2,  0. ],
       [ 5.1,  5.5,  5.7,  6.2,  0. ],
       [ 5.1,  5.5,  5.7,  6.2,  0. ],
       [ 5.1,  5.5,  5.7,  6.2,  0. ],
       [ 5.1,  5.5,  5.7,  6.2,  0. ]])

Then compare them. Note that bb is transposed:

cc = np.isclose(aa, bb.T)

And you get:

array([[False,  True, False, False, False],
       [False, False, False, False, False],
       [False, False, False, False, False],
       [False, False,  True, False, False],
       [False, False, False, False, False]], dtype=bool)

You can aggregate this by axis 0:

np.any(cc, 0)

which returns

array([False,  True,  True, False, False], dtype=bool)

Now create array c:

c = np.zeros(shape=a.shape)

And select appropriate value, either from a or c:

np.where(np.any(cc, 0), a, c)

And the result:

array([ 0. ,  5.1,  6.2,  0. ,  0. ])
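
The tiling step can also be left to NumPy broadcasting. The following is a minimal sketch, not part of the original answer; the function name keep_close and its atol parameter are made up for illustration. It compares every element of a against every element of b without building the tiled copies explicitly:

```python
import numpy as np

def keep_close(a, b, atol=1e-8):
    """Return a copy of a with entries that have no close match in b set to 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # a[:, None] has shape (len(a), 1); b[None, :] has shape (1, len(b)).
    # Broadcasting yields a (len(a), len(b)) boolean matrix.
    close = np.isclose(a[:, None], b[None, :], atol=atol)
    # A row contains True if that element of a is close to any element of b.
    return np.where(close.any(axis=1), a, 0.0)

a = [2.0, 5.1, 6.2, 7.9, 23.0]
b = [5.1, 5.5, 5.7, 6.2, 0.0]
print(keep_close(a, b))
```

Like the tiled version, this still allocates a len(a) by len(b) matrix, so it suits small and medium arrays better than very large ones.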

edited Feb 28, 2016 at 21:25

answered Feb 28, 2016 at 13:30

hruske


4 Comments

Norman Over a year ago

I think this is what the OP wants. But he added a comment now saying that there may be trailing junk values at the end of the b array which must be ignored. So one would need to trim that array before using it, or adapt the algorithm. (Applies to all current answers, I think.)

hruske Over a year ago

Yes, I've added a comment about the assumption. The remainder can be np.appended if necessary.

rNov Over a year ago

Very thorough explanation I must say. Only thing is, if one used numpy to generate an empty matrix of zeroes (say) c, won't np.append throw errors, instead of using c[i] = b[j].

hruske Over a year ago

I've edited a fix for different array sizes (just exchanged len(a) and len(b)). Basically - all that you now have to do is decide where bad data starts and remove it so it will not interfere with comparisons.

MSeifert · Accepted Answer · 2016-02-28 14:52:29Z

1

np.isclose already gives you a boolean array that is True wherever the corresponding elements of a and b are close (here within atol=0.5). You can use this result to set all other elements to zero.

import numpy as np
a = np.array([2.0, 5.1, 6.2, 7.9, 23.0])     # always increasing
b = np.array([5.1, 5.5, 5.7, 6.2, 00.0])     # also always increasing
a[~np.isclose(a,b, atol=0.5)] = 0
a

This returns array([ 0. , 5.1, 6.2, 0. , 0. ]).

Note that the goal is to zero the elements that are not close, so the boolean result has to be inverted with ~ before it is used as an index.
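
The assignment above overwrites a in place, and the comparison is element-wise: a[i] is only checked against b[i], so both arrays must have the same length. If the original a should be kept, a sketch using np.where (the names mask and c are chosen here for illustration) builds a new array instead:

```python
import numpy as np

a = np.array([2.0, 5.1, 6.2, 7.9, 23.0])
b = np.array([5.1, 5.5, 5.7, 6.2, 0.0])

# True where a[i] and b[i] differ by at most atol (plus a small relative term)
mask = np.isclose(a, b, atol=0.5)
# Take a where the mask is True, 0.0 elsewhere; a itself is left untouched
c = np.where(mask, a, 0.0)
print(c)
```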

edited Feb 28, 2016 at 14:52

answered Feb 28, 2016 at 14:44

MSeifert


Sci Prog · Accepted Answer · 2016-02-28 14:58:26Z

1

Try to better explain what your program should do in a more general way. Only giving arrays a, b and c does not tell what it should do. It is as if someone said "If A=5 and B=7, write a program so that C=20".

From what you tried, I'd guess that the task is "each element of c should be equal to the corresponding element of a if its value is near (difference of 0.5 or less) to the corresponding value in b. It should be zero if not."

Also, do you really need to use numpy? Try using only loops and list methods. You may also have a look at "Generator expressions and list comprehensions"

Finally, your title says "(...) and modifying 2nd array". There should not be a third array named c. The result should appear in a modified version of array b.

Edited: if the specification was really this, then the code could be

a = [2.0, 5.1, 6.2, 7.9, 23.0]
b = [5.1, 5.5, 5.7, 6.2, 0.0]
c = []
for x, y in zip(a, b):
    # keep a's value when it is within 0.5 of the value at the same index in b
    c.append(x if abs(x - y) <= 0.5 else 0.0)
print(c)

Which gives the following answer

[0.0, 5.1, 6.2, 0.0, 0.0]

BTW, if this is for a course, you could still get a bad grade for not following the specification ("...and modifying the 2nd array").
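
If the second array really does have to be modified, as the title says, one option is slice assignment, which replaces the contents of b without creating a new list. This is only a sketch under the specification guessed above (same index, difference of 0.5 or less); the variable name tol is illustrative:

```python
a = [2.0, 5.1, 6.2, 7.9, 23.0]
b = [5.1, 5.5, 5.7, 6.2, 0.0]
tol = 0.5  # assumed tolerance, taken from the guessed specification

# b[:] = ... rewrites the existing list object instead of rebinding the name b
b[:] = [x if abs(x - y) <= tol else 0.0 for x, y in zip(a, b)]
print(b)
```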

edited Feb 28, 2016 at 14:58

answered Feb 28, 2016 at 14:11

Sci Prog


5 Comments

rNov Over a year ago

You guessed the task right. I tried alternately using loops instead of Numpy (see 2nd snippet). And the modifying 2nd array is what I need to do, but I would settle for third array c as it helps in keeping things simple for the time being.

Sci Prog Over a year ago

Hint: if you are beginning in python, you should stick to the base language (i.e. "Language reference" and "Library reference" in the python documentation). Don't try using external libraries (e.g. Numpy) yet.

rNov Over a year ago

It isn't for a course btw, I am a data science enthusiast and encountered this problem while working with different data sets. The arrays actually represent columns of a much larger data set.

Sci Prog Over a year ago

You could also modify b directly using array indices: for i in range(len(a)): b[i] = a[i] if abs(a[i]-b[i])<=0.5 else 0.0

Sci Prog Over a year ago

I didn't really think it was for a course: people who do that usually paste the homework assignment word for word. I wrote that to emphasize the fact that the first step when writing a program (whatever language you program in) starts by having a clear idea of the task to accomplish.

B. M. · Accepted Answer · 2016-02-28 17:53:32Z

1

It seems that you want to keep elements of a that are also in b.

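A linear-time solution using a plain Python loop (zeros_like comes from NumPy):

import numpy as np

c = np.zeros_like(a)

j = 0
for i in range(len(a)):
    # skip the b values that are too small to match a[i]
    while j < len(b) and b[j] < a[i] - .1:
        j += 1
    if j == len(b):
        break
    if abs(a[i] - b[j]) < .1:
        c[i] = a[i]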

And a numpy solution for exact matching:

a * np.in1d(a, b)

in1d(a,b) marks the positions of the elements of a that also occur in b: here in1d(a,b) is [False, True, True, False, False].

Since True is 1 and False is 0, a*in1d(a,b) is [ 0., 5.1, 6.2, 0. , 0. ]. Because in1d sorts a and b, this is an O(n log n) algorithm, but it is generally faster in practice than the Python loop. If approximate equality is required, one option is to round the arrays first (np.round(a,1)).
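
The rounding idea can be sketched as follows. This is an illustration rather than the answerer's code: it uses np.isin, the newer spelling of in1d in current NumPy releases, and assumes that one decimal place is the right precision for the data. Values that lie close together but on opposite sides of a rounding boundary (for example 0.14 and 0.16) will not match, so this approximates a tolerance check instead of replacing one.

```python
import numpy as np

a = np.array([2.0, 5.1, 6.2, 7.9, 23.0])
b = np.array([5.1, 5.5, 5.7, 6.2, 0.0])

# Round both arrays to one decimal so nearly equal values compare equal
mask = np.isin(np.round(a, 1), np.round(b, 1))
# Multiplying by the boolean mask zeroes the entries without a match in b
c = a * mask
print(c)
```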

edited Feb 28, 2016 at 17:53

answered Feb 28, 2016 at 16:47

B. M.


2 Comments

rNov Over a year ago

I arrived with a solution pretty similar to yours using nested for loops only

B. M. Over a year ago

Yes, but it's a quadratic algorithm: it doesn't use the fact that the arrays are sorted. It will be inefficient on big arrays.

rNov · Accepted Answer · 2016-02-28 17:39:14Z

0

This is the alternate way I was able to obtain the required arrangement for c.

import numpy as np

a = [2.0, 5.1, 6.2, 7.9, 23.0]  # always increasing
b = [5.1, 5.5, 5.7, 6.2, 00.0]  # also always increasing
c = np.zeros(len(a))

for i in range(len(a)):
    for j in range(len(b)):
        err = abs(a[i]-b[j])
        if err == 0.0 or err < abs(0.1):
            c[i] = b[j]

print(c)
#[ 0.   5.1  6.2  0.   0. ]

answered Feb 28, 2016 at 17:39

rNov


3 Comments

B. M. Over a year ago

Thanks. Now your aim is more clear. I give more efficient solutions in my post.

Sci Prog Over a year ago

In the if statement, abs(0.1) could be written without the abs function (abs(0.1) is 0.1). Also if err == 0.0, that also implies that err < 0.1, so the first condition is redundant. You could just write if err < 0.1:.

rNov Over a year ago

abs(0.1) is redundant in this case but when there are close enough values, err might be negative in some cases. So err == 0 and err < abs(0.1) would be required. I tried using different set of values just for testing the code and this statement worked like a charm.
