Fast way to compute if statements on arrays in python?

Question

Assume three numpy arrays x, y and z

    z = (x**2)/ y          for each  x > 2 y
    z = (x**2)/y**(3/2)    for each  x > 3 y
    z = (1/x)*sin(x)       for each  x > 4 y

The arrays x, y and z are of course made up, but they illustrate the point of applying multiple if statements to multiple arrays. Each array has about 500,000 elements.

One possible way (much like FORTRAN) is to create a variable i to index the arrays and use it to test if x[i] > 2*y[i] or x[i] > 3*y[i]. I assume it would be slow.

I need a fast, elegant and more Pythonic way to compute the array z.

UPDATE: I have tried the two methods and here are the results:

    # Fortran way of loops:
    import numpy as np

    x=np.random.rand(40000,1)
    y=np.random.rand(40000,1)

    z = np.zeros(x.shape)
    for i, v in enumerate(x):
        #print i
        if x[i] > 2*y[i]:
            z[i] = x[i]**2/y[i]
        if x[i] > 3*y[i]:
            z[i] = x[i]**2/y[i]**(1.5)
        if x[i] > 4*y[i]:
            z[i] = (1/x[i])*np.sin(x[i])

    print z
    #end----

The timing results are as follows:

    real    0m0.920s
    user    0m0.900s
     sys    0m0.016s

The other piece of code used is:

    # Pythonic way
    import numpy as np

    x=np.random.rand(40000,1)
    y=np.random.rand(40000,1)

    indices1 = np.where(x > 2*y)
    indices2 = np.where(x > 3*y)
    indices3 = np.where(x > 4*y)

    z = np.zeros(x.shape)
    z[indices1] = x[indices1]**2/y[indices1]
    z[indices2] = x[indices2]**2/y[indices2]**(1.5)
    z[indices3] = (1/x[indices3])*np.sin(x[indices3]) 
    print z
    # end of code -----

The timing results are as follows:

    real    0m0.110s
    user    0m0.076s
     sys    0m0.028s

So there is a large difference in the execution times. Both scripts were run on an Ubuntu virtual machine with Python 2.7.5.
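The `real`/`user`/`sys` figures appear to come from timing whole scripts, so they presumably include interpreter startup and the numpy import. Below is a minimal sketch, assuming the standard library's `timeit` module is acceptable for the measurement, that times only the two computations inside a single process. The function names `z_loop` and `z_where` are made up for illustration and do not appear in the original scripts.

```python
import timeit
import numpy as np

def z_loop(x, y):
    """Element-by-element version (the 'Fortran way')."""
    z = np.zeros(x.shape)
    for i in range(len(x)):
        if x[i] > 2 * y[i]:
            z[i] = x[i] ** 2 / y[i]
        if x[i] > 3 * y[i]:
            z[i] = x[i] ** 2 / y[i] ** 1.5
        if x[i] > 4 * y[i]:
            z[i] = (1 / x[i]) * np.sin(x[i])
    return z

def z_where(x, y):
    """Vectorized version that builds index arrays with np.where."""
    z = np.zeros(x.shape)
    i1 = np.where(x > 2 * y)
    i2 = np.where(x > 3 * y)
    i3 = np.where(x > 4 * y)
    z[i1] = x[i1] ** 2 / y[i1]
    z[i2] = x[i2] ** 2 / y[i2] ** 1.5
    z[i3] = (1 / x[i3]) * np.sin(x[i3])
    return z

x = np.random.rand(40000, 1)
y = np.random.rand(40000, 1)

# Run each version once and report the elapsed wall-clock time
t_loop = timeit.timeit(lambda: z_loop(x, y), number=1)
t_where = timeit.timeit(lambda: z_where(x, y), number=1)
print("loop:  %.4f s" % t_loop)
print("where: %.4f s" % t_where)
```

Absolute numbers will depend on the machine and numpy version, so they are not directly comparable with the figures quoted in the question.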

UPDATE: I did another test, building the indices as boolean arrays instead of calling np.where:

    indices1 = x > 2*y
    indices2 = x > 3*y
    indices3 = x > 4*y

The timing results were:

     real   0m0.105s
     user   0m0.084s
      sys   0m0.016s
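For reference, here is a sketch of what the complete third script might look like, assuming the only change from the `np.where` version is how the indices are built. The comparisons produce boolean arrays that can index `z`, `x` and `y` directly. The names `mask1` to `mask3` are illustrative. The assignment order matters: for non-negative `y`, any element with x > 4y also satisfies the two weaker conditions, so the last assignment overwrites the earlier ones, which mirrors the sequence of `if` statements in the loop.

```python
import numpy as np

x = np.random.rand(40000, 1)
y = np.random.rand(40000, 1)

# Boolean masks with the same shape as x and y
mask1 = x > 2 * y
mask2 = x > 3 * y
mask3 = x > 4 * y

z = np.zeros(x.shape)
# Later assignments overwrite earlier ones where the masks overlap
z[mask1] = x[mask1] ** 2 / y[mask1]
z[mask2] = x[mask2] ** 2 / y[mask2] ** 1.5
z[mask3] = (1 / x[mask3]) * np.sin(x[mask3])
print(z)
```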

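SUMMARY: The third method (boolean masks) is the most elegant and slightly faster in wall-clock time than using np.where (0.105 s versus 0.110 s). Using explicit loops is very slow: roughly eight to nine times slower in these runs (0.92 s versus about 0.11 s).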
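Another option, which is not one of the methods timed above, is `np.select`. This is only a sketch under a few assumptions: the conditions are listed from most to least specific because `np.select` uses the first condition that matches each element, and every choice expression is evaluated for all elements, so the `np.errstate` context is there to silence divide-by-zero warnings that zeros in `x` or `y` would otherwise trigger. It has not been benchmarked here against the other variants.

```python
import numpy as np

x = np.random.rand(40000, 1)
y = np.random.rand(40000, 1)

# Most specific condition first: np.select picks the first match per element
conditions = [x > 4 * y, x > 3 * y, x > 2 * y]

with np.errstate(divide="ignore", invalid="ignore"):
    # Each choice is computed for the full arrays, then selected per element
    choices = [np.sin(x) / x, x ** 2 / y ** 1.5, x ** 2 / y]
    z = np.select(conditions, choices, default=0.0)
print(z)
```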

Have you tried to benchmark your idea? Show us what you have done.
– tyteen4a03, Commented Jan 24, 2015 at 5:11

David · Accepted Answer · 2015-01-24 06:00:44Z

2

I'm not quite sure if you are looking to have your z array be the same size as x or y, but I will assume so.

NumPy has a function, np.where, that finds the indices of elements satisfying a condition. In the example below I do a calculation similar to what your first line does.

    import numpy as np

    x = np.arange(4)
    x[2:] += 10
    print x

    y = np.arange(4)
    print y

    indices = np.where(x > 2*y)
    print indices

    z = np.zeros(x.shape)
    z[indices] = x[indices]**2/y[indices]
    print z

The print statements yield the following (under Python 2, dividing integer arrays truncates, so 169/3 gives 56):

    x: [0 1 12 13]
    y: [0 1 2 3]
    indices: [2, 3]
    z: [0 0 72 56]

Edit: Upon further testing it turns out that you don't even need to use the numpy where function. You can simply set indices = x > 2*y.
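A minimal sketch of that edit, reusing the same small arrays as the example above. The name `mask` is illustrative. The comparison returns a boolean array, and indexing with it selects the same elements that the `np.where` indices would.

```python
import numpy as np

x = np.arange(4)
x[2:] += 10          # x is now [0, 1, 12, 13]
y = np.arange(4)     # y is [0, 1, 2, 3]

mask = x > 2 * y     # boolean array, True where the condition holds
print(mask)

z = np.zeros(x.shape)
# Only the selected elements are divided, so y[0] == 0 is never used
z[mask] = x[mask] ** 2 / y[mask]
print(z)
```

Under Python 3 the `/` operator performs true division, so the last element comes out as 169/3 rather than the truncated 56 shown in the output listing.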

edited Jan 24, 2015 at 6:00

answered Jan 24, 2015 at 5:27

David
