8

I am trying to make some piece of code more efficient by using the vectorized form in numpy. Let me show you an example so you know what I mean.

Given the following code:

a = np.zeros([4,4])
a[0] = [1., 2., 3., 4.]
for i in range(len(a)-1):
    a[i+1] = 2*a[i]
print a

It outputs

[[  1.   2.   3.   4.]
 [  2.   4.   6.   8.]
 [  4.   8.  12.  16.]
 [  8.  16.  24.  32.]]

When I now try to vectorize the code like this:

a = np.zeros([4,4])
a[0] = [1., 2., 3., 4.]
a[1:] = 2*a[0:-1]
print a

I just get the first iteration correct:

[[ 1.  2.  3.  4.]
 [ 2.  4.  6.  8.]
 [ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]]

Is it possible to write the code above efficiently in a vectorized form (where the next iteration always accesses the previous iteration) or do I have to keep the for loop?

3 Answers 3

7

A linear recurrence such as this can be computed using scipy.signal.lfilter:

In [19]: from scipy.signal import lfilter

In [20]: num = np.array([1.0])

In [21]: alpha = 2.0

In [22]: den = np.array([1.0, -alpha])

In [23]: a = np.zeros((4,4))

In [24]: a[0,:] = [1,2,3,4]

In [25]: lfilter(num, den, a, axis=0)
Out[25]: 
array([[  1.,   2.,   3.,   4.],
       [  2.,   4.,   6.,   8.],
       [  4.,   8.,  12.,  16.],
       [  8.,  16.,  24.,  32.]])

See the following for more details: python recursive vectorization with timeseries, Recursive definitions in Pandas


Note that using lfilter really only makes sense if you are solving a nonhomogeneous problem such as x[i+1] = alpha*x[i] + u[i], where u is a given input array. For the simple recurrence a[i+1] = alpha*a[i], you can use the exact solution a[i] = a[0]*alpha**i. The solution for multiple initial values can be vectorized using broadcasting. For example,

In [271]: alpha = 2.0

In [272]: a0 = np.array([1, 2, 3, 4])

In [273]: n = 5

In [274]: a0 * (alpha**np.arange(n).reshape(-1, 1))
Out[274]: 
array([[  1.,   2.,   3.,   4.],
       [  2.,   4.,   6.,   8.],
       [  4.,   8.,  12.,  16.],
       [  8.,  16.,  24.,  32.],
       [ 16.,  32.,  48.,  64.]])
Sign up to request clarification or add additional context in comments.

Comments

5

Numpy's vector calculations act on the vector, not as a sequence of steps, so you have to vectorize the entire expression. For example:

np.multiply(np.arange(1,5), 2**np.arange(0,4)[np.newaxis].T)

To address the "final" question, yes you have to keep the for loop if you want to do a sequential calculation. You might make it more efficient with map or [... for ...] but optimizing that way takes a lot of trial and error. The beauty of thinking in vectorial terms and using Numpy to implement is that you get a result efficiently without all the trial and error.

The cumsum and cumprod functions can do something similar to what you're asking for. Instead of 2**np.arange(...), you can get the same thing from

np.multiply(np.arange(1,5), np.cumprod([1,2,2,2,])[np.newaxis].T)

Comments

2

looking at your result matrix

result = [[ 1, 2,   3, 4,]
          [ 2, 4,   6, 8,]
          [ 4, 8,  12, 16,]
          [ 8, 16, 24, 32,]]

it can be deconstructed into product(elem-wise) of two matrix as

a = [[1, 2, 3, 4],
    [1, 2, 3, 4],
    [1, 2, 3, 4],
    [1, 2, 3, 4]]

b = [[1, 1, 1, 1],
     [2, 2, 2, 2],
     [4, 4, 4, 4]
     [8, 8, 8, 8]]

result = a * b

you can calculate this type of operation using the meshgrid function

aa, bb = np.meshgrid(np.array([1.0, 2.0, 3.0, 4.0]),
                     np.array([1.0, 2.0, 4.0, 8.0]))
result = aa * bb

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.