
I have to do many loops of the following type:

for i in range(len(a)):
    for j in range(i+1):
        c[i] += a[j]*b[i-j]

where a and b are short arrays of the same size (between about 10 and 50). This can be done efficiently using a convolution:

import numpy as np
np.convolve(a, b) 

However, this gives me the full convolution (i.e. the vector is too long, compared to the for loop above). If I use the 'same' option in convolve, I get the central part, but what I want is the first part. Of course, I can chop off what I don't need from the full vector, but I would like to get rid of the unnecessary computation time if possible. Can someone suggest a better vectorization of the loops?
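To make the equivalence concrete, here is a small self-contained check (with made-up random inputs in the 10–50 size range from the question) that the first len(a) entries of the full convolution match the double loop:

```python
import numpy as np

# Hypothetical inputs, just for illustration.
rng = np.random.default_rng(0)
a = rng.standard_normal(20)
b = rng.standard_normal(20)

# The double loop from the question.
c = np.zeros_like(a)
for i in range(len(a)):
    for j in range(i + 1):
        c[i] += a[j] * b[i - j]

# Equivalent: the first len(a) entries of the full convolution.
assert np.allclose(c, np.convolve(a, b)[:len(a)])
```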

  • If the arrays are short, then why bother? Is this the bottleneck in your code? Commented Oct 10, 2012 at 9:51
  • @larsmans yes it is the bottleneck. I know it may not seem like a lot, but in principle I think the speedup could be a factor of 2, which would be great. Also, it could be interesting if someone wants to do it with a larger array. Commented Oct 10, 2012 at 10:23
  • How short is 'short', and is either input static? Commented Oct 10, 2012 at 13:36
  • @mtrw sorry should have specified this - between 10 and 50 (edited in question). One of the inputs changes for every call, the other stays the same, so I suppose it could be made static? Commented Oct 10, 2012 at 14:12
  • You may consider using convolution facility provided by scipy: docs.scipy.org/doc/scipy/reference/generated/… Commented Oct 30, 2020 at 13:13

2 Answers


You could write a small C extension in Cython:

# cython: boundscheck=False
cimport numpy as np
import numpy as np  # zeros_like

ctypedef np.float64_t np_t
def convolve_cy_np(np.ndarray[np_t] a not None,
                   np.ndarray[np_t] b not None,
                   np.ndarray[np_t] c=None):
    if c is None:
        c = np.zeros_like(a)
    cdef Py_ssize_t i, j, n = c.shape[0]
    with nogil:
        for i in range(n):
            for j in range(i + 1):
                c[i] += a[j] * b[i - j]
    return c

It performs well for n=10..50 compared to np.convolve(a,b)[:len(a)] on my machine.

It also looks like a job for numba.


3 Comments

Yeah, I've never played with numba but it seems like it could be a great way to go.
Nice! It gives me approximately a 2x speedup factor. If I look at the generated C code, there is still some overhead in the for loops (checking for something < 0). Is it possible to get rid of this, too?
OK, I can set wraparound=False, but that does not improve speed. I suppose this is really the best answer. Thanks!

There is no way to compute only the first part of a convolution with vectorized array manipulations in NumPy. Your best bet is to use np.convolve(a, b, mode='same') and trim off what you don't need. That's probably going to be 10x faster than the double loop in pure Python that you have above. You could also roll your own with Cython if you are really concerned about speed, but it likely won't be much, if any, faster than np.convolve().

1 Comment

You are probably right that trimming off is the best solution. For my case I would do np.convolve(a, b)[:len(a)] (which is the 'full' mode, not the 'same' mode).
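For reference, a small sketch (with made-up inputs) of how the two modes line up: 'same' is just a shifted window of the full result, while the first part the question wants is full[:len(a)]:

```python
import numpy as np

# Hypothetical short inputs, just to illustrate the mode difference.
a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([5.0, 6.0, 7.0, 8.0])

full = np.convolve(a, b)        # length 2*len(a) - 1 == 7
first = full[:len(a)]           # first part: [5, 16, 34, 60], matches the loop
same = np.convolve(a, b, mode='same')  # central part, offset by (len(b)-1)//2

start = (len(b) - 1) // 2
assert np.array_equal(same, full[start:start + len(a)])
```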
