Speed up convolution computation code

Question

I have the following code in MATLAB

[Mx,Nx] = size(x);
[My,Ny] = size(y);

padded_x = zeros(Mx+2*(My-1),Nx+2*(Ny-1));
padded_x(My:Mx+My-1,Ny:Ny+Nx-1) = x; 
y = rot90(y,2);

z = zeros(Mx+My-1,Nx+Ny-1);
for i=1:Mx+My-1
    for j=1:Nx+Ny-1
        z(i,j) = sum(sum(padded_x(i:i+My-1,j:j+Ny-1).*y));
    end
end

that is part of a 2D convolution implementation. Is there any way that this can become faster? e.g. vectorizing these 2 for loops? I know that there are faster algorithms that compute 2D convolution, but I want to speed this one up. So, I am not looking for an algorithm with different complexity, just something with lower complexity constant. I would also like to keep this in MATLAB and not to use MEX - files etc. Finally, the supplied conv2 function is also not the solution I am looking for.

no, sorry I forgot to mention that! Actually conv2 uses FFT, so I thought it was obvious that it belongs to the "different algorithm" category — Controller
– Controller, Commented Apr 24, 2015 at 11:02
@Controller I was not aware that conv2 used fft. It is not very easy to find information about that either. However, does it matter if it does? — patrik
– patrik, Commented Apr 24, 2015 at 11:21
@patrik Well, yes it does in this case, because it leads to a different complexity. The concept is to see how these two algorithms (mine and conv2) scale with the size of inputs. I am just trying to find out if there can be a faster implementation of my algorithm in order to better approximate the difference between the two algorithms. — Controller
– Controller, Commented Apr 24, 2015 at 11:32

Community · Accepted Answer · 2017-05-23 12:30:19Z

4

For each iteration, you can replace the elementwise multiplication and double summations with a fast matrix multiplication.

That is -

z(i,j) = sum(sum(padded_x(i:i+My-1,j:j+Ny-1).*y));

would be replaced by -

M = padded_x(i:i+My-1,j:j+Ny-1);
z(i,j) = M(:).'*y(:);

Thus, the loopy portion of the original code could be replaced by -

z = zeros(Mx+My-1,Nx+Ny-1);
yr = y(:);
for i=1:Mx+My-1
    for j=1:Nx+Ny-1
         M = padded_x(i:i+My-1,j:j+Ny-1);
        z(i,j) = M(:).'*yr;
    end
end

Quick tests: With x and y as 200 x 200 each, the runtimes were -

------------------------- With Original Approach
Elapsed time is 10.357977 seconds.
------------------------- With Proposed Approach
Elapsed time is 5.209822 seconds.

edited May 23, 2017 at 12:30

CommunityBot

11 silver badge

answered Apr 24, 2015 at 11:57

Divakar

222k19 gold badges273 silver badges374 bronze badges

Sign up to request clarification or add additional context in comments.

2 Comments

hiandbaii Over a year ago

Could do z(i,j) =yr * M(:); in the inner loop instead if you do yr=y(:).' upfront to squeeze just a bit more maybe

Divakar Over a year ago

@hiandbaii Ah good observation there! OP must consider that hack too!

Collectives™ on Stack Overflow

Speed up convolution computation code

1 Answer 1

2 Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

2 Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related