1

I have the following code in MATLAB

[Mx,Nx] = size(x);
[My,Ny] = size(y);

padded_x = zeros(Mx+2*(My-1),Nx+2*(Ny-1));
padded_x(My:Mx+My-1,Ny:Ny+Nx-1) = x; 
y = rot90(y,2);

z = zeros(Mx+My-1,Nx+Ny-1);
for i=1:Mx+My-1
    for j=1:Nx+Ny-1
        z(i,j) = sum(sum(padded_x(i:i+My-1,j:j+Ny-1).*y));
    end
end

that is part of a 2D convolution implementation. Is there any way that this can become faster? e.g. vectorizing these 2 for loops? I know that there are faster algorithms that compute 2D convolution, but I want to speed this one up. So, I am not looking for an algorithm with different complexity, just something with lower complexity constant. I would also like to keep this in MATLAB and not to use MEX - files etc. Finally, the supplied conv2 function is also not the solution I am looking for.

8
  • 4
    and you don't want to use the supplied conv2 function? Commented Apr 24, 2015 at 10:59
  • no, sorry I forgot to mention that! Actually conv2 uses FFT, so I thought it was obvious that it belongs to the "different algorithm" category Commented Apr 24, 2015 at 11:02
  • @Controller I was not aware that conv2 used fft. It is not very easy to find information about that either. However, does it matter if it does? Commented Apr 24, 2015 at 11:21
  • @patrik Well, yes it does in this case, because it leads to a different complexity. The concept is to see how these two algorithms (mine and conv2) scale with the size of inputs. I am just trying to find out if there can be a faster implementation of my algorithm in order to better approximate the difference between the two algorithms. Commented Apr 24, 2015 at 11:32
  • What are the typical datasizes for x and y? Commented Apr 24, 2015 at 11:47

1 Answer 1

4

For each iteration, you can replace the elementwise multiplication and double summations with a fast matrix multiplication.

That is -

z(i,j) = sum(sum(padded_x(i:i+My-1,j:j+Ny-1).*y));

would be replaced by -

M = padded_x(i:i+My-1,j:j+Ny-1);
z(i,j) = M(:).'*y(:);

Thus, the loopy portion of the original code could be replaced by -

z = zeros(Mx+My-1,Nx+Ny-1);
yr = y(:);
for i=1:Mx+My-1
    for j=1:Nx+Ny-1
         M = padded_x(i:i+My-1,j:j+Ny-1);
        z(i,j) = M(:).'*yr;
    end
end

Quick tests: With x and y as 200 x 200 each, the runtimes were -

------------------------- With Original Approach
Elapsed time is 10.357977 seconds.
------------------------- With Proposed Approach
Elapsed time is 5.209822 seconds.
Sign up to request clarification or add additional context in comments.

2 Comments

Could do z(i,j) =yr * M(:); in the inner loop instead if you do yr=y(:).' upfront to squeeze just a bit more maybe
@hiandbaii Ah good observation there! OP must consider that hack too!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.