In this situation I would opt against fully vectorizing your solution. Calculating S(i,j)-S(i,k) for every combination of i, j and k would require an intermediate result of size [N,N,N]. Instead, I went through your code and eliminated as much iteration as possible without increasing memory consumption. I did this step by step so you can follow how I arrived at the final version.
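For comparison, here is a rough sketch of what the fully vectorized variant I decided against might look like. It is only meant to show where the [N,N,N] intermediate comes from. The names D, mask, M and B_vec are my own, and I use bsxfun so that it also runs on releases before R2016b, which introduced implicit expansion.

```matlab
N = 30;
S = rand(N,N);
W = rand(N,N) < .1;

% D(i,j,k) = S(i,j) - S(i,k), an N-by-N-by-N array
D = bsxfun(@minus, S, reshape(S, N, 1, N));

% mask(1,j,k) = W(j,k); entries where W(j,k)==0 are set to 0
mask = reshape(W, 1, N, N);
D = bsxfun(@times, D, double(mask));

% max over k, never below 0 (the loop versions start temp at 0)
M = max(max(D, [], 3), 0);

% sum over j and place the result on the diagonal
B_vec = diag(sum(M, 2));

% memory needed for the intermediate alone, in bytes (8 per double)
bytes_D = numel(D) * 8;
```

For N=30 the intermediate is only 216,000 bytes, but it grows with N^3: at N=1000 it would already be 8e9 bytes. Because sum may add the terms in a different order than the loop does, I would compare B_vec to the loop result with a small tolerance rather than with ==.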
N=30;
S=rand(N,N);
W=rand(N,N)<.1;
sum_temp=0;
temp=0;
%Your original code for reference
for i=1:N
    for j=1:N
        for k=1:N
            if W(j,k)~=0
                temp(k)=S(i,j)-S(i,k);
            end
        end
        sum_temp=max(temp)+sum_temp;
        temp=0;
    end
    B(i,i)=sum_temp;
    sum_temp=0;
end
B_orig=B;
%1) you only want the max, no need to make temp a vector
for i=1:N
    sum_temp=0;
    for j=1:N
        temp=0;
        for k=1:N
            if W(j,k)~=0
                temp=max(temp,S(i,j)-S(i,k));
            end
        end
        sum_temp=temp+sum_temp;
    end
    B(i,i)=sum_temp;
end
assert(all(all(B==B_orig)))
%2) eliminate the outer loop
sum_temp=zeros(N,1);
for j=1:N
    temp=zeros(N,1);
    for k=1:N
        if W(j,k)~=0
            temp=max(temp,S(:,j)-S(:,k));
        end
    end
    sum_temp=temp+sum_temp;
end
B=diag(sum_temp);
assert(all(all(B==B_orig)))
%3) combine the inner loop with the condition
sum_temp=zeros(N,1);
for j=1:N
    temp=zeros(N,1);
    for k=find(W(j,:))
        temp=max(temp,S(:,j)-S(:,k));
    end
    sum_temp=temp+sum_temp;
end
B=diag(sum_temp);
assert(all(all(B==B_orig)))
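If you want to go one step further, the remaining inner loop could probably be replaced by a single max over all selected columns. This is only a sketch under the assumption that an intermediate of size [N, nnz(W(j,:))] per iteration is acceptable. The names idx, diffs and ref are my own, and I have not timed it, so whether it is actually faster is something you would need to measure.

```matlab
N = 30;
S = rand(N,N);
W = rand(N,N) < .1;

% Reference: the loop from step 3
ref = zeros(N,1);
for j = 1:N
    temp = zeros(N,1);
    for k = find(W(j,:))
        temp = max(temp, S(:,j)-S(:,k));
    end
    ref = ref + temp;
end

% Sketch: handle all selected k of row j in one call
sum_temp = zeros(N,1);
for j = 1:N
    idx = find(W(j,:));
    if isempty(idx)
        continue   % nothing selected in this row, contributes 0
    end
    % diffs(:,m) = S(:,j) - S(:,idx(m)), size N-by-numel(idx)
    diffs = bsxfun(@minus, S(:,j), S(:,idx));
    % the leading zero column keeps the result from dropping below 0
    sum_temp = sum_temp + max([zeros(N,1), diffs], [], 2);
end
B = diag(sum_temp);

% compare against the step 3 loop
assert(max(abs(sum_temp - ref)) < 1e-12)
```

The memory cost stays far below the [N,N,N] case as long as W is sparse, since each iteration only holds one column of differences per nonzero entry in row j of W.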
temp is never reset to 0. Before we start trying to optimize this, could you confirm that this is correct?