4

Suppose I have three matrices A_1, A_2, A_3 each of dimension mxn, where m and n are large. These matrices contain strictly positive numbers.

I want to construct a matrix check of dimension mx1 such that, for each i=1,...,m:

check(i)=1 if there exists j,k such that

A_1(i,1)+A_2(j,1)+A_3(k,1)<=quantile(A_1(i,2:end)+A_2(j,2:end)+A_3(k,3:end), 0.95)

In my case m is large (m=10^5) and n=500. Therefore, I would like your help to find an efficient way to do this.


Below I reproduce an example. I impose m smaller than in reality and report my incomplete and probably inefficient attempt to construct check.

clear
rng default
m=4;
n=500;
A_1=betarnd(1,2,m,n);
A_2=betarnd(1,2,m,n);
A_3=betarnd(1,2,m,n);
check=zeros(m,1);
for i=1:m
    for j=1:m
        for k=1:m
            if A_1(i,1)+A_2(j,1)+A_3(k,1)<=quantile(A_1(i,2:end)+A_2(j,2:end)+A_3(k,2:end), 0.95)
              check(i)=1;
              STOP TO LOOP OVER j AND k, MOVE TO THE NEXT i (INCOMPLETE!)              
           else
            KEEP SEARCHING FOR j,k SUCH THAT THE CONDITION IS SATISFIED (INCOMPLETE!) 
           end
       end
    end
end
7
  • 1
    What are A_2 and A_3 for? I don't see them being used in calculating check. Or did you mean A_2(j, ...) and A_3(k, ...)? Commented Sep 11, 2020 at 20:30
  • Thanks. Sorry, it was a typo Commented Sep 11, 2020 at 20:46
  • 1
    And is check of size mxn or mx1? You said it's mxn but defined it as mx1. Commented Sep 11, 2020 at 20:48
  • Thanks. Definitively mx1. Commented Sep 11, 2020 at 21:51
  • 1
    You can't cut the number of times you compute the quantile, which is the most expensive operation here. You won't be able to vectorize this either. The only optimization I can come up with so far is to move the indexing outside the loops: A_1_1 = A_1(:,1) and A_1 = A_1(:,2:end). That should speed up your code a little bit, I think. Or maybe not, but worth a try. Commented Sep 11, 2020 at 22:00

2 Answers 2

4
+50

Given a scalar x and a vector v the expression x <=quantile (v, .95) can be written as sum( x > v) < Q where Q = .95 * numel(v) *.

Also A_1 can be splitted before the loop to avoid extra indexing. Moreover the most inner loop can be removed in favor of vectorization.

Af_1 = A_1(:,1);
Af_2 = A_2(:,1);
Af_3 = A_3(:,1);
As_1 = A_1(:,2:end);
As_2 = A_2(:,2:end);
As_3 = A_3(:,2:end);
Q = .95 * (n -1);
for i=1:m
    for j=1:m
        if any (sum (Af_1(i) + Af_2(j) + Af_3 > As_1(i,:) + As_2(j,:) + As_3, 2) < Q)
            check(i) = 1;
            break; 
        end             
    end
end

More optimization can be achieved by rearranging the expressions involved in the inequality and pre-computation:

lhs = A_3(:,1) - A_3(:,2:end);
lhsi = A_1(:,1) - A_1(:,2:end);
rhsj = A_2(:,2:end) - A_2(:,1);
Q = .95 * (n - 1);
for i=1:m
    LHS = lhs + lhsi(i,:);
    for j=1:m
        if any (sum (LHS > rhsj(j,:), 2) < Q)
            check(i) = 1;
            break; 
        end             
    end
end
  • Note that because of the method that is used in the computation of quantile you may get a slightly different result.
Sign up to request clarification or add additional context in comments.

5 Comments

Nice! Removing the call to quantile must be a huge time saver!
Thanks. It still takes too much. Each iteration of j takes 0.002 sec. Hence, each iteration of i takes 0.002*m=200 sec. Thus, overall it takes 0.002*m^2=20000000 sec.
Note that because of break the computation related to the remaining j s may be discarded and overall time is probably less than your (worst case) estimate.
@user3285148 I have corrected the mistake in the answer. The expression sum( x <= v) >= Q is changed to to sum( x > v) < Q and the code is updated.
@user3285148: Do you have any indication that it is possible to compute this faster? You are computing something for 10^15 possible combinations, even if each combination were to take only one clock cycle, you're still looking at 2 GHz = 10^15 / 2.10^9 = 500000 s = 5.8 days. And the likelihood of computing a quantile over 500 values every clock cycle...
0

Option 1: Because all numbers are positive, you can do some optimizations. 95 percentile will be only higher if you add A1 to the mix - if you find the j and k of greatest 95 percentile of A2+A3 on the right side compared to the sum of the first 2 elements, you can simply take that for every i.

maxDif = -inf;
for j = 1 : m
   for k = 1 : m
      newDif = quantile(A_2..., 0.95) - A_2(j,1)-A_3(k,1);
      maxDif = max(newDif, maxDif);
   end
end

If even that is too slow, you can first get maxDifA2 and maxDifA3, then estimate that maxDif will be for those particular j and k values and calculate it.

Now, for some numbers you will get that maxDif > A_1, then the check is 1. For some numbers you will get that maxDif + quantile(A1, 0.95) < A_1, here check is 0 (if you estimated maxDif by separate calculation of A2 and A3 this isn't true!). For some (most?) you will unfortunately get values in between and this won't be helpful at all. Then what remains is option 2 (it is also more straightforward):

Option 2: You could save some time if you can save summation A_2+A_3 on the right side, as that calculation repeats for every different i, but that requires A LOT of memory. But quantile is the more expensive operation anyway, so you aren't saving a lot of time. Something along the lines of

for j = 1 : m
for k = 1 : m
A23R(j,k,:) = A2(j,:)+A3(k,:); % Unlikely to fit in memory.
end
end

Then you can perform your loops, using A23R and avoiding to repeat that sum for every i.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.