1

I'm trying to see if I can improve the speed of a for loop and an if condition statement. Basically it does a lookup on non repeating key values into an array and gets the value from another column.

If I run 100000 values it takes about 13 seconds see code below. Is there a way to make this more efficient? Ps i'm using octave 3.8.1 which works with matlab

%test if lookup statment
clear all, clc,  tic, clf; 

num_to_test=100000 %amount of numbers to test
a1=(1:1:num_to_test)';
a2=(a1.*num_to_test);
array=[a1,a2]; %array where values are stored

lookupval=(randperm(num_to_test,num_to_test/2)/4)'; %lookup these random values of non repeating integers and floats and get another value

amp=[];
freq=[];
found_array=[];
notfound_array=[];

for ii=1:1:rows(lookupval)
    if (find(lookupval(ii)==array(:,1)))  %if you find a lookup value in array
        %disp('found');
        [row,col] = find(lookupval(ii) == array(:,1));
        amp=[amp;array(row,2)];
        freq=[freq;array(row,1)];
        found_array=[freq,amp];

    else %add lookup value to another array and make amp value zero

        notfound_arraytmp=[lookupval(ii),0];
        notfound_array=[notfound_array;notfound_arraytmp];
    endif
end
comb_array=[found_array;notfound_array];
sort_comb_array=sortrows(comb_array,1); %sort array by first col incrementing

fprintf('\nfinally Done-elapsed time -%4.4fsec- or -%4.4fmins- or -%4.4fhours-\n',toc,toc/60,toc/3600);
0

4 Answers 4

2

Approach #1

This could be really efficient with ismember -

lookupval = sort(lookupval);                     %// Do sorting at the start
sort_comb_array = [lookupval zeros(size(lookupval))]; %// Setup output array
[idA,idB] = ismember(array(:,1),lookupval);           %// Get matching IDs
sort_comb_array(idB(idA),2) = array(idA,2);  %// Index into second column
                                   %// of array and get corresponding values

Approach #2

I would thrown in my favorite bsxfun too, but for such huge datasizes of 100,000, its memory inefficiency could make it slower -

lookupval = sort(lookupval);
sort_comb_array = [lookupval zeros(size(lookupval))];
[idA,idB] = find(bsxfun(@eq,array(:,1),lookupval(:).')); %//'# Get matching IDs
sort_comb_array(idB,2) = array(idA,2);
Sign up to request clarification or add additional context in comments.

Comments

2

Several issues but the main one is probably that you don't preallocate - appending like this: amp=[amp;array(row,2)]; is generally slow in MATLAB. You don't need a loop here, though.

Let's start with a simple array, A:

1  500
2  700
3  900
7  1000
9  800

And our lookup values are [2 6 3 9 7]; We want our output to show these lookup values, sorted, in the first column, and the second column to be either the values from the second column of A (where they exist) or zero.

lookup = sort(lookup);
output = zeros(length(lookup),2);
output(:,1) = lookup;
[c a b ] = intersect(A(:,1),lookup);
output(b,2) = A(a,2);

The output is:

2 700
3 900
6 0
7 1000
9 800

8 Comments

Thanks for the help. I ran your code online and the output doesn't match up (it makes all the lookup values 0) did I miss something? here's a link to what I ran ideone.com/bxMhig
@RickT I'd be interested to see how the other ismember based solution performs on Octave for this problem!
@Divakar Sure testing and comparing them now
@Divakar I posted the results as an answer
@RickT Make sure to clear out memory before the start of a new approach and sufficient number of iterations to make the timings around or more than 1sec.
|
0

Purely from an efficiency standpoint, I would rewrite the for loop as follows:

m = 0;                          % number of omitted values
n = 0;                          % number of found values
for ii=1:1:rows(lookupval)

    [row,col] = find(lookupval(ii) == array(:,1));

    if ~isempty(row)  %if you find a lookup value in array
        %disp('found');
        n=n+1;
        amp(n)=array(row,2);
        freq(n)=;array(row,1);
        found_array=[freq,amp];

    else %add lookup value to another array and make amp value zero
        m=m+1;
        notfound_array(2*m-1:2*m)=[lookupval(ii);0];
    endif
end

This saves you a find call by using its output directly rather than recomputing it when find returns a position, and grows the arrays in a more efficient way (as shown in this question).

Comments

0

This is a test Divakar suggest I do to see the speed it takes octave 3.8.1 to run this. Results are below along with the code.

1) Using ismember with 2,000,000 is faster but uses more memory
-elapsed time -0.2306sec- or -0.0038mins-
Total is 15000001 elements using 106000008 bytes

2) Using intersect with 2,000,000 is slower but uses less memory.
-elapsed time -0.3057sec- or -0.0051mins-
Total is 11749047 elements using 93992376 bytes

3) Using bskfun with 100,000 produces an error: out of memory or dimensions too large for octave's index type

First test results:

clear all, clc,  tic, clf; 
num_to_test=2000000 %amount of numbers to test
a1=(1:1:num_to_test)';
a2=(a1.*num_to_test);
array=[a1,a2]; %array where values are stored

lookupval=(randperm(num_to_test,num_to_test/2)/4)'; %lookup these random vaules of intergers and floats and get another value

lookupval = sort(lookupval);
sort_comb_array = [lookupval zeros(size(lookupval))];
[idA1,idB1] = ismember(array(:,1),lookupval);
sort_comb_array(idB1(idA1),2) = array(idA1,2);

fprintf('\nfinally Done-elapsed time -%4.4fsec- or -%4.4fmins- or -%4.4fhours-\n',toc,toc/60,toc/3600);

whos

>>>num_to_test =  2000000
>>>


   finally Done-elapsed time -0.2306sec- or -0.0038mins- or -0.0001hours-
>>>Variables in the current scope:

   Attr Name                  Size                     Bytes  Class
   ==== ====                  ====                     =====  ===== 
        a1              2000000x1                   16000000  double
        a2              2000000x1                   16000000  double
        array           2000000x2                   32000000  double
        idA1            2000000x1                    2000000  logical
        idB1            2000000x1                   16000000  double
        lookupval       1000000x1                    8000000  double
        num_to_test           1x1                          8  double
        sort_comb_array 1000000x2                   16000000  double

Total is 15000001 elements using 106000008 bytes
========================================================================

Second test results:

clear all, clc,  tic, clf; 

num_to_test=2000000 %amount of numbers to test
a1=(1:1:num_to_test)';
a2=(a1.*num_to_test);
array=[a1,a2]; %array where values are stored

lookupval=(randperm(num_to_test,num_to_test/2)/4)'; %lookup these random vaules of intergers and floats and get another value

lookupval = sort(lookupval);
output = zeros(length(lookupval),2);
output(:,1) = lookupval;
[c a b ] = intersect(array(:,1),lookupval);
output(b,2) =array(a,2);

fprintf('\nfinally Done-elapsed time -%4.4fsec- or -%4.4fmins- or -%4.4fhours-\n',toc,toc/60,toc/3600);

whos

>>>num_to_test =  2000000
>>>
finally Done-elapsed time -0.3057sec- or -0.0051mins- or -0.0001hours-
>>>Variables in the current scope:

   Attr Name              Size                     Bytes  Class
   ==== ====              ====                     =====  ===== 
        a            250005x1                    2000040  double
        a1          2000000x1                   16000000  double
        a2          2000000x1                   16000000  double
        array       2000000x2                   32000000  double
        b            250005x1                    2000040  double
        c            250005x1                    2000040  double
        lookupval   1000000x1                    8000000  double
        num_to_test       1x1                          8  double
        output      1000000x2                   16000000  double

Total is 11750016 elements using 94000128 bytes


=======================================================================

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.