3

I am looking for a high-performance way to zero-fill values that are mutually exclusive between two arrays. This data is for a JS chart that must have entries for every x-value. An example may explain this better:

Before:

obj1 = [{x:1, y:1}, {x:3, y:2}];
obj2 = [{x:2, y:2}, {x:4, y:4}];

After:

obj1 = [{x:1, y:1}, {x: 2, y:0}, {x:3, y:2}, {x:4, y:0}];
obj2 = [{x:1, y:0}, {x: 2, y:2}, {x:3, y:0}, {x:4, y:4}];

I used nested for loops to do this myself but as the number of objects & entries increases, wall time gets unacceptably high. In a dataset that ended up zero-filling to a few thousand entries total, wall time was over 10 seconds.

I've looked at some JS libraries such as jQuery and underscore but it's not clear they have better performing functions for this.

Update: Thanks for all the responses. I'll try them out and mark the best performing one as the answer. A note on the x values: They aren't necessarily monotonically increasing (obj1 & 2 can skip an x value as long as both of them do). The x-axis isn't necessarily numbers, it could be dates as well. Hopefully one or more of the answers are adaptable to this.

4 Answers 4

1

Basically create a hash of all the values, along with a hash of all the values in each object. Then fill the object with the hashes in the 'all' hash that don't exist in the 'individual' hash

// hash of unique x values
var xValues = {};

// store each distinct x value
walk( obj1, 'obj1' );
walk( obj2, 'obj2' );

// fill each array with missing values
fill( obj1, 'obj1' );
fill( obj2, 'obj2' );

function walk( obj, nm ){
    xValues[ nm ] || ( xValues[ nm ] = {} );
    xValues.all   || ( xValues.all   = {} );

    for( var i=0, l=obj.length; i<l; i++ ){
        xValues[ nm ][ obj[ i ].x ] = 1;
        xValues.all  [ obj[ i ].x ] = 1;
    }
}

function fill( obj, nm ){
    for( var key in xValues.all ){
        if( !( key in xValues[ nm ] ) ){
            obj.push( { x : key, y : 0 } );
        }
    }
}
Sign up to request clarification or add additional context in comments.

1 Comment

This approach cut the zero-fill time from 5 secs to milliseconds on this test data set and CPU. I created a dictionary of all the x-values from each array (data layer). I also created temporary dictionaries for each data layer to speed up lookups (avoids iterating thru the array or using indexOf). Looking up is very quick: for (var entry in all_x_vals) { if (typeof tmp_dict[entry] == 'undefined') ZeroFillArrayEntry(); } ----- Not exact code but hopefully gets the point across
1

How about the following approach (with pseudocode)

1) Convert it into a array with x being the index.

var arr = [];
for each object in input_list
    arr[object.x] = object.y

2) Loop through the above array and fill undefined with zeros

arr2 = arr.map -> return (typeof value !== 'undefined') value : 0

3) Convert the array back into the object

result = arr2.map -> return { x : index, y: value }

PS: You can optimize it further by combining step 2 and 3 to save another loop.

2 Comments

please don't answer in coffeescript unless the OP asks for it... things like arr.map -> return aren't clear to people that aren't familiar with the syntax
It is a pseudocode (as mentioned in my answer). OP is asking for guidance, not "teh codez".
0

Adding another answer that makes an assumption that your data is pre-sorted. If it's not pre-sorted, sort it and this will work. It has the advantage of minimal memory usage, is very fast, and your data will be sorted when done:

var maxX = Math.max(
      obj1[ obj1.length-1 ].x
    , obj2[ obj2.length-1 ].x
);

fill( obj1, maxX );
fill( obj2, maxX );

function fill( obj, max ){
    for( var i=0; i<max; i++ ){
        if( !obj[i] || ( obj[i].x !== i+1 ) ){
            obj.splice( i, 0, { x:i+1, y:0 });
        }
    }
}

1 Comment

I just realized that the answer I posted was essentially the same as yours. Don't understand the downvote on yours though. This seems to be the most simple attempt. +1
-1

Here is another way of doing it. Using natively implemented methods as much as possible for performance.

var obj1 = [{x:1, y:1}, {x:3, y:2}];
var obj2 = [{x:2, y:2}, {x:4, y:4}];

// get the x values from each array
var xGetter = function(i) { return i.x; };
var obj1xs = obj1.map(xGetter);
var obj2xs = obj2.map(xGetter);

// get the joined array
var joined = obj1.concat(obj2);

// get all x values
var xs = joined.map(xGetter);

// get the min and max values of x from both arrays combined
var min = Math.min.apply(null, xs), max = Math.max.apply(null, xs), i = min;

// fill the missing x values with zero y value
if(min < max) {
  while(i<=max) {
    if(obj1xs.indexOf(i) === -1) obj1.push({x: i, y: 0});
    if(obj2xs.indexOf(i) === -1) obj2.push({x: i, y: 0});
    i++;
  }
}

// sort the arrays
var mySorter = function(a, b) { return a.x - b.x; };
obj1 = obj1.sort(mySorter);
obj2 = obj2.sort(mySorter);

The output will be:

obj1 => [{"x":1, "y":1}, {"x":2, "y":0}, {"x":3, "y":2}, {"x":4, "y":0}]
obj2 => [{"x":1, "y":0}, {"x":2, "y":2}, {"x":3, "y":0}, {"x":4, "y":4}]

1 Comment

This is going to have similar performance problems to the OPs "failed" code due to the indexOf's

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.