36

I've got a large JSON array of objects that I need to filter down based on multiple user select inputs. Currently I'm chaining filter functions together, but I suspect this is not the most performant way to do it.

Currently I'm doing this:

var filtered = data.filter(function(data) {
    return Conditional1
  })
  .filter(function(data) {
    return Conditional2
  })
  .filter(function(data) {
    return Conditional3
  }) etc...;

Although (I think) 'data' could shrink with each iteration, I'm wondering if a better practice would be something like this:

var condition1 = Conditional1
var condition2 = Conditional2
var condition3 = Conditional3
etc...

var filtered = data.filter(function(data) {
  return condition1 && condition2 && condition3 && etc...
});

I've looked into chaining multiple higher-order functions, specifically filter, but I haven't seen anything on best (or bad) practice, nor have I timed and compared the two approaches above.

In a use case with a large data set and many conditionals, which would be preferred? (I reckon both are fairly readable.)

Or maybe there is a more performant way that I'm missing (while still using higher-order functions)?

3
  • My instinct says your second approach is better - it involves a single loop, no matter how many conditions you have. And it prevents checking an item over and over if it is going to be eliminated by the last condition, etc. Commented Feb 16, 2018 at 20:27
  • 1
    The same number of conditions is tested in the two versions; the main difference is that the second version doesn't need to create the intermediate arrays and loop over them. Commented Feb 16, 2018 at 20:35
  • 2
    I think that there is no performance difference. Both will execute n * 3 conditions each; it's just the loop overhead that makes it slower. Commented Feb 16, 2018 at 20:41
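
One way to check the claims in these comments is to count predicate calls directly. The sketch below uses made-up data and predicates (all names here are hypothetical, not from the question) and wraps each predicate in a counter. It only measures how many conditions are evaluated, not wall-clock time.

```javascript
// Hypothetical data and predicates, chosen only for illustration.
const data = Array.from({ length: 1000 }, (_, i) => i);

// Wrap a predicate so we can count how often it is called.
function counted(predicate) {
  const wrapper = (value) => {
    wrapper.calls += 1;
    return predicate(value);
  };
  wrapper.calls = 0;
  return wrapper;
}

function makePredicates() {
  return [
    counted((n) => n % 2 === 1), // odd
    counted((n) => n % 5 !== 0), // not a multiple of 5
    counted((n) => n > 347),
  ];
}

// Version 1: chained filters (two intermediate arrays plus the result).
const [a1, a2, a3] = makePredicates();
const chained = data.filter((n) => a1(n)).filter((n) => a2(n)).filter((n) => a3(n));
const chainedCalls = a1.calls + a2.calls + a3.calls;

// Version 2: one filter, conditions joined with && (which short-circuits).
const [b1, b2, b3] = makePredicates();
const combined = data.filter((n) => b1(n) && b2(n) && b3(n));
const combinedCalls = b1.calls + b2.calls + b3.calls;

console.log(chained.length === combined.length); // true: same result
console.log(chainedCalls, combinedCalls);        // 1900 1900
```

With these predicates both versions make the same number of predicate calls, and fewer than n * 3, because && stops at the first failing condition just as a later filter never sees an item that an earlier one dropped. What remains different is the intermediate arrays and the extra passes over them.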

7 Answers

43

Store your filter functions in an array and have array.reduce() run through each filter, applying it to the data. This comes at the cost of running through all of them even when there's no more data to filter.

const data = [...]
const filters = [f1, f2, f3, ...]
const filteredData = filters.reduce((d, f) => d.filter(f) , data)

Another way to do it is to use array.every(). This takes the inverse approach, running through the data and checking whether all filters apply to each item. array.every() returns false as soon as one filter returns false, so the remaining filters are skipped for that item.

const data = [...]
const filters = [f1, f2, f3, ...]
const filteredData = data.filter(v => filters.every(f => f(v)))

Both are similar to your first and second samples, respectively. The only difference is that they don't hardcode the filters or conditions.
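
As a rough sketch of how this could look with user-driven filters: the records, the option names (maxPrice, onlyInStock) and the buildFilters helper below are all hypothetical, made up for illustration. The idea is to push a predicate only for the inputs the user actually set, then apply the list with either approach.

```javascript
// Hypothetical records and predicates, for illustration only.
const data = [
  { name: "a", price: 5, inStock: true },
  { name: "b", price: 50, inStock: true },
  { name: "c", price: 7, inStock: false },
];

// Build the filter list from whichever user inputs are set.
function buildFilters({ maxPrice, onlyInStock }) {
  const filters = [];
  if (maxPrice !== undefined) filters.push((item) => item.price <= maxPrice);
  if (onlyInStock) filters.push((item) => item.inStock);
  return filters;
}

const filters = buildFilters({ maxPrice: 10, onlyInStock: true });

const viaReduce = filters.reduce((d, f) => d.filter(f), data);
const viaEvery = data.filter((v) => filters.every((f) => f(v)));

console.log(viaReduce.map((x) => x.name)); // [ 'a' ]
console.log(viaEvery.map((x) => x.name));  // [ 'a' ]

// With no active filters, every() is vacuously true, so everything is kept.
const none = buildFilters({});
const unfiltered = data.filter((v) => none.every((f) => f(v)));
console.log(unfiltered.length); // 3
```

One detail to be aware of: with an empty filter list, the reduce version returns the original array itself rather than a copy, while the every version always returns a new array.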


1 Comment

I love the .reduce with an array of funcs; you can also easily map and transform, all while keeping the code clean and readable.
41

Interesting question.

// Test data: the integers 0..111110 (declared to avoid an implicit global)
const data = new Array(111111).fill().map((a,n) => n);

const f1 = (a) => a % 2;
const f2 = (a) => a % 5;
const f3 = (a) => a > 347;
const filters = [f1, f2, f3];

var benches = [
  [ "filter().filter.. - 1", () => {
    var res = data.filter(a=>a%2).filter(a=>a%5).filter(a=>a>347);
  }],
  [ "filter(&& &&) - 2", () => {
    var res = data.filter(a=>a%2 && a%5 && a>347);
  }],
  [ "reduce - 3", () => {
    var res = filters.reduce((d, f) => d.filter(f) , data);
  }],
  [ "filter(every) - 4", () => {
    var res = data.filter(v => filters.every(f => f(v)))
  }],
];

function bench(f) {
  var t0 = performance.now();
  var res = f();
  return performance.now() - t0;
}

var times = benches.map( a => [a[0], bench(a[1])] )
            .sort( (a,b) => a[1]-b[1] );
var max = times[times.length-1][1];
times = times.map( a => {a[2] = (a[1]/max)*100; return a; } );
var template = (title, time, n) =>
  `<div>` +
    `<span>${title} &nbsp;</span>` +
    `<span style="width:${3+n/2}%">&nbsp;${Number(time.toFixed(3))}msec</span>` +
  `</div>`;
var strRes = times.map( t => template(...t) ).join("\n");
var $container = document.getElementById("container");
$container.innerHTML = strRes;

body { color:#fff; background:#333; font-family:helvetica; }
body > div > div {  clear:both   }
body > div > div > span {
  float:left;
  width:43%;
  margin:3px 0;
  text-align:right;
}
body > div > div > span:nth-child(2) {
  text-align:left;
  background:darkorange;
  animation:showup .37s .111s;
  -webkit-animation:showup .37s .111s;
}
@keyframes showup { from { width:0; } }
@-webkit-keyframes showup { from { width:0; } }

<div id="container"> </div>

Also remember that with nested for-loops the order matters when measuring time: given two loops, one of 3000 iterations and one of 7, running 3000 x 7 (the 3000 loop outside) takes longer than 7 x 3000 (the 7 loop outside).

6 Comments

Why is the performance of the last approach the best? It would be helpful to understand the reason. Would you please explain?
@VimalPatel It's not; the second one is the best in the case of static filters, and the third one for dynamic filters. The last one is the worst. What is your browser?
@nullqube Chrome. As per the stats, I thought the 3rd one was the worst.
It's pretty obvious why the second approach is the best (like 2-3x faster than the others for me). It only does one iteration per element, and creates only one array. The first one creates 3 arrays and goes through some elements again. The third one is pretty similar to the first one. As to why the fourth one is also slow, I have no idea. To me it looks similar to the second one, except it uses dynamic filters.
I'm surprised to hear 4 is slow. 27, 7, 20, 14 on a Pixel 3a.
2

The two options are not exactly the same, although they could produce the same result:

var filtered = data.filter(function(data) {
    return Conditional1
  })
  .filter(function(data) {
    return Conditional2
  })
  .filter(function(data) {
    return Conditional3
  }) etc...;

That option is better if you want to check the conditions independently of one another. Use it if you need the data filtered by condition1 before filtering by condition2. If what you want is to keep the items that match all 3 conditions or a combination of them, use the second one:

var condition1 = Conditional1
var condition2 = Conditional2
var condition3 = Conditional3
etc...

var filtered = data.filter(function(data) {
  return condition1 && condition2 && condition3 && etc...
});
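
The answer does not spell out when the two actually diverge, so the following is only one possible reading: a predicate that uses the index (or array) argument that filter() passes to its callback. In a chain, later filters see positions in the already-filtered array. The values and predicate names below are made up for illustration.

```javascript
const values = [10, 11, 12, 13, 14, 15];

const isEvenValue = (v) => v % 2 === 0;
// Uses the index argument that filter() passes as the second parameter.
const atEvenIndex = (_v, i) => i % 2 === 0;

// Chained: the second filter sees indexes of the already-filtered array.
const chained = values.filter(isEvenValue).filter(atEvenIndex);

// Combined: both conditions see indexes of the original array.
const combined = values.filter((v, i) => isEvenValue(v) && atEvenIndex(v, i));

console.log(chained);  // [ 10, 14 ]
console.log(combined); // [ 10, 12, 14 ]
```

For predicates that only look at the item itself, both forms return the same items in the same order.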

Comments

1

If you think of this as a "for-loop" optimization problem, you can see that the original approach results in iterating the list multiple times.

Your second approach will reduce the iterations to one pass.

After that, you're just looking at the best way to quickly decide whether an item passes muster or not.
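
One way to read "quickly decide" is to put the cheapest, most selective condition first so that && can skip the costly one. The sketch below uses two made-up predicates (the "expensive" one is just a stand-in whose calls are counted) to show how much the order changes the amount of work, under the assumption that the cheap check rejects most items.

```javascript
const data = Array.from({ length: 10000 }, (_, i) => i);

let expensiveCalls = 0;
// Stand-in for a costly check (hypothetical); we only count invocations.
const expensive = (n) => {
  expensiveCalls += 1;
  return String(n).includes("7");
};
// Cheap check that rejects most items.
const cheap = (n) => n % 100 === 0;

expensiveCalls = 0;
const slowOrder = data.filter((n) => expensive(n) && cheap(n));
const callsExpensiveFirst = expensiveCalls;

expensiveCalls = 0;
const fastOrder = data.filter((n) => cheap(n) && expensive(n));
const callsCheapFirst = expensiveCalls;

console.log(slowOrder.length === fastOrder.length); // true
console.log(callsExpensiveFirst, callsCheapFirst);  // 10000 100
```

The result is identical either way; only the number of expensive evaluations changes.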

Comments

1

Not sure about performance, but I am in love with the reduce method in JavaScript these days. Something like:

arr.reduce((itemMap, item) => {
    if (item.something !== somethingElse) return itemMap;
    itemMap.push(item); // push() returns the new length, so return the array itself
    return itemMap;
}, [])

This works essentially like .filter, but you can do much more with it. For example, if you also want to map some values, you could do so by updating the item object and adding it to the result if it matches all the conditionals. Not sure how performant this is, though.
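
A small sketch of the filter-and-map idea, using made-up records (the items array and its fields are hypothetical). It builds new values instead of mutating the items, and compares the result with a plain filter + map chain.

```javascript
// Hypothetical records, for illustration only.
const items = [
  { name: "a", price: 5 },
  { name: "b", price: 50 },
  { name: "c", price: 7 },
];

// Keep items under 10 and collect just their names, in a single pass.
const cheapNames = items.reduce((acc, item) => {
  if (item.price < 10) acc.push(item.name);
  return acc; // always return the accumulator, not the result of push()
}, []);

// The same result with filter + map (one intermediate array).
const viaFilterMap = items
  .filter((item) => item.price < 10)
  .map((item) => item.name);

console.log(cheapNames);   // [ 'a', 'c' ]
console.log(viaFilterMap); // [ 'a', 'c' ]
```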

3 Comments

May I know exactly why the downvote? This is a higher-order function and, on top of that, it is more readable. It does essentially what filter does and gives you more freedom with your array.
Guess the downvote is because you are pushing to an array instead of filtering, which causes more memory consumption.
Actually it doesn't, since it is done by reference if it is an object anyway. Besides, arr.filter(f1).filter(f2).filter(f3) is slower than the solution that I provided above.
0

Not the fastest to get all results in an array, but if you just want to get an iterator over the final result, without creating an array, then ECMAScript 2025 offers a filter method on native iterators (instances of Iterator):

const f1 = (a) => a % 2;
const f2 = (a) => a % 5;
const f3 = (a) => a > 347;

const it = Array(111111).keys().filter(f1).filter(f2).filter(f3);
const [first] = it;
console.log(first);

You can of course still use the && operator in a single filter callback.

Using iterator helpers will save on array creation, and will typically yield the first result sooner, as the involved iterators are lazy, so you get the first result before the next ones are consumed.
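
For runtimes that don't have iterator helpers yet, the same laziness can be approximated with a generator. The lazyFilter function below is a hand-rolled stand-in, not the built-in method, and the call counter is only there to show how few items are inspected before the first result.

```javascript
// A hand-rolled lazy filter, as a stand-in for Iterator.prototype.filter.
function* lazyFilter(iterable, predicate) {
  for (const value of iterable) {
    if (predicate(value)) yield value;
  }
}

let f1Calls = 0;
const f1 = (a) => { f1Calls += 1; return a % 2; };
const f2 = (a) => a % 5;
const f3 = (a) => a > 347;

const source = Array(111111).keys();
const it = lazyFilter(lazyFilter(lazyFilter(source, f1), f2), f3);

const [first] = it; // pulls values only until the first match
console.log(first);   // 349
console.log(f1Calls); // 350: keys 0..349 were inspected, not all 111111
```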

Comments

-1
type PredicateFn = (input: any) => boolean

const isBig: PredicateFn = (n: number): boolean => {
  return n > 100
}
const isEven: PredicateFn = (n: number): boolean => {
  return n % 2 === 0
}

const isInt: PredicateFn = (n: number): boolean => {
  return !n.toString().includes('.')
}


function composeFilters<T>(array: T[], predicates: PredicateFn[]): T[] {
  function filter(input: T) {
    return predicates.every(predicate => predicate(input))
  }
  return array.filter(filter)
}

const input = [1, 2.5, 101, 100.5, 110, 220.24, 333, 400, 500, 223, 111]

const result = composeFilters(input, [isBig, isInt, isEven])

console.log(result)

4 Comments

TypeScript answer for JavaScript question? Why?
@user The question is four years old. I'm pretty sure the original guy isn't sitting around waiting for an answer on this, and most people write typescript now.
You could at least give an explicit label that you're giving a TypeScript answer to a JavaScript question, to prevent confusion for new readers who are reading a JavaScript question.
@user 🤫 I think everything will be ok
