-1

For large arrays (i.e. 1M elements), is it faster to create and iterate over a Set (from an Array) than an Array - if you just need a single or two iterations to occur ever and performance is the only requirement (no need for deduping the array etc)?

Consider the following two scenarios for illustration of what I mean:

const array = Array.from({length: 1_000_000});

const methodOne = () => { array.forEach(() => {...}) }
const methodTwo = () => { const set = new Set(array); set.forEach(() => {...}) }

// end of program

In browser when timed, this goes in favor of the first approach (somewhat logical). But some evidence, such as this, states that set traversals are much faster than array traversals, so I'm wondering if it's ever worthwhile to create a Set off an Array just to traverse it one or more times.

I'm guessing it boils down to the exact implementation of the Set constructor, which I can't find and whether it needs to traverse the Array in order to create a Set. I'm aware that Set is implemented as an ordered hash table of some sort.

13
  • 1
    why not a for loop? it has no overheads ... Commented May 31, 2024 at 9:41
  • 2
    The second one is doing two operations, which combined will probably be slower than a single array operation either way. If you want an apple to apples comparison, only time array.forEach and set.forEach, not the set construction as well. Then you'll be able to say whether set iteration is faster or not. If you're not working with sets anyway and the set construction must be factored into the runtime, well, then it's probably not worth it. Commented May 31, 2024 at 9:49
  • 1
    Why would sets be faster to iterate over? Commented May 31, 2024 at 9:51
  • 3
    Sets are good for lookups. usually the overhead of creating the structure doesn't justify a one time use. Since there are also differences in creation speed of the set based on the content it's hard to give a general answer Commented May 31, 2024 at 11:00
  • 2
    "some evidence, such as stackoverflow.com/a/75514478, states that set traversals are much faster than array traversals" - no it does not. It shows that the overhead of ArraySet is about 10% for less than 100 elements (which are tiny sets where performance hardly matters). The answer concludes that "a for-of loop [over an Array] is now always faster than a Set. The values iterator is roughly equal speed."! Commented May 31, 2024 at 12:36

3 Answers 3

4

V8 developer here.

is it faster to create and iterate over a Set (from an Array) than an Array

I see no reason why that would be the case, and your own measurements don't indicate so either. So no, it's safe to assume that iterating over an array directly is faster than first creating a set and then iterating over that.

Just because someone on the internet claims otherwise (or you read their somewhat unclear statements as possibly claiming otherwise) doesn't make it so.

it boils down to the exact implementation of the Set constructor, which I can't find and whether it needs to traverse the Array in order to create a Set

The exact implementation of the Set constructor doesn't matter. Either way, it's obvious that it needs to copy all array elements into the set. That's extra work (cost-wise on the order of a full iteration), which is totally unnecessary if you only want to iterate.


For clarification, the one clear case where Set has a performance advantage is finding one specific item among a large collection. Array methods like .find and .indexOf need to iterate, which could get lucky and find the right element immediately, but "on average" needs to iterate over half the array's elements and over all of them if the value is not a member. Set methods like .has are required by the specification to have faster implementations than that; typically they do a hash-based "constant-time" lookup; in simple terms that means they only need to look in one specific place whether the element in question is there or not.
That said, even this difference only becomes important for pretty large collections; it's a common mistake among human programmers to intuitively think they should use a Set whenever they have more than two or three elements, whereas in truth even dead-simple iteration-based implementations tend to scale pretty well (i.e. end up being at least as fast, or even faster) up to several dozen elements.

Sign up to request clarification or add additional context in comments.

1 Comment

Thanks for the extensive answer, I felt this was the case but just wanted to reconfirm.
3

I'm guessing it boils down to whether the Set constructor needs to traverse the Array in order to create a Set.

Yes it does traverse the array. Essentially it is implemented as

class Set {
    constructor(src) {
        if (src != null) {
            for (const el of src) {
                this.add(el);
            }
        }
    }
    …

So creating a set by traversing the array, then traversing the set, will always be slower than directly traversing the array.

This approach only might be worth it (for overall runtime) when the set will eliminate some duplicates so the body of the loop over the set gets executed less often. Whether that is significant depends on the number of iterations you can save, what the body does exactly for each element, and how that compares to the overhead of the set creation.

Comments

1

Creating a set is expensing so creating a set + iteration always much slower than iterating an array. On the other hand only iteration of a set is faster insignificantly with arrays of 100k items and more. So no practical.

The fastest way for all sizes is to use a for loop:

` Chrome/125
---------------------------------------------------------------------------------------------------
>                           n=1000       |      n=10000      |     n=100000     |     n=1000000    
for loop               ■ 1.00x   x1m 235 | ■ 1.00x x100k 227 | ■ 1.00x x10k 225 |  ■ 1.00x  x1k 222
Set#forEach isolated    13.36x x100k 314 |  13.88x  x10k 315 |  14.36x  x1k 323 |   16.26x x100 361
Array#forEach          ■ 1.00x   x1m 235 |   1.26x x100k 286 |  20.62x  x1k 464 |   22.48x x100 499
Set#forEach            123.83x  x10k 291 | 222.91x   x1k 506 | 346.22x x100 779 | 1081.08x   x1 240
--------------------------------------------------------------------------------------------------- `

Open in the playground

let $length = 1000; 
const array = Array.from({length: $length}, () => Math.random());

// @benchmark Array#forEach
array.forEach(el => {});

// @benchmark Set#forEach
const set = new Set(array); 
set.forEach(el => {});

// @benchmark Set#forEach isolated
{
const set = new Set(array); 
// @run
set.forEach(el => {});
}

// @benchmark for loop
for(let i = 0; i < array.length; i++) array[i];

/*@skip*/ fetch('https://cdn.jsdelivr.net/gh/silentmantra/benchmark/loader.js').then(r => r.text().then(eval));

1 Comment

Thanks for the callout on the test itself - I did use different array elements in the actual tests.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.