When to choose multiple loops over single loop with if-statements

Question

I have to process each element of an array the same way once and have to modify each by an unpredictable pattern afterwards aswell.

Is there any difference in performance between these snippets and why, if so?

std::vector<int> nums;
//fill nums array

for(unsigned int i = 0; i < nums.size(); ++i){
    nums[i] *= nums[i];

    if(nums[i] < 10){
        nums[i] = 0;
    }
}

std::vector<int> nums;
//fill nums array

for(unsigned int i = 0; i < nums.size(); ++i){
    nums[i] *= nums[i];
}

for(unsigned int i = 0; i < nums.size(); ++i){
    if(nums[i] < 10){
        nums[i] = 0;
    }
}

Would a different approach like this improve anything?

std::vector<int> nums;
//fill nums array

std::vector<int> flags;
flags.resize(nums.size());

for(unsigned int i = 0; i < nums.size(); ++i){
    nums[i] *= nums[i];
    flags[i] = nums[i] < 10;
}

for(unsigned int i = 0; i < flags.size(); ++i){
   nums[i] = (!flags[i]) * nums[i]; //!flags[i] is 0 if nums[i] < 10
}

Is there any difference in performance ... What did your own tests show ? — High Performance Mark
– High Performance Mark, Commented Jul 7, 2017 at 14:22
Why downvote the question? The poster only wants to understand and learn. The question is clear. The code is properly formatted. — Kevin Le - Khnle
– Kevin Le - Khnle, Commented Jul 7, 2017 at 14:25
@LightnessRacesinOrbit 3 different approaches were mentioned in the question. It takes more experiences to write a performance test. — Kevin Le - Khnle
– Kevin Le - Khnle, Commented Jul 7, 2017 at 14:35
@Khnle-Kevin: Only the person with access to the PC in question can actually run the performance test. — Lightness Races in Orbit
– Lightness Races in Orbit, Commented Jul 7, 2017 at 14:37

Bathsheba · Accepted Answer · 2017-07-07 14:34:02Z

4

The first one has fewer statements (which is always a good rough guide: fewer similar statements often means better performance), but above everything else, it is far more readable.

Don't micro-optimise: leave that to the compiler.

If you are in any doubt check the generated assembly / profile the performance.

Finally (apologise, for I couldn't resist), if nums was a std::vector<unsigned>, then you could have written

for (std::size_t i = 0; i < nums.size(); ++i){
    nums[i] *= (nums[i] <= 3 ? 0 : nums[i]);
}

which could help the branch predictor.

edited Jul 7, 2017 at 14:34

answered Jul 7, 2017 at 14:25

Bathsheba

235k35 gold badges376 silver badges504 bronze badges

Sign up to request clarification or add additional context in comments.

3 Comments

simulate Over a year ago

good thinking, but this code was just supposed to serve as an example to illustrate a pattern which can also be applied on more complex processing with larger elements.

Bathsheba Over a year ago

But on the final point, do have a read of this: stackoverflow.com/questions/7127759/branch-on-operator

Drise Over a year ago

Also, see stackoverflow.com/questions/11227809/… as to why helping the branch predictor is important.

Richard Hodges · Accepted Answer · 2017-07-07 15:54:47Z

Test results on my machine (100 million ints in the array):

checkpoint
starting test: approach1
test took: 84ms
starting test: approach2
test took: 190ms
starting test: approach3
test took: 529ms
starting test: Bathsheba's idea
test took: 61ms

Incidentally, writing idiomatic code is often the most efficient of all. clang, gcc et. al. optimisers are fantastic:

void approach5(std::vector<int> &nums) {

    auto filter = [](auto x) { return (x < 10) ? 0 : x; };

    auto grow = [filter](auto x) { return filter(x * x); };

    std::transform(begin(nums), end(nums), begin(nums), grow);
}

Here's the code.

#include <iostream>
#include <chrono>
#include <random>
#include <array>
#include <vector>
#include <algorithm>

auto constexpr test_size = std::size_t(100'000'000);
using namespace std::literals;

void approach1(std::vector<int> &nums) {
    for (unsigned int i = 0; i < nums.size(); ++i) {
        nums[i] *= nums[i];

        if (nums[i] < 10) {
            nums[i] = 0;
        }
    }
}


void approach2(std::vector<int> &nums) {
    for (unsigned int i = 0; i < nums.size(); ++i) {
        nums[i] *= nums[i];
    }

    for (unsigned int i = 0; i < nums.size(); ++i) {
        if (nums[i] < 10) {
            nums[i] = 0;
        }
    }
}

void approach3(std::vector<int> &nums) {
    std::vector<int> flags;
    flags.resize(nums.size());

    for (unsigned int i = 0; i < nums.size(); ++i) {
        nums[i] *= nums[i];
        flags[i] = nums[i] < 10;
    }

    for (unsigned int i = 0; i < flags.size(); ++i) {
        nums[i] = (!flags[i]) * nums[i]; //!flags[i] is 0 if nums[i] < 10
    }
}

void approach4(std::vector<int> &nums) {
    for (std::size_t i = 0; i < nums.size(); ++i) {
        nums[i] *= (nums[i] <= 3 ? 0 : nums[i]);
    }
}


auto test = [](auto &&name, auto &&approach, auto &&data) {
    std::cout << "starting test: " << name << std::endl;
    auto my_data = std::vector<int>(data.begin(), data.end());
    auto now = std::chrono::high_resolution_clock::now();
    approach(my_data);
    auto then = std::chrono::high_resolution_clock::now();
    auto diff = std::chrono::duration_cast<std::chrono::milliseconds>(then - now);
    std::cout << "test took: " << diff.count() << "ms" << std::endl;
};

std::array<int, test_size> test_data;

int main() {
    std::random_device rd;
    std::default_random_engine eng(rd());
    std::uniform_int_distribution<> dist(0, 100);

    std::generate(test_data.begin(), test_data.end(), [&]() { return dist(eng); });
    std::cout << "checkpoint" << std::endl;

    test("approach1", approach1, test_data);
    test("approach2", approach2, test_data);
    test("approach3", approach3, test_data);
    test("Bathsheba's idea", approach4, test_data);
}

Sherif.Mohamed · Accepted Answer · 2017-07-07 14:42:29Z

0

In Algorithm analysis First pattern: time complexity is size of array O(n) Second pattern: time complexity is max between 2 loops O(n) and O(n) so it will be O(n) also.

First and second patterns are a same.

Third pattern not improve any thing.

Conclusion: 3 patterns are techniques not more.

answered Jul 7, 2017 at 14:42

Sherif.Mohamed

194 bronze badges

Collectives™ on Stack Overflow

When to choose multiple loops over single loop with if-statements

3 Answers 3

3 Comments

1 Comment

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

3 Comments

1 Comment

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related