3

I have to process each element of an array the same way once and have to modify each by an unpredictable pattern afterwards aswell.

Is there any difference in performance between these snippets and why, if so?

std::vector<int> nums;
//fill nums array

for(unsigned int i = 0; i < nums.size(); ++i){
    nums[i] *= nums[i];

    if(nums[i] < 10){
        nums[i] = 0;
    }
}

std::vector<int> nums;
//fill nums array

for(unsigned int i = 0; i < nums.size(); ++i){
    nums[i] *= nums[i];
}

for(unsigned int i = 0; i < nums.size(); ++i){
    if(nums[i] < 10){
        nums[i] = 0;
    }
}

Would a different approach like this improve anything?

std::vector<int> nums;
//fill nums array

std::vector<int> flags;
flags.resize(nums.size());

for(unsigned int i = 0; i < nums.size(); ++i){
    nums[i] *= nums[i];
    flags[i] = nums[i] < 10;
}

for(unsigned int i = 0; i < flags.size(); ++i){
   nums[i] = (!flags[i]) * nums[i]; //!flags[i] is 0 if nums[i] < 10
}
5
  • 5
    Is there any difference in performance ... What did your own tests show ? Commented Jul 7, 2017 at 14:22
  • 3
    Why downvote the question? The poster only wants to understand and learn. The question is clear. The code is properly formatted. Commented Jul 7, 2017 at 14:25
  • 1
    @Khnle-Kevin: Where is the research effort? Commented Jul 7, 2017 at 14:29
  • 1
    @LightnessRacesinOrbit 3 different approaches were mentioned in the question. It takes more experiences to write a performance test. Commented Jul 7, 2017 at 14:35
  • 1
    @Khnle-Kevin: Only the person with access to the PC in question can actually run the performance test. Commented Jul 7, 2017 at 14:37

3 Answers 3

4

The first one has fewer statements (which is always a good rough guide: fewer similar statements often means better performance), but above everything else, it is far more readable.

Don't micro-optimise: leave that to the compiler.

If you are in any doubt check the generated assembly / profile the performance.

Finally (apologise, for I couldn't resist), if nums was a std::vector<unsigned>, then you could have written

for (std::size_t i = 0; i < nums.size(); ++i){
    nums[i] *= (nums[i] <= 3 ? 0 : nums[i]);
}

which could help the branch predictor.

Sign up to request clarification or add additional context in comments.

3 Comments

good thinking, but this code was just supposed to serve as an example to illustrate a pattern which can also be applied on more complex processing with larger elements.
But on the final point, do have a read of this: stackoverflow.com/questions/7127759/branch-on-operator
Also, see stackoverflow.com/questions/11227809/… as to why helping the branch predictor is important.
3

Test results on my machine (100 million ints in the array):

checkpoint
starting test: approach1
test took: 84ms
starting test: approach2
test took: 190ms
starting test: approach3
test took: 529ms
starting test: Bathsheba's idea
test took: 61ms

Incidentally, writing idiomatic code is often the most efficient of all. clang, gcc et. al. optimisers are fantastic:

void approach5(std::vector<int> &nums) {

    auto filter = [](auto x) { return (x < 10) ? 0 : x; };

    auto grow = [filter](auto x) { return filter(x * x); };

    std::transform(begin(nums), end(nums), begin(nums), grow);
}

Here's the code.

#include <iostream>
#include <chrono>
#include <random>
#include <array>
#include <vector>
#include <algorithm>

auto constexpr test_size = std::size_t(100'000'000);
using namespace std::literals;

void approach1(std::vector<int> &nums) {
    for (unsigned int i = 0; i < nums.size(); ++i) {
        nums[i] *= nums[i];

        if (nums[i] < 10) {
            nums[i] = 0;
        }
    }
}


void approach2(std::vector<int> &nums) {
    for (unsigned int i = 0; i < nums.size(); ++i) {
        nums[i] *= nums[i];
    }

    for (unsigned int i = 0; i < nums.size(); ++i) {
        if (nums[i] < 10) {
            nums[i] = 0;
        }
    }
}

void approach3(std::vector<int> &nums) {
    std::vector<int> flags;
    flags.resize(nums.size());

    for (unsigned int i = 0; i < nums.size(); ++i) {
        nums[i] *= nums[i];
        flags[i] = nums[i] < 10;
    }

    for (unsigned int i = 0; i < flags.size(); ++i) {
        nums[i] = (!flags[i]) * nums[i]; //!flags[i] is 0 if nums[i] < 10
    }
}

void approach4(std::vector<int> &nums) {
    for (std::size_t i = 0; i < nums.size(); ++i) {
        nums[i] *= (nums[i] <= 3 ? 0 : nums[i]);
    }
}


auto test = [](auto &&name, auto &&approach, auto &&data) {
    std::cout << "starting test: " << name << std::endl;
    auto my_data = std::vector<int>(data.begin(), data.end());
    auto now = std::chrono::high_resolution_clock::now();
    approach(my_data);
    auto then = std::chrono::high_resolution_clock::now();
    auto diff = std::chrono::duration_cast<std::chrono::milliseconds>(then - now);
    std::cout << "test took: " << diff.count() << "ms" << std::endl;
};

std::array<int, test_size> test_data;

int main() {
    std::random_device rd;
    std::default_random_engine eng(rd());
    std::uniform_int_distribution<> dist(0, 100);

    std::generate(test_data.begin(), test_data.end(), [&]() { return dist(eng); });
    std::cout << "checkpoint" << std::endl;

    test("approach1", approach1, test_data);
    test("approach2", approach2, test_data);
    test("approach3", approach3, test_data);
    test("Bathsheba's idea", approach4, test_data);
}

1 Comment

Thank you for this example of testing!
0

In Algorithm analysis First pattern: time complexity is size of array O(n) Second pattern: time complexity is max between 2 loops O(n) and O(n) so it will be O(n) also.

First and second patterns are a same.

Third pattern not improve any thing.

Conclusion: 3 patterns are techniques not more.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.