0

I have an operation that is called many times per second (possibly tens of thousands of times) and requires a large 2D array. Each call is independent of the others. Is there a performance difference between keeping the array as a local variable vs. a global variable? Does repeatedly allocating and deallocating the 2D array incur a performance cost that outweighs its advantages?

#include <array>

class ProcessData {
    void update(Data& data) {
        // Automatic (stack) variable: 10000 x 10000 ints
        std::array<std::array<int, 10000>, 10000> matrix;
    }
};
2
  • What did your profiler say? Also, on the Itanium ABI (x86, x86_64 for most compilers) this is a simple stack pointer adjustment, so a few clock cycles maybe. Commented Mar 12, 2020 at 14:01
  • "...repeated allocation and deallocation of the 2D array incur a performance cost vs its advantages?" What advantages? Are there any? Commented Mar 12, 2020 at 14:04

2 Answers

4

Global variables are, generally speaking, completely unnecessary, so let's not even go there: anything you can do with them can be done by passing a reference to a context object when new objects are constructed.

Since the operations are mutually independent, you'll want to parallelize them, so you have only three choices that will perform well: a class member variable, a thread-local static variable, or an automatic variable. The array is 400 MB in size (10,000 × 10,000 elements × 4 bytes = 400,000,000 bytes, assuming a 4-byte int), so it simply won't work as an automatic variable: you'll usually run out of stack.
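The size estimate can be checked at compile time. The following is a minimal sketch, assuming a platform where int is 4 bytes and std::array adds no padding (true of mainstream implementations, but not guaranteed by the standard); matrix_bytes is a hypothetical helper, not part of the original answer:

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: bytes occupied by an N x N matrix of int
// stored as nested std::array. Only the type is inspected; nothing is allocated.
template <std::size_t N>
constexpr std::size_t matrix_bytes() {
  return sizeof(std::array<std::array<int, N>, N>);
}

// Assumption: nested std::array<int, N> is laid out as N * N contiguous ints.
static_assert(matrix_bytes<10000>() ==
                  std::size_t{10000} * 10000 * sizeof(int),
              "nested std::array is expected to have no padding");
```

Default stack limits are typically only a few megabytes, far below this figure.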

Thus:

class ProcessData {
public:
  static constexpr int N = 10000;
  using Matrix = std::array<std::array<int, N>, N>;
  void update(Data &data) {
    // One matrix per thread, created once and reused across calls.
    thread_local static Matrix matrix;
    // ...
  }
};

The downside is that, depending on the C++ runtime implementation, the matrix may be allocated at the startup of each and every thread, even threads that never call update(), which you may not want when 400MB is at stake.

Thus, you might wish to allocate it only on demand:

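// .h
#include <array>
#include <memory>

class ProcessData {
public:
  static constexpr int N = 10000;
  using Matrix = std::array<std::array<int, N>, N>;
private:
  // One pointer per thread; the matrix itself is heap-allocated lazily.
  thread_local static std::unique_ptr<Matrix> matrix;
public:
  void update(Data &data) {
    // Allocate this thread's matrix on first use only.
    if (!matrix) matrix.reset(new Matrix);
    //...
  }
};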

// .cpp
thread_local std::unique_ptr<ProcessData::Matrix> ProcessData::matrix;

The matrix will be deallocated whenever the thread ends (e.g. a worker thread in a thread pool), but can also be deallocated explicitly: matrix.reset();
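To illustrate the per-thread lifetime described above, here is a scaled-down sketch. It assumes C++14 or later; SmallProcessData, allocated(), release() and update_on_new_thread() are hypothetical names introduced for the demonstration, N is reduced so it runs quickly, and update() takes an int instead of Data. Unlike new Matrix, std::make_unique value-initializes (zeroes) the matrix, which this sketch relies on:

```cpp
#include <array>
#include <cstddef>
#include <memory>
#include <thread>

// Scaled-down, hypothetical stand-in for ProcessData.
class SmallProcessData {
public:
  static constexpr std::size_t N = 256;
  using Matrix = std::array<std::array<int, N>, N>;

  // Allocates the calling thread's matrix on first use, then reuses it.
  int update(int value) {
    if (!matrix) matrix = std::make_unique<Matrix>();  // zero-initialized
    (*matrix)[0][0] += value;
    return (*matrix)[0][0];
  }

  // Reports whether the calling thread currently owns a matrix.
  static bool allocated() { return matrix != nullptr; }

  // Frees the calling thread's matrix explicitly.
  static void release() { matrix.reset(); }

private:
  thread_local static std::unique_ptr<Matrix> matrix;
};

thread_local std::unique_ptr<SmallProcessData::Matrix> SmallProcessData::matrix;

// Runs update() once on a fresh thread and returns what that thread saw.
// The thread's matrix is destroyed automatically when the thread exits.
inline int update_on_new_thread(int value) {
  int result = 0;
  std::thread worker([&] { result = SmallProcessData{}.update(value); });
  worker.join();
  return result;
}
```

A fresh thread starts with its own empty matrix, so state accumulated on one thread is not visible on another, and calling release() only affects the calling thread.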


0

First, yes: reallocating an array (especially one of that size) on every call to update() will incur a significant performance cost. To remedy this, you could simply make matrix a static local variable, so that it is not deallocated at the end of each function call. Note, however, that matrix is then shared by all instances of ProcessData (and by every thread that calls update()).

If this is an issue, you could instead make it a member variable of ProcessData. Then each instance will have its own matrix.
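A per-instance matrix could look roughly like the sketch below. It swaps the nested std::array for a flat, heap-backed std::vector so that a ProcessData object itself stays small; that substitution, the class name ProcessDataWithMember, the runtime size parameter, and the int-based update() are all assumptions made for illustration, not part of the answer:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical variant: the matrix is a per-instance member whose storage
// is allocated once, in the constructor, and reused by every update() call.
class ProcessDataWithMember {
public:
  explicit ProcessDataWithMember(std::size_t n) : n_(n), matrix_(n * n) {}

  // Clears the buffer so calls stay independent, then does placeholder work.
  int update(int value) {
    std::fill(matrix_.begin(), matrix_.end(), 0);
    at(0, 0) = value;
    at(n_ - 1, n_ - 1) = value;
    return at(0, 0) + at(n_ - 1, n_ - 1);
  }

  // Exposes the buffer address so reuse across calls can be observed.
  const int* storage() const { return matrix_.data(); }

private:
  // Row-major indexing into the flat buffer.
  int& at(std::size_t row, std::size_t col) { return matrix_[row * n_ + col]; }

  std::size_t n_;
  std::vector<int> matrix_;  // n_ * n_ elements
};
```

Whether the buffer needs clearing on each call depends on the operation; at the full 10000 x 10000 size, zeroing 400MB per call is itself a noticeable cost.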
