I want to split an Eigen dynamic-size array by columns evenly over OpenMP threads.
                             thread 0 | thread 1 | thread 2
[[0, 1, 2],                  [[0],    | [[1],    | [[2],
 [3, 4, 5],    becomes:       [3],    |  [4],    |  [5],
 [6, 7, 8]]                   [6]]    |  [7]]    |  [8]]
I can use the block method to do that, but I am not sure whether Eigen recognizes that the subarray for each thread occupies contiguous memory.
The documentation of the Block type mentions an InnerPanel template parameter with the following description:
InnerPanel is true, if the block maps to a set of rows of a row major matrix or to set of columns of a column major matrix (optional). The parameter allows to determine at compile time whether aligned access is possible on the block expression.
Does Eigen know that vectorization over the subarray for each OpenMP thread is possible because each subarray actually occupies contiguous memory?
If not, how can I make Eigen aware of this?
Program:
#include <Eigen/Eigen>
#include <iostream>

int main() {
    // The dimensions of the matrix are not necessarily 8 x 8.
    // They are only known at run time.
    Eigen::MatrixXi x(8, 8);
    x.fill(0);
    int n_parts = 3;
    // Each iteration fills its own disjoint range of columns [st, en).
    #pragma omp parallel for
    for (int i = 0; i < n_parts; ++i) {
        int st = i * x.cols() / n_parts;
        int en = (i + 1) * x.cols() / n_parts;
        x.block(0, st, x.rows(), en - st).fill(i);
    }
    std::cout << x << "\n";
}
Result (g++ test.cpp -I<path to eigen includes> -fopenmp -lgomp):
0 0 1 1 1 2 2 2
0 0 1 1 1 2 2 2
0 0 1 1 1 2 2 2
0 0 1 1 1 2 2 2
0 0 1 1 1 2 2 2
0 0 1 1 1 2 2 2
0 0 1 1 1 2 2 2
0 0 1 1 1 2 2 2