
When I try to reduce a large heap-allocated array with an OpenMP reduction, it segfaults:

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {
  double *test = NULL;
  size_t size = (size_t)1024 * 1024 * 16; // large enough to overflow openmp threads' stack

  test = malloc(size * sizeof(double));

#pragma omp parallel reduction(+ : test[0 : size]) num_threads(2) 
  {
    test[0] = 0;
#pragma omp critical
    {
      printf("frame address: %p\n", __builtin_frame_address(0));
      printf("test: %p\n", test);
    }
  }
  free(test);
  printf("Allocated %zu doubles\n\n", size);
}

Please note that double *test is allocated on the heap, so this is not a duplicate of this and this.

This example works with a small array but segfaults with a large one. The array is allocated on the heap, and the system has enough memory.

A similar issue exists, but the segfault still happens even when the array is allocated on the heap.

There are similar reports in other communities:

https://community.intel.com/t5/Intel-Fortran-Compiler/Segmentation-fault-when-using-large-array-with-OpenMP/m-p/758829

https://forums.oracle.com/ords/apexds/post/segmentation-fault-with-large-arrays-and-openmp-1728

but every solution I found amounts to increasing the OpenMP stack size.

3 Comments

  • if (test == NULL): check whether malloc failed in the first place. Commented Feb 18 at 9:44
  • @Lundin On most systems, malloc provides a virtual allocation and will not fail; if the system runs out of memory, the failure occurs only when a new page has to be mapped as the page is written. Commented Feb 18 at 15:10
  • @Joachim True, but this is requesting hundreds of MB, so I wouldn't assume anything. Commented Feb 18 at 15:14
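
As a minimal sketch of the check suggested in the first comment (reusing the question's names; not part of the original post), the allocation failure can be handled before the parallel region:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
  size_t size = (size_t)1024 * 1024 * 16;
  double *test = malloc(size * sizeof(double));
  if (test == NULL) { /* fail cleanly instead of crashing later */
    perror("malloc");
    return EXIT_FAILURE;
  }
  /* ... use the buffer ... */
  free(test);
  return 0;
}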

1 Answer


I thought there should be a real solution, so I filed a bug on GCC's Bugzilla:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118909

Thanks to Jakub Jelinek, the explanation is that most compilers allocate privatized data on the stack, which is good for performance. If you really need large privatized data, you can either increase the OpenMP stack size by setting the OMP_STACKSIZE environment variable (for example, OMP_STACKSIZE=512M), or use the allocate clause to specify that it should be allocated on the heap.

So the fix is to add allocate(test) to the directive, so that the privatized copy of the test array is allocated on the heap rather than on each thread's stack:

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {
  double *test = NULL;
  size_t size = (size_t)1024 * 1024 * 16;

  test = malloc(size * sizeof(double));

#pragma omp parallel reduction(+ : test[0 : size]) num_threads(2) allocate(test)
  {
    test[0] = 0;
#pragma omp critical
    {
      printf("frame address: %p\n", __builtin_frame_address(0));
      printf("test: %p\n", test);
    }
  }
  free(test);
  printf("Allocated %zu doubles\n\n", size);
}

4 Comments

This part of the OpenMP specification is quite unclear. The allocate clause is supposed to be given an allocator: what if no allocator is specified, like here? How do we know that the allocation is guaranteed to be on the heap?
> For all directives except the target directive, if no allocator is specified in the clause then the memory allocator that is specified by the def-allocator-var ICV will be used for the list items that are specified in the allocate clause. openmp.org/spec-html/5.0/openmpsu55.html
Sure, but then what if the def-allocator-var ICV is undefined (most of the time it is)?
You are right: section 2.5.2 (ICV Initialization) says def-allocator-var is implementation defined. GCC's default value is omp_default_mem_alloc; Clang does not document it in its manual. Which memory allocator should I use? I read the specification, and omp_large_cap_mem_alloc seems suitable (a sketch follows below).
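
For completeness (not from the original thread): a minimal sketch of passing the allocator explicitly in the allocate clause, so the choice no longer depends on the def-allocator-var ICV. It assumes a compiler with OpenMP 5.x allocator support; whether omp_large_cap_mem_alloc is backed by distinct large-capacity memory is implementation defined, and by default it falls back to the default memory space otherwise.

#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
  size_t size = (size_t)1024 * 1024 * 16;
  double *test = malloc(size * sizeof(double));
  if (test == NULL)
    return EXIT_FAILURE;

  /* The privatized per-thread copies of test[0:size] are requested from the
     named allocator rather than from the def-allocator-var default. */
#pragma omp parallel reduction(+ : test[0 : size]) num_threads(2) \
    allocate(omp_large_cap_mem_alloc : test)
  {
    test[0] = 0;
  }

  free(test);
  printf("Allocated %zu doubles\n", size);
  return 0;
}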
