Segmentation Fault with 3D array

Question

I am trying to work with 3D arrays in CUDA (200x200x100).

The moment I change my z dimension (model_num) from 4 to 5, I get a segmentation fault. Why, and how can I fix it?

const int nrcells = 200;
const int nphicells = 200;
const int model_num = 5; //So far, 4 is the maximum model_num that works. At 5 and after, there is a segmentation fault

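__global__ void kernel(float* mgridb)
{
    const unsigned long long int i = (blockIdx.y * gridDim.x + blockIdx.x) * blockDim.x + threadIdx.x;

    // Assumed mapping, inferred from the launch configuration in main():
    // one thread per phi cell, blocks laid out as (nrcells, model_num).
    const int tx = threadIdx.x;
    const int ty = blockIdx.x;
    const int tz = blockIdx.y;

    if(tx >= 0 && tx < nphicells && ty >= 0 && ty < nrcells && tz >= 0 && tz < model_num){
        //Do stuff with mgridb[i]
    }
}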

int main (void)
{

    unsigned long long int size_matrices = nphicells*nrcells*model_num; 
    unsigned long long int mem_size_matrices = sizeof(float) * size_matrices;

    float *h_mgridb = (float *)malloc(mem_size_matrices);
    float mgridb[nphicells][nrcells][model_num];

    for(int k = 0; k < model_num; k++){
        for(int j = 0; j < nrcells; j++){
            for(int i = 0; i < nphicells; i++){
                mgridb[i][j][k] = 0;
            }
        }
    }
    float *d_mgridb;

    cudaMalloc( (void**)&d_mgridb, mem_size_matrices );
    cudaMemcpy(d_mgridb, h_mgridb, mem_size_matrices, cudaMemcpyHostToDevice);

    int threads = nphicells;
    uint3 blocks = make_uint3(nrcells,model_num,1);
    kernel<<<blocks,threads>>>(d_mgridb);
    cudaMemcpy( h_mgridb, d_mgridb, mem_size_matrices, cudaMemcpyDeviceToHost);
    cudaFree(d_mgridb);
    return 0;
}

Please pay a little more attention to formatting and content of code you post in questions. The code as you posted it was unnecessarily hard to read and contained unbalanced {}.
– talonmies, Commented Jul 9, 2013 at 17:53

Robert Crovella · Accepted Answer · 2022-12-12 22:03:16Z

This is getting stored on the stack:

float mgridb[nphicells][nrcells][model_num];

Your stack space is limited. When you exceed the amount you can store on the stack, you get a seg fault, either at the point of allocation or as soon as you try to access the array.

Use malloc instead. That allocates heap storage, which has much higher limits.

None of the above has anything to do with CUDA. Furthermore, it's not unique or specific to "3D" arrays. Any large stack-based allocation (e.g. a 1D array) is going to have the same trouble.
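To see how quickly the array outgrows a typical stack, here is a minimal sketch of the size arithmetic. The helper name grid_bytes is hypothetical and not part of the original code, and the figures assume a 4-byte float. Under that assumption, model_num = 4 needs 640,000 bytes, model_num = 5 needs 800,000 bytes, and the intended 200x200x100 grid needs 16,000,000 bytes. Default stack limits vary by operating system and build settings, but they are commonly only a few megabytes, so the exact point of failure depends on the platform.

```cpp
#include <cstddef>

// Bytes needed for a float grid of nphicells x nrcells x model_num elements.
// grid_bytes is a hypothetical helper used only for illustration.
constexpr std::size_t grid_bytes(std::size_t nphicells,
                                 std::size_t nrcells,
                                 std::size_t model_num) {
    return sizeof(float) * nphicells * nrcells * model_num;
}
```

Calling grid_bytes(200, 200, 5) gives the footprint of the array declared in the question. The same figure applies to a 1D array of the same element count, which is why the dimensionality does not matter.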

You may also have to adjust how you access the array, but it's not difficult to handle a flattened array using pointer indexing.
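The flattened access can be sketched as follows. This is a minimal illustration, not the asker's final code: idx3 and alloc_zeroed_grid are hypothetical names, and the layout (phi index varying fastest) is an assumption chosen to match the linear index computed in the kernel. Note that this layout differs from the original mgridb[i][j][k] declaration, where the last index varies fastest.

```cpp
#include <cstddef>
#include <cstdlib>

const int nphicells = 200;
const int nrcells = 200;
const int model_num = 5;

// Flat offset of element (i, j, k), with i (phi) varying fastest.
// idx3 is a hypothetical helper name.
inline std::size_t idx3(int i, int j, int k) {
    return static_cast<std::size_t>(i)
         + static_cast<std::size_t>(j) * nphicells
         + static_cast<std::size_t>(k) * nphicells * nrcells;
}

// Allocate the grid on the heap and zero it.
// Returns nullptr if the allocation fails; the caller must free() the result.
float *alloc_zeroed_grid() {
    const std::size_t count =
        static_cast<std::size_t>(nphicells) * nrcells * model_num;
    float *grid = static_cast<float *>(std::malloc(count * sizeof(float)));
    if (grid == nullptr) {
        return nullptr;
    }
    for (int k = 0; k < model_num; k++) {
        for (int j = 0; j < nrcells; j++) {
            for (int i = 0; i < nphicells; i++) {
                grid[idx3(i, j, k)] = 0.0f;
            }
        }
    }
    return grid;
}
```

The pointer returned by alloc_zeroed_grid can be passed directly to cudaMemcpy as the host buffer, since the memory is one contiguous block of nphicells * nrcells * model_num floats.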

Your code also looks odd: you create an appropriately sized array h_mgridb using malloc and then copy that array to the device (into d_mgridb). It's not clear what purpose mgridb serves in your code. h_mgridb and mgridb are not the same array.

edited Dec 12, 2022 at 22:03

answered Jul 9, 2013 at 17:39

John W. Over a year ago

This fixes the problem. I changed float mgridb[nphicells][nrcells][model_num] to float *mgridb = (float *)malloc(mem_size_matrices). Also, in the initialization "for" loop, I referenced it as a 1D array, so mgridb[i + (j*nphicells) + (k*nphicells*nrcells)] = 0. I can see what you're saying: it does not make much sense to have h_mgridb if I am already allocating mgridb.
