I can allocate GPU memory (cudaMalloc) for static multi-dimensional arrays, i.e. ones declared as int b[size1][size2][size3][size4][size5]...;. How can I allocate memory (cudaMalloc) on the GPU for a dynamic array, for example int ***a; (higher dimensions are also possible), where all of a's dimensions have distinct sizes? Assume that a has already been allocated with those sizes on the CPU side. A simple example would be appreciated, thanks!
1 Answer
Use cudaMalloc to allocate memory dynamically. For high-dimensional arrays, just compute the total, flattened size and access the array in strides:
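void * p;
cudaError_t e = cudaMalloc(&p, dim1 * dim2 * dim3 /* ... */ * sizeof(int)); // size is given in bytes
if (e != cudaSuccess) { /* error! */ }
// Access (inside a kernel: p is a device pointer and must not be dereferenced on the host)
int * arr = static_cast<int *>(p);
arr[i1 * dim2 * dim3 + i2 * dim3 + i3] = 2; // etc., in strides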
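Since the question starts from an int *** that already exists on the CPU side, it may help to see how such an array could be packed into one contiguous buffer that uses the same strided layout. The following is only a minimal host-side sketch: the helper names flat_index and flatten3d are hypothetical (they are not part of CUDA), and it assumes the array is rectangular, i.e. every a[i1] holds dim2 rows of dim3 ints.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: offset of element (i1, i2, i3) in a row-major
// buffer of shape dim1 x dim2 x dim3. Equivalent to
// i1 * dim2 * dim3 + i2 * dim3 + i3.
inline std::size_t flat_index(std::size_t i1, std::size_t i2, std::size_t i3,
                              std::size_t dim2, std::size_t dim3) {
    return (i1 * dim2 + i2) * dim3 + i3;
}

// Hypothetical helper: copy a pointer-to-pointer array (int ***) into one
// contiguous buffer. Assumes a is rectangular with the given sizes.
std::vector<int> flatten3d(int ***a, std::size_t dim1, std::size_t dim2,
                           std::size_t dim3) {
    std::vector<int> flat(dim1 * dim2 * dim3);
    for (std::size_t i1 = 0; i1 < dim1; ++i1)
        for (std::size_t i2 = 0; i2 < dim2; ++i2)
            for (std::size_t i3 = 0; i3 < dim3; ++i3)
                flat[flat_index(i1, i2, i3, dim2, dim3)] = a[i1][i2][i3];
    return flat;
}
```

The contiguous buffer is what would then be handed to the device: flat.data() as the source and flat.size() * sizeof(int) as the byte count of a cudaMemcpy with cudaMemcpyHostToDevice into the pointer returned by cudaMalloc. Copying the int *** itself would not work, because its inner pointers are host addresses.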
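(For 2- or 3-dimensional arrays, you may also like to use cudaMallocPitch or cudaMalloc3D, which allocate padded (pitched) linear memory, or cudaMalloc3DArray, which allocates an opaque CUDA array intended for texture and surface access.)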
There's also a corresponding host version, cudaMallocHost, which allocates page-locked host memory that's directly accessible by the device.
cudaMalloc does not allocate static arrays (it always and exclusively allocates dynamic memory). Don't confuse static objects with statically-known array sizes.