I am wondering how to properly copy the value of a host variable directly to a device variable.

I tried to use cudaMemcpy, but it did not work: I was getting only garbage or nothing.

Pixel_GPU* Device_Array{};
//__device__ size_t size{};
size_t size{};
cudaMalloc((void**)& Device_Array, global_size * sizeof(Pixel_GPU));
cudaMalloc((void**) size, sizeof(size_t));
cudaMemset(&size, 0, sizeof(size_t));

cudaMemcpy(Device_Array, Host_Array, global_size * sizeof(Pixel_GPU), HostToDevice);
cudaMemcpy(&size, &global_size, sizeof(size_t), HostToDevice);
_STD cout << global_size << NEW_LINE;
Show_Device_Variables <<<2, 1>>>(&size);

cudaFree(&size);
cudaFree(Device_Array);

free(Host_Array);

For instance, global_size may be as large as 1,000,000. A size_t can hold that value, but "size" (the device array size) still ends up uninitialized.

  • Did you initialize the host array? You must do that before calling cudaMemcpy. Commented Aug 30, 2019 at 11:40
  • The sensible way to pass a scalar quantity is by value, not using cudaMalloc at all. You might want to study a basic CUDA sample code like vectorAdd. Commented Aug 30, 2019 at 14:48
  • Thanks, guys, for your advice. I am very new to this, but I learn every day and I try to write good CUDA code. Thanks once again. Commented Aug 30, 2019 at 18:38
  • Yes, I initialized Host_Array earlier; this is only a snippet of my whole code. Commented Aug 30, 2019 at 18:39
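
The by-value approach suggested in the comments can be sketched as follows. This is only a minimal illustration, not the asker's actual code: the kernel store_size and the helper size_seen_by_kernel are hypothetical names, and the one-element output buffer exists only so the host can check what the kernel received.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical kernel: the element count arrives by value as an ordinary
// kernel argument, so the scalar needs no cudaMalloc or cudaMemcpy.
__global__ void store_size(std::size_t size, std::size_t* out)
{
    // A single thread writes the value back so the host can inspect it.
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        *out = size;
    }
}

// Hypothetical helper: launches the kernel and reports the value the
// device saw through *result. Returns false if any CUDA call fails.
bool size_seen_by_kernel(std::size_t global_size, std::size_t* result)
{
    std::size_t* d_out = nullptr;  // device pointer for the check value only
    if (cudaMalloc((void**)&d_out, sizeof(std::size_t)) != cudaSuccess) {
        return false;
    }

    // global_size is copied into the kernel's argument list at launch.
    store_size<<<2, 1>>>(global_size, d_out);
    bool ok = (cudaGetLastError() == cudaSuccess);

    // cudaMemcpy on the default stream waits for the kernel to finish.
    ok = ok && (cudaMemcpy(result, d_out, sizeof(std::size_t),
                           cudaMemcpyDeviceToHost) == cudaSuccess);

    cudaFree(d_out);
    return ok;
}
```

Calling size_seen_by_kernel(global_size, &seen) from host code should leave seen equal to global_size when every call succeeds.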

1 Answer

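You're passing the value of size, reinterpreted as a pointer, to cudaMalloc. Because size is value-initialized by {}, that value is zero, so cudaMalloc has no valid location in which to store the address of the device memory it allocates.
This is not a good idea; with an uninitialized size, CUDA would probably write that address to some arbitrary place. Either way, the later calls receive &size, which is the address of a host variable and not a device pointer.
If you're lucky, it crashes or reports an error, but if you're unlucky it might just seem like nothing happened.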

All cudaMalloc calls follow the same pattern:

T* p;  // This is going to be a device pointer.
cudaMalloc((void**) &p, ... // Pass the address of the pointer.

so you should have

size_t* size{};
cudaMalloc((void**) &size, sizeof(size_t));
cudaMemset(size, 0, sizeof(size_t));
// ...
cudaMemcpy(size, &global_size, sizeof(size_t), HostToDevice);
Show_Device_Variables <<<2, 1>>>(size);

cudaFree(size);
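
For reference, here is a self-contained sketch of the same pattern with the return codes checked. It is an illustration under assumptions, not the original program: the kernel increment_size and the helper round_trip_size are hypothetical names, and the standard cudaMemcpyHostToDevice and cudaMemcpyDeviceToHost constants stand in for the HostToDevice alias used above.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical kernel: reads the scalar through a device pointer and adds
// one, so the host can confirm that the kernel really saw the copied value.
__global__ void increment_size(std::size_t* size)
{
    // Only one thread modifies the value to avoid a data race.
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        *size += 1;
    }
}

// Hypothetical helper: copies global_size to the device, runs the kernel,
// and copies the result back. Returns false if any CUDA call fails.
bool round_trip_size(std::size_t global_size, std::size_t* result)
{
    std::size_t* d_size = nullptr;  // device pointer held in a host variable

    // Pass the address of the pointer so cudaMalloc can fill it in.
    if (cudaMalloc((void**)&d_size, sizeof(std::size_t)) != cudaSuccess) {
        return false;
    }

    // The destination is the device pointer itself, not its address.
    bool ok = (cudaMemcpy(d_size, &global_size, sizeof(std::size_t),
                          cudaMemcpyHostToDevice) == cudaSuccess);
    if (ok) {
        increment_size<<<2, 1>>>(d_size);
        ok = (cudaGetLastError() == cudaSuccess);
    }

    // This blocking copy also waits for the kernel to finish.
    ok = ok && (cudaMemcpy(result, d_size, sizeof(std::size_t),
                           cudaMemcpyDeviceToHost) == cudaSuccess);

    cudaFree(d_size);  // again the pointer itself, not &d_size
    return ok;
}
```

Checking each return value turns a mistake such as passing a host address into a visible error code instead of silent garbage.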