I am writing a CUDA kernel that needs an array of aligned structs allocated on the device. My computations produce the correct results, and I need to write those values to this array starting from index 0.

When I write to this array and copy the results back to the host, some of the entries show up as zero.

Clearly, I am not incrementing the index the way I need to. I tried using a counter that I increment with atomicAdd(), but I still get some zero values.

To be precise, I may use 1000 threads in my kernel for the computations, but my allocated output array can have fewer than 100 or more than 10000 elements.

My question is: how do I make each of these threads write its value (as it is calculated) to exactly one location in the array and increment the array index/counter by 1, without one thread overwriting another's value?

Any help will be appreciated. Thanks in advance.

  • Some code illustrating what the kernel does would be very helpful. Commented Jul 18, 2012 at 6:30

1 Answer

You can use atomicAdd(). It returns the old value, so you use that value as the index:

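// i is assumed to be an int counter in device memory, set to 0 before the launch
int old_i = atomicAdd(&i, 1);  // reserves a unique slot and returns the previous value
out_array[old_i] = val;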
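To show the two lines above in context, here is a minimal sketch of a complete kernel plus a host helper. The struct Result, the kernel collect, the helper run_collect, and the rule that only even-numbered threads produce a value are all hypothetical and chosen for illustration. The bounds check is an assumption added because the question says the output array can be smaller than the number of threads.

```cuda
#include <algorithm>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical aligned output struct; the question does not show the real one.
struct __align__(16) Result {
    int   thread_id;
    float value;
};

__global__ void collect(Result* out, int* counter, int capacity, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= n) return;

    // Hypothetical rule: only even-numbered threads produce a result.
    if (tid % 2 != 0) return;

    // Reserve the slot first, then write into it. atomicAdd returns the
    // value the counter held before the increment, so each thread gets a
    // distinct index.
    int slot = atomicAdd(counter, 1);
    if (slot < capacity) {              // drop results that do not fit
        out[slot].thread_id = tid;
        out[slot].value     = 0.5f * tid;
    }
}

// Launches the kernel, copies the stored results into host_out, and
// returns how many results were produced (which may exceed capacity).
int run_collect(int n, int capacity, std::vector<Result>& host_out)
{
    Result* d_out     = nullptr;
    int*    d_counter = nullptr;
    cudaMalloc(&d_out, capacity * sizeof(Result));
    cudaMalloc(&d_counter, sizeof(int));
    cudaMemset(d_counter, 0, sizeof(int));   // the counter must start at 0

    const int threads = 256;
    collect<<<(n + threads - 1) / threads, threads>>>(d_out, d_counter, capacity, n);

    int produced = 0;
    cudaMemcpy(&produced, d_counter, sizeof(int), cudaMemcpyDeviceToHost);

    host_out.resize(std::min(produced, capacity));
    cudaMemcpy(host_out.data(), d_out, host_out.size() * sizeof(Result),
               cudaMemcpyDeviceToHost);

    cudaFree(d_out);
    cudaFree(d_counter);
    return produced;
}
```

Calling run_collect(1000, 600, results) from host code fills results with the stored entries. Their order is not deterministic, since slots are handed out in whatever order the threads reach the atomicAdd().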

However, you will get poor performance if many of your threads write out results, because atomicAdd() on a single counter will (indirectly) serialize all the writes. In that case, you should let each thread write its result, if any, to a slot set aside for that thread, and then use a compaction algorithm (see thrust::copy_if) to gather up the results.
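As a rough sketch of the compaction approach, the example below gives every thread its own slot and then packs the valid entries with thrust::copy_if. The kernel compute, the helper run_compaction, the HasResult predicate, and the convention that a negative value means "no result" are assumptions made for this illustration. Plain floats stand in for the aligned struct to keep it short.

```cuda
#include <vector>
#include <cuda_runtime.h>
#include <thrust/copy.h>
#include <thrust/device_vector.h>

// Hypothetical convention: a negative value marks a slot with no result.
struct HasResult {
    __host__ __device__ bool operator()(float v) const { return v >= 0.0f; }
};

__global__ void compute(float* slots, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= n) return;

    // Each thread writes only to its own slot, so no atomics are needed.
    // Hypothetical rule: threads whose index is a multiple of 3 produce a value.
    slots[tid] = (tid % 3 == 0) ? 2.0f * tid : -1.0f;
}

// Runs the kernel on n threads and returns the compacted results on the host.
std::vector<float> run_compaction(int n)
{
    thrust::device_vector<float> slots(n);    // one slot per thread
    thrust::device_vector<float> packed(n);   // worst case: every thread has a result

    const int threads = 256;
    compute<<<(n + threads - 1) / threads, threads>>>(
        thrust::raw_pointer_cast(slots.data()), n);
    cudaDeviceSynchronize();

    // copy_if is stable: surviving elements keep their relative order.
    auto end = thrust::copy_if(slots.begin(), slots.end(), packed.begin(), HasResult());
    packed.resize(end - packed.begin());

    std::vector<float> host(packed.size());
    thrust::copy(packed.begin(), packed.end(), host.begin());
    return host;
}
```

The trade-off is memory: this needs one slot per thread plus space for the packed output, whereas the atomicAdd() version only needs the output array and a counter.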

1 Comment

Thanks, this solved my problem. I was calling atomicAdd() after assigning the values to the array. And yes, it is expensive for my computations. I am looking for other techniques that might speed up my computations while using memory more efficiently.
