I want to use the thread ID to index an array that is defined as a global variable, but incrementing its elements by one gives wrong results. Take a look below:

// initial arrays: myU[0..3]={0,0,0,0}, myindex[0..3]={0,1,1,3}
1- tid=0,1,2,3 // tid is the thread index
2- id=myindex[tid]; //id=0,1,1,3
3- myU[id]=myU[id]+1; 
4- if (myU[id]>1)
     //print("id"); // it should print '1'

I expected that after line 3 runs I would have myU[0]=1, myU[1]=2, myU[3]=1. Instead, the myU array contains strange values such as myU[0]=0, myU[1]=1, myU[3]=3, and I don't know why.

My final goal is to obtain, in line 4, every id whose element was incremented more than once.

1 Answer

If myU[1] is written by two different threads, the result is undefined. You need to use atomicAdd to obtain myU[1]==2.

The CUDA programming guide states:

If a non-atomic instruction executed by a warp writes to the same location in global or shared memory for more than one of the threads of the warp, the number of serialized writes that occur to that location varies depending on the compute capability of the device and which thread performs the final write is undefined.
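
To illustrate, here is a minimal sketch of the atomicAdd approach using the four-element arrays from the question. The kernel name countIds, the host variable names, and the single-block launch configuration are my own assumptions, not taken from the asker's code. The duplicate check is done on the host after the kernel has finished, so every increment is guaranteed to be visible before the counts are read.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread increments the counter selected by
// myindex[tid]. atomicAdd performs the read-modify-write as one indivisible
// operation, so threads that map to the same id are all counted.
__global__ void countIds(const int *myindex, int *myU, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < n) {
        int id = myindex[tid];
        atomicAdd(&myU[id], 1);
    }
}

int main()
{
    const int n = 4;
    int h_index[n] = {0, 1, 1, 3};   // values from the question
    int h_U[n]     = {0, 0, 0, 0};

    int *d_index = nullptr;
    int *d_U = nullptr;
    cudaMalloc(&d_index, n * sizeof(int));
    cudaMalloc(&d_U, n * sizeof(int));
    cudaMemcpy(d_index, h_index, n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemset(d_U, 0, n * sizeof(int));   // counters must start at zero

    countIds<<<1, n>>>(d_index, d_U, n);

    // Wait for the kernel and surface any launch or execution error.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaMemcpy(h_U, d_U, n * sizeof(int), cudaMemcpyDeviceToHost);

    // Report every id that was incremented more than once.
    for (int i = 0; i < n; ++i) {
        if (h_U[i] > 1) {
            printf("%d\n", i);
        }
    }

    cudaFree(d_index);
    cudaFree(d_U);
    return 0;
}
```

Note that testing myU[id]>1 inside the same kernel immediately after the increment is still racy, because another thread may not have performed its add yet. If the check has to happen on the device, one option is to use the value returned by atomicAdd, which is the old value stored at that address: a thread that gets back 1 knows it is the second one to hit that id.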

2 Comments

I used atomicAdd(&myU[id],1), but I still get the same wrong result!
Then you should probably provide complete, compilable code that demonstrates the problem. In fact, SO expects: "Questions concerning problems with code you've written must describe the specific problem — and include valid code to reproduce it — in the question itself. See SSCCE.org for guidance."
