After calling the function test, I print the dtr1 array. I expect every element to be 100, but that is not what I get. Why is that?

#include "ImageUtil2D.h"
#define W 10
#define H 10
#define MAX 100000
#define No_THREADS 10
surface<void,2> surfD;

__global__ void test()
{
for(int i=0;i<W;i++)
    for(int j=0;j<H;j++)
    {
        float a=100;
        surf2Dwrite(a, surfD, i,j, cudaBoundaryModeTrap);
    }
}

int main()
{
int *image = new int[W*H];
float *dtr = new float[W*H];
ImageUtil2D::InitImg(image, dtr, W, H);
const size_t sizef = size_t(W*H)*sizeof(float);

cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc(32, 0, 0, 0, cudaChannelFormatKindFloat);
cudaArray* cuArrD;
cudaMallocArray(&cuArrD, &channelDesc, W*H, 0, cudaArraySurfaceLoadStore);
//cudaMemcpyToArray(cuArrD, 0, 0, dtr, sizef, cudaMemcpyHostToDevice);
cudaBindSurfaceToArray(surfD, cuArrD);

test<<<1, 1>>>();

float *dtr1=new float[W*H];
cudaMemcpyFromArray(&dtr1, cuArrD, 0, 0, sizef, cudaMemcpyDeviceToHost );
ImageUtil2D::Print(dtr1);
return 0;
}
  • Add error-handling code around the CUDA API calls and post where and how it fails. Commented Jun 16, 2011 at 20:03
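
One common way to follow this suggestion is to wrap every runtime call in a checking macro. The sketch below is only an illustration: the macro name CUDA_CHECK and the empty kernel noop are hypothetical and do not come from the question. It relies only on cudaGetErrorString, cudaGetLastError and cudaDeviceSynchronize from the CUDA runtime.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: evaluate a CUDA runtime call once and abort with
// the file, line and error string if it did not return cudaSuccess.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s:%d: %s failed: %s\n",            \
                         __FILE__, __LINE__, #call,                   \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Placeholder kernel, only here so there is a launch to check.
__global__ void noop() {}

int main()
{
    float *d = nullptr;
    CUDA_CHECK(cudaMalloc(&d, 100 * sizeof(float)));

    noop<<<1, 1>>>();
    CUDA_CHECK(cudaGetLastError());        // reports launch failures
    CUDA_CHECK(cudaDeviceSynchronize());   // reports failures during execution

    CUDA_CHECK(cudaFree(d));
    return 0;
}
```

Applied to the code in the question, the same wrapper would go around cudaMallocArray, cudaBindSurfaceToArray and cudaMemcpyFromArray, with the two checks after the kernel launch.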

2 Answers

From the CUDA C Programming Guide 3.2, Section 3.2.4.2.2 (Surface Binding):

Unlike texture memory, surface memory uses byte addressing. This means that the x-coordinate used to access a texture element via texture functions needs to be multiplied by the byte size of the element to access the same element via a surface function.

Try this:

surf2Dwrite(a, surfD, i * 4, j, cudaBoundaryModeTrap);

Hope this helps.

Suggestion: read the whole chapter on surface memory, or you will run into read/write coherency problems sooner than you expect ;)
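
To make the byte addressing concrete, here is a small self-contained sketch. It is not the code from the question: it swaps in the surface-object API (cudaCreateSurfaceObject) for the legacy surface reference, and it allocates the array as W by H rather than W*H by 0, on the assumption that a 2D array was intended (a height of zero gives a 1D array). The names surfaceByteX, fill and fillAndVerify are made up for the example.

```cuda
#include <cstdio>
#include <cstring>
#include <vector>
#include <cuda_runtime.h>

#define W 10
#define H 10

// Hypothetical helper: turn an element index into the byte offset that
// surface functions expect for the x-coordinate.
template <typename T>
__host__ __device__ inline int surfaceByteX(int x)
{
    return x * static_cast<int>(sizeof(T));
}

// Single-thread kernel, as in the question: write value to every element.
__global__ void fill(cudaSurfaceObject_t surf, float value)
{
    for (int y = 0; y < H; ++y)
        for (int x = 0; x < W; ++x)
            surf2Dwrite(value, surf, surfaceByteX<float>(x), y,
                        cudaBoundaryModeTrap);
}

// Returns true when every element read back equals value.
bool fillAndVerify(float value)
{
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaArray_t arr = nullptr;
    if (cudaMallocArray(&arr, &desc, W, H,
                        cudaArraySurfaceLoadStore) != cudaSuccess)
        return false;

    cudaResourceDesc res;
    std::memset(&res, 0, sizeof(res));
    res.resType = cudaResourceTypeArray;
    res.res.array.array = arr;

    cudaSurfaceObject_t surf = 0;
    if (cudaCreateSurfaceObject(&surf, &res) != cudaSuccess) {
        cudaFreeArray(arr);
        return false;
    }

    fill<<<1, 1>>>(surf, value);

    // Copy widths and pitches are given in bytes here as well.
    std::vector<float> host(W * H, 0.0f);
    cudaError_t err = cudaMemcpy2DFromArray(
        host.data(), W * sizeof(float), arr, 0, 0,
        W * sizeof(float), H, cudaMemcpyDeviceToHost);

    cudaDestroySurfaceObject(surf);
    cudaFreeArray(arr);

    if (err != cudaSuccess)
        return false;
    for (float v : host)
        if (v != value)
            return false;
    return true;
}

int main()
{
    std::printf("%s\n", fillAndVerify(100.0f) ? "all elements match"
                                              : "mismatch or CUDA error");
    return 0;
}
```

For float elements surfaceByteX<float>(i) is the same as the i * 4 above. Using sizeof keeps the call correct if the element type changes.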

1 Comment

Oh, my apologies, I was almost sure. I ran a quick test and there is a strange issue with cudaMemcpyFromArray(&dtr1, cuArrD, 0, 0, sizef, cudaMemcpyDeviceToHost );
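The additional issue that pQB points out in the comment on his own answer concerns the call

cudaMemcpyFromArray(&dtr1, cuArrD, 0, 0, sizef, cudaMemcpyDeviceToHost );

Here &dtr1 is the address of the pointer variable itself, not of the buffer it points to, so the copy does not land in the allocated array. It can be fixed by passing the pointer instead: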

cudaMemcpyFromArray(dtr1, cuArrD, 0, 0, sizef, cudaMemcpyDeviceToHost );
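
The difference between the two arguments can be seen in a small sketch. It uses plain cudaMemcpy on linear device memory instead of cudaMemcpyFromArray, purely to keep the example short. The helper copyBack and the buffer dev are hypothetical names.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical typed wrapper: because dst is float* rather than void*,
// passing &dtr1 (a float**) no longer compiles.
bool copyBack(float *dst, const float *src, size_t n)
{
    return cudaMemcpy(dst, src, n * sizeof(float),
                      cudaMemcpyDeviceToHost) == cudaSuccess;
}

int main()
{
    const size_t n = 100;
    float *dtr1 = new float[n];

    float *dev = nullptr;
    if (cudaMalloc(&dev, n * sizeof(float)) != cudaSuccess) {
        delete[] dtr1;
        return 1;
    }
    cudaMemset(dev, 0, n * sizeof(float));

    // dtr1 is the heap buffer; &dtr1 is where the pointer variable lives.
    std::printf("buffer (dtr1):   %p\n", static_cast<void *>(dtr1));
    std::printf("pointer (&dtr1): %p\n", static_cast<void *>(&dtr1));

    // Correct: the destination is the buffer.
    bool ok = copyBack(dtr1, dev, n);
    std::printf("copy %s\n", ok ? "succeeded" : "failed");

    // copyBack(&dtr1, dev, n);  // would not compile: float** is not float*

    cudaFree(dev);
    delete[] dtr1;
    return ok ? 0 : 1;
}
```

The runtime copy functions take a void* destination, so &dtr1 compiles without any warning. A typed wrapper like the one above is one way to let the compiler catch the mistake.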
