
I just want to pass a device function as an argument to a host function; the host function can then launch kernels that use this device-side function.

I tried the usual C++ way (pass by pointer/reference), and the CUDA debugger told me the kernel could not launch.

Update:

What I want to do is:

__host__ void hostfunction(int a, int (*DeviceFunction)(int))
{
    /* ...do something... */
    somekernel<<<blocks, threads>>>(in, DeviceFunction);
}

And launch the host with:

hostfunction(x, &SomeDeviceFunctionTemplate<int>);
  • Your question is not entirely clear, at least to me. If you could post an example of what you tried, where the kernel did not launch, that might help. Commented Aug 28, 2013 at 13:57
  • Since __host__ functions can't take the addresses of __device__ functions, you basically need to write a short __global__ function that takes the address of the __device__ function of interest and stores it to memory. Your __host__ function can then read that function pointer from memory and pass it to somekernel. Commented Aug 29, 2013 at 1:08

2 Answers


This example might be of interest:

$ cat t237.cu
#include <stdio.h>


__device__ int f1(){ printf("dev f1\n"); return 0;}
__device__ int f2(){ printf("dev f2\n"); return 0;}
__device__ int f3(){ printf("dev f3\n"); return 0;}

typedef int (*fptr_t)();

__device__ fptr_t fptrf1 = f1;
__device__ fptr_t fptrf2 = f2;
__device__ fptr_t fptrf3 = f3;


__global__ void mykernel(fptr_t fptr){

  fptr();
  printf("executed\n");
}

int main(){

  fptr_t hf1, hf2, hf3;
  cudaMemcpyFromSymbol(&hf1, fptrf1, sizeof(fptr_t));
  cudaMemcpyFromSymbol(&hf2, fptrf2, sizeof(fptr_t));
  cudaMemcpyFromSymbol(&hf3, fptrf3, sizeof(fptr_t));
  mykernel<<<1,1>>>(hf1);
  cudaDeviceSynchronize();
  mykernel<<<1,1>>>(hf2);
  cudaDeviceSynchronize();
  mykernel<<<1,1>>>(hf3);
  cudaDeviceSynchronize();
  return 0;
}
$ nvcc -arch=sm_20 -O3 -o t237 t237.cu
$ ./t237
dev f1
executed
dev f2
executed
dev f3
executed
$

I think this is roughly along the lines of what Jared was suggesting. As he mentioned, taking the address of a __device__ function in host code, like this:

&SomeDeviceFunctionTemplate<int>

will not be possible, assuming SomeDeviceFunctionTemplate refers to a __device__ function.


3 Comments

  • Can I adapt this somehow for templated code instead of the code you have in main()? Or is there no way around explicitly defining the device-side pointer for each and every function I plan to use?
  • Some examples linked here use templating.
  • So, this example replaces the need for a global pointer variable with the need for a global setup kernel, which copies the device function's pointer to a location of the launcher's choosing. That doesn't help much... reading on. There's the question about kernel addresses, which are usable. But how would I leverage that to get at the address of arbitrary device functions?
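On the template question above, one possible adaptation is to take the address inside a setup kernel, where the template instantiation is legal device code. This is a sketch under that assumption; the names tsetup and d_tfptr are made up for illustration:

```cuda
#include <stdio.h>

template <typename T>
__device__ T SomeDeviceFunctionTemplate(T x) { return x + 1; }

typedef int (*fptr_t)(int);
__device__ fptr_t d_tfptr;

// The template is instantiated here, in device code, where its
// address can legally be taken.
__global__ void tsetup() { d_tfptr = SomeDeviceFunctionTemplate<int>; }

__global__ void use(fptr_t fptr) { printf("result: %d\n", fptr(41)); }

int main()
{
    tsetup<<<1, 1>>>();
    cudaDeviceSynchronize();

    fptr_t h_fptr;
    cudaMemcpyFromSymbol(&h_fptr, d_tfptr, sizeof(fptr_t));

    use<<<1, 1>>>(h_fptr);
    cudaDeviceSynchronize();
    return 0;
}
```

Note that each instantiation still needs its own store into a device-side pointer (or a templated setup kernel per type), which matches the commenter's observation that this does not reach arbitrary device functions.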

It'd be helpful if you could post an example of what you are trying to do, but one thing to check is that you are compiling and running on Fermi (sm_20) or later, since older GPUs did not support non-inlined function calls.

Check the compute capability of your device (needs 2.0 or later) and check your nvcc command line (needs -arch=sm_20 or later, or the -gencode equivalent).
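To check the compute capability programmatically, a small query program along these lines should work (a sketch using the standard cudaGetDeviceProperties runtime call):

```cuda
#include <stdio.h>

int main()
{
    int count = 0;
    cudaGetDeviceCount(&count);
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        // Device function pointers require compute capability 2.0 or later.
        printf("Device %d: %s, compute capability %d.%d\n",
               i, prop.name, prop.major, prop.minor);
    }
    return 0;
}
```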

