I need to pass some data to an OpenCL kernel using PyOpenCL, or some workaround in Python. The kernel reads the data as a struct, and I can't change the kernel because it works fine and is part of a much bigger project that my code must work with.

The kernel looks like this:

typedef struct VglClStrEl{ 
    float data[VGL_ARR_CLSTREL_SIZE];
    int ndim;
    int shape[VGL_ARR_SHAPE_SIZE];
    int offset[VGL_ARR_SHAPE_SIZE];
    int size;
} VglClStrEl;

typedef struct VglClShape{ 
    int ndim;
    int shape[VGL_ARR_SHAPE_SIZE];
    int offset[VGL_ARR_SHAPE_SIZE];
    int size;
} VglClShape;

__kernel void kernel(__global unsigned char* img_input, 
                     __global unsigned char* img_output,  
                     __constant VglClShape* img_shape,
                     __constant VglClStrEl* window)
{

    // do what is needed

}

As you can see, the VglClShape and VglClStrEl structs contain fixed-size arrays alongside scalar members, and VglClStrEl mixes a float array with int arrays.

The workaround in [1] only supports structs whose arrays all share one type (or I failed to find a way to make it work with multiple array types).

The approach in [2] is the PyOpenCL documentation's reference for passing Python data to an OpenCL kernel struct. This approach doesn't support arrays at all.

So, how can I pass the Python data in a form the OpenCL kernel can read? I already have all the data on the Python side; I just need to know how to get it from Python to the kernel.

Before you ask: I am using Python 3 and I CANNOT CHANGE THE KERNEL.

And yes, the array sizes are static. You can assume something like this:

VGL_ARR_CLSTREL_SIZE=256;
VGL_ARR_SHAPE_SIZE=20;

[1] Passing struct with pointer members to OpenCL kernel using PyOpenCL

[2] https://documen.tician.de/pyopencl/howto.html#how-to-use-struct-types-with-pyopencl

  • Can you use the standard ctypes module? Commented Jun 17, 2018 at 3:35
  • @PM2Ring As far as I know, ctypes is a Python foreign function library for using C things from Python. What I need is the inverse: pass Python data to OpenCL in a way it understands (paying attention to memory layout symmetry between both). Can I use ctypes to do something like that? Commented Jun 19, 2018 at 12:34
  • I don't know OpenCL, but I've used ctypes to interact with the OpenSSL library. See here for an example. And the final codeblock here has another example. Commented Jun 19, 2018 at 12:48

1 Answer

There is a hackish way to do this that requires some tedious byte wrangling. Presumably you are OK with deploying a small OpenCL probing kernel? (PyOpenCL does this under the hood for some ops in any case)

The basic idea is to:

  • find out how the OpenCL device lays out the members of your structs (their byte offsets and the total struct size) by running a single-instance kernel
  • create a numpy byte array to match the size of the OpenCL struct
  • byte-wise copy each element of your Python struct into this array
  • when invoking your unchangeable OpenCL kernel, pass this array via a bag of bytes buffer

The following kernel does the job:

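// Probe kernel: reports the size of each struct and the byte offset of each member.
// VglClStrEl, VglClShape and the VGL_ARR_* macros must be defined in the same program source.
__kernel void get_struct_sizes( __global uint *struct_sizes )
{
    const uint global_id = get_global_id(0u)+get_global_id(1u)*get_global_size(0u);
    VglClStrEl vgclstrel;
    VglClShape vgclshape;
    uintptr_t offset;  // pointer-sized, so addresses are not truncated on 64-bit devices

    printf("In GPU (probing):\n Kernel instance = %u\n", global_id);

    // Only the first work-item writes the results
    if (global_id==0) {
        // VglClStrEl: total size, then member offsets relative to the first member (data)
        offset = (uintptr_t)&(vgclstrel.data);
        struct_sizes[0] = (uint)sizeof(vgclstrel);
        struct_sizes[1] = (uint)((uintptr_t)&(vgclstrel.ndim)-offset);
        struct_sizes[2] = (uint)((uintptr_t)&(vgclstrel.shape)-offset);
        struct_sizes[3] = (uint)((uintptr_t)&(vgclstrel.offset)-offset);
        struct_sizes[4] = (uint)((uintptr_t)&(vgclstrel.size)-offset);
        // VglClShape: total size, then member offsets relative to the first member (ndim)
        offset = (uintptr_t)&(vgclshape.ndim);
        struct_sizes[5] = (uint)sizeof(vgclshape);
        struct_sizes[6] = (uint)((uintptr_t)&(vgclshape.shape)-offset);
        struct_sizes[7] = (uint)((uintptr_t)&(vgclshape.offset)-offset);
        struct_sizes[8] = (uint)((uintptr_t)&(vgclshape.size)-offset);
    }
    return;
}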

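Execute this kernel and read struct_sizes back on the host. The code below uses vgclshape_sizes for the VglClShape part of the result, i.e. the last four entries (struct_sizes[5:9]: total size, then the offsets of shape, offset and size). Then create this array: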

img_shape  = np.zeros((vgclshape_sizes[0]), dtype=np.uint8)

and copy into it what you need:

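def copy_into_byte_array(value, byte_array, offset):
    # value must already have the dtype the kernel expects,
    # e.g. np.int32 for ndim/size and an np.int32 array for shape/offset
    raw = np.frombuffer(np.asarray(value).tobytes(), dtype=np.uint8)
    offset = int(offset)
    byte_array[offset:offset + raw.size] = raw

copy_into_byte_array(ndim,   img_shape, 0)
copy_into_byte_array(shape,  img_shape, vgclshape_sizes[1])
copy_into_byte_array(offset, img_shape, vgclshape_sizes[2])
copy_into_byte_array(size,   img_shape, vgclshape_sizes[3])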

I've skipped some steps here; filling them in you'll find this approach works. I was able to pass a demo struct to a dummy copy of your inviolate kernel.
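
One of the skipped steps is turning the nine numbers written by the probing kernel into something easier to work with. A minimal sketch follows; the helper name `split_probe_result` and the `example` values are hypothetical (the values are what one would expect with 4-byte members, no padding, and the array sizes 256 and 20 from the question), and the index mapping simply mirrors the order in which the probing kernel writes its results.

```python
# Order in which the probing kernel fills struct_sizes
STREL_FIELDS = ("sizeof", "ndim", "shape", "offset", "size")  # struct_sizes[0:5]
SHAPE_FIELDS = ("sizeof", "shape", "offset", "size")          # struct_sizes[5:9]


def split_probe_result(struct_sizes):
    """Split the 9-element probe output into one layout dict per struct."""
    values = [int(v) for v in struct_sizes]
    if len(values) != 9:
        raise ValueError("expected 9 probe values, got %d" % len(values))
    strel = dict(zip(STREL_FIELDS, values[0:5]))
    strel["data"] = 0  # offsets are measured from the first member
    shape = dict(zip(SHAPE_FIELDS, values[5:9]))
    shape["ndim"] = 0  # offsets are measured from the first member
    return strel, shape


# Hypothetical probe output, not measured on a real device
example = [1192, 1024, 1028, 1108, 1188, 168, 4, 84, 164]
strel_layout, shape_layout = split_probe_result(example)
```

With the layouts named like this, the byte array for each struct is allocated with `layout["sizeof"]` bytes and each member is copied in at `layout[member_name]`.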

I would be interested to hear if there are more elegant ways to do any/all of these steps. I would also expect there will be problems with endianness etc that would otherwise be transparent. With luck you can work around them.
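
One possibly tidier route, offered as an untested alternative rather than what was done above, is to describe the structs with the standard `ctypes` module and let it produce the bytes. This sketch assumes the device uses the same layout a host C compiler would (plausible here because every member is 4 bytes wide, but still an assumption) and the same byte order as the host. The helper `host_offsets` is a made-up name; its output can be compared against the probing kernel's numbers before trusting the layout.

```python
import ctypes

VGL_ARR_CLSTREL_SIZE = 256  # values suggested in the question
VGL_ARR_SHAPE_SIZE = 20


class VglClShape(ctypes.Structure):
    # Mirrors the OpenCL typedef member by member (assumed host-like layout)
    _fields_ = [
        ("ndim", ctypes.c_int32),
        ("shape", ctypes.c_int32 * VGL_ARR_SHAPE_SIZE),
        ("offset", ctypes.c_int32 * VGL_ARR_SHAPE_SIZE),
        ("size", ctypes.c_int32),
    ]


class VglClStrEl(ctypes.Structure):
    _fields_ = [
        ("data", ctypes.c_float * VGL_ARR_CLSTREL_SIZE),
        ("ndim", ctypes.c_int32),
        ("shape", ctypes.c_int32 * VGL_ARR_SHAPE_SIZE),
        ("offset", ctypes.c_int32 * VGL_ARR_SHAPE_SIZE),
        ("size", ctypes.c_int32),
    ]


def host_offsets(struct_type):
    """Byte offset of every member as ctypes lays it out on the host."""
    return {name: getattr(struct_type, name).offset for name, _ in struct_type._fields_}


img_shape = VglClShape(ndim=2, size=640 * 480)
img_shape.shape[:2] = [640, 480]   # remaining entries stay zero
img_shape.offset[:2] = [1, 640]
raw = bytes(img_shape)             # the bag of bytes to upload
```

If `ctypes.sizeof(VglClShape)` and `host_offsets(VglClShape)` disagree with what the probing kernel reports, the byte-wise copy at probed offsets remains the safer option.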

4 Comments

How did you pass the bag of bytes to the kernel? When I try to pass the byte bag, I get the error clSetKernelArg failed: INVALID_ARG_SIZE
Look in here: github.com/cstarknyc/PyPlayground/tree/master/Pass_Struct ...but basically:
    struct_sizes = np.zeros(9, dtype=np.uint32)
    struct_sizes_buffer = cl.Buffer(cl_state.context, COPY_WRITE, hostbuf=struct_sizes)
    buffer_list = [struct_sizes_buffer]
    cl_state.kernel.set_args(*buffer_list)
    cl_state.kernel.set_scalar_arg_dtypes( [None]*len(buffer_list) )
    cl_state.event = cl.enqueue_nd_range_kernel(cl_state.queue, cl_state.kernel, global_size, local_size)
Finally got it working as expected. Thank you very much for your help, Colin!
Can't yet. I don't have the 15 points of reputation. When I get them I'll vote you up.
