I would like a C++ function to return a numeric array (a uint8_t* obtained from a vector) to Python as a bytes object. I don't know the length of this array ahead of time. This is numeric data, not a C string that would naturally have a null terminator. I'm having trouble finding any documentation on how to handle the length of the returned C pointer. Presumably, I can't just cast it to bytes, since the bytes constructor has no way to receive the length. I'd like to avoid a really slow for loop.

C++:

uint8_t* cfunc(uint32_t* length_of_return_value); // returns an arbitrary-length numeric vector

Python:

import ctypes as c
c.cdll.LoadLibrary("file.so")
clib = c.CDLL("file.so")
clib.cfunc.argtypes = [c.POINTER(c.c_uint32)]
clib.cfunc.restype = c.POINTER(c.c_uint8)
length_of_return_value = c.c_uint32(0)
x = clib.cfunc(length_of_return_value)
# now what?
assert type(x) in {bytes, list}, "I need this to be a type that's convertible to numpy array"
  • Is there a reason you don't want to make use of numpy? It uses arrays by default. Also, you can do calculations directly on the whole array. Commented Jul 1 at 18:07
  • Yes, in practice there is some complex custom code going on in C++; I use numpy in lots of places, but it doesn't make sense here. Commented Aug 14 at 16:29

2 Answers

There are a number of methods. Here are a few:

  • Use ctypes.string_at with its size parameter when the returned value isn't null-terminated. It returns a bytes object as a copy of the original memory.

  • Slice the returned pointer to the returned size. This returns a list of Python int representing the byte values in the array as a new Python object.

  • Use numpy.ctypeslib.as_array and specify the shape parameter. This returns a numpy.ndarray object that shares the original data buffer instead of making a copy. Use this method if you are concerned about performance.

Since you mentioned "I need this to be a type that's convertible to numpy array", I recommend the last method above: the result is already a numpy array.
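To see why the size argument matters for the first two methods, here is a minimal sketch that needs no compiled library. It assumes a stand-in buffer (`backing`, a hypothetical name) cast to the same pointer type the C function would return, with a NUL byte in the middle of the data.

```python
import ctypes as ct

# Stand-in for a pointer returned from C (assumption: 5 bytes with an embedded NUL).
backing = (ct.c_uint8 * 5)(1, 2, 0, 3, 4)
ptr = ct.cast(backing, ct.POINTER(ct.c_uint8))
size = 5

truncated = ct.string_at(ptr)       # no size given: stops at the first NUL byte
as_bytes = ct.string_at(ptr, size)  # explicit size: copies all 5 bytes
as_list = ptr[:size]                # slicing the pointer: list of Python ints

print(truncated)  # b'\x01\x02'
print(as_bytes)   # b'\x01\x02\x00\x03\x04'
print(as_list)    # [1, 2, 0, 3, 4]
```

Without the size, `string_at` treats the memory as a C string, which is exactly what numeric data is not.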

If the returned array was allocated by the library, make sure to free it when you are through with it. With the first two methods, you can free the memory immediately after making the copy. With the shared-buffer method, do not free the memory until you are done with the numpy array.
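One way to make the copy-then-free rule hard to get wrong is to wrap both steps in a helper. This is only a sketch: `copy_and_free` and its `free_func` parameter are hypothetical names, and the pointer and the "free" function are simulated so the snippet runs without a compiled library.

```python
import ctypes as ct

def copy_and_free(ptr, size, free_func):
    """Copy `size` bytes from `ptr` into a bytes object, then release the C buffer."""
    try:
        return ct.string_at(ptr, size)  # independent copy of the C memory
    finally:
        free_func(ptr)                  # runs even if the copy raises

# Simulated C side (assumption): a Python-owned buffer and a fake free that records calls.
backing = (ct.c_uint8 * 5)(0, 1, 2, 3, 4)
ptr = ct.cast(backing, ct.POINTER(ct.c_uint8))
freed = []

data = copy_and_free(ptr, 5, lambda p: freed.append(True))
print(data)   # b'\x00\x01\x02\x03\x04'
print(freed)  # [True]
```

With a real library, the call would presumably look like `copy_and_free(retval, size.value, clib.cfree)`, where `cfree` is whatever deallocation function the library exports.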

Working example (Windows):

test.c:

#include <stdint.h>
#include <stdlib.h>

__declspec(dllexport)
uint8_t* cfunc(uint32_t* psize) {
    uint8_t* retval = malloc(5);
    for(uint8_t i = 0; i < 5; ++i)
        retval[i] = i;
    *psize = 5;
    return retval;
}

__declspec(dllexport)
void cfree(uint8_t* p) {
    free(p);
}

test.py:

import ctypes as ct
import numpy as np

clib = ct.CDLL('./test')
clib.cfunc.argtypes = ct.POINTER(ct.c_uint32),
clib.cfunc.restype = ct.POINTER(ct.c_uint8)
clib.cfree.argtypes = ct.POINTER(ct.c_uint8),
clib.cfree.restype = None

size = ct.c_uint32(0)
retval = clib.cfunc(size)

# Using string_at:
result = ct.string_at(retval, size.value)
print(result)  # bytes

# slicing to correct size
print(retval[:size.value])  # list of int

# Shared buffer using numpy
result = np.ctypeslib.as_array(retval, shape=(size.value,))
print(result)  # numpy array
result[0] = 7  # modifying numpy array...
print(result)  # numpy array
print(retval[0])  # ... also modifies original data

clib.cfree(retval)

Output:

b'\x00\x01\x02\x03\x04'
[0, 1, 2, 3, 4]
[0 1 2 3 4]
[7 1 2 3 4]
7

Listing [Python.Docs]: ctypes - A foreign function library for Python.

Since the return value is not a NUL-terminated string (it may contain NULs), one approach that does not rely on other modules, although slightly more involved, is to use an auxiliary ctypes array, which retains the size information.
If the array is allocated in that function, it must also be freed to avoid a memory leak.
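The auxiliary-array idea in isolation can be sketched as follows. The pointer is simulated with a Python-owned buffer (`backing` is a hypothetical stand-in for memory returned by the C function), so the snippet runs without building a library.

```python
import ctypes as ct

# Stand-in for memory returned by a C function (assumption: 6 bytes with embedded NULs).
backing = (ct.c_uint8 * 6)(10, 0, 20, 0, 30, 0)
ptr = ct.cast(backing, ct.POINTER(ct.c_uint8))
size = 6

# Build an array type of the right length and map it onto the pointed-to memory.
ArrayType = ct.c_uint8 * size
arr = ArrayType.from_address(ct.addressof(ptr.contents))  # a view, not a copy

arr[0] = 99          # writes through to the underlying memory
shared = backing[0]  # 99, showing that arr aliases the buffer
copied = bytes(arr)  # independent copy; safe to keep after the C memory is freed
arr[0] = 10          # later writes do not affect the copy

print(len(arr), shared, copied)
```

Because `arr` only aliases the C memory, it must not be used after that memory is freed; take the `bytes` copy first.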

  • dll00.c:

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    
    #if defined(_WIN32)
    #  define DLL00_EXPORT_API __declspec(dllexport)
    #else
    #  define DLL00_EXPORT_API
    #endif
    
    #define MAX_SIZE 64
    
    #if defined(__cplusplus)
    extern "C" {
    #endif
    
    DLL00_EXPORT_API uint8_t* allocateArray(uint32_t *pSize);
    DLL00_EXPORT_API void freeArray(uint8_t *pArr);
    
    #if defined(__cplusplus)
    }
    #endif
    
    
    uint8_t* allocateArray(uint32_t *pSize)
    {
        srand(time(NULL));
        uint32_t size = rand() % MAX_SIZE;
        uint8_t *pArr = calloc(size, 1);  /* zero-filled, so the last byte is defined when size is odd */
        for (uint32_t i = 0; i < size / 2; ++i) {
            pArr[i * 2] = i % 0x100;
            pArr[i * 2 + 1] = 0;
        }
        if (pSize) {
            *pSize = size;
        }
        return pArr;
    }
    
    
    void freeArray(uint8_t *pArr)
    {
        free(pArr);
    }
    
  • code00.py:

    #!/usr/bin/env python
    
    import ctypes as cts
    import sys
    
    
    UI8Ptr = cts.POINTER(cts.c_uint8)
    
    DLL_NAME = "./libdll00.{:s}".format("dll" if sys.platform[:3].lower() == "win" else "so")
    
    
    def main(*argv):
        dll = cts.CDLL(DLL_NAME)
        allocate_array = dll.allocateArray
        allocate_array.argtypes = (cts.POINTER(cts.c_uint32),)
        allocate_array.restype = UI8Ptr
    
        free_array = dll.freeArray
        free_array.argtypes = (UI8Ptr,)
        free_array.restype = None
    
        size = cts.c_uint32(0)
        ptr = allocate_array(cts.byref(size))
        print(f"Pointer ({size.value}): ", end="")
        for i in range(size.value):
            print(f"{ptr[i]}, ", end="")
        print()
        ArrayType = cts.c_uint8 * size.value
        arr = ArrayType.from_address(cts.addressof(ptr.contents))  # No (unnecessary) memory copy
        print("Array: ", end="")
        for c in arr:
            print(f"{c}, ", end="")
        print()
        b = bytes(arr)
        free_array(ptr)
        print(b)
    
    
    if __name__ == "__main__":
        print(
            "Python {:s} {:03d}bit on {:s}\n".format(
                " ".join(elem.strip() for elem in sys.version.split("\n")),
                64 if sys.maxsize > 0x100000000 else 32,
                sys.platform,
            )
        )
        rc = main(*sys.argv[1:])
        print("\nDone.\n")
        sys.exit(rc)
    

Output:

(qaic-env) [cfati@cfati-5510-0:/mnt/e/Work/Dev/StackExchange/StackOverflow/q079686423]> ~/sopr.sh
### Set shorter prompt to better fit when pasted in StackOverflow (or other) pages ###

[064bit prompt]>
[064bit prompt]> ls
code00.py  dll00.c
[064bit prompt]>
[064bit prompt]> gcc -shared -m64 -fPIC -o libdll00.so dll00.c
[064bit prompt]>
[064bit prompt]> ls
code00.py  dll00.c  libdll00.so
[064bit prompt]>
[064bit prompt]> python ./code00.py
Python 3.8.10 (default, Mar 18 2025, 20:04:55) [GCC 9.4.0] 064bit on linux

Pointer (49): 0, 0, 1, 0, 2, 0, 3, 0, 4, 0, 5, 0, 6, 0, 7, 0, 8, 0, 9, 0, 10, 0, 11, 0, 12, 0, 13, 0, 14, 0, 15, 0, 16, 0, 17, 0, 18, 0, 19, 0, 20, 0, 21, 0, 22, 0, 23, 0, 0,
Array: 0, 0, 1, 0, 2, 0, 3, 0, 4, 0, 5, 0, 6, 0, 7, 0, 8, 0, 9, 0, 10, 0, 11, 0, 12, 0, 13, 0, 14, 0, 15, 0, 16, 0, 17, 0, 18, 0, 19, 0, 20, 0, 21, 0, 22, 0, 23, 0, 0,
b'\x00\x00\x01\x00\x02\x00\x03\x00\x04\x00\x05\x00\x06\x00\x07\x00\x08\x00\t\x00\n\x00\x0b\x00\x0c\x00\r\x00\x0e\x00\x0f\x00\x10\x00\x11\x00\x12\x00\x13\x00\x14\x00\x15\x00\x16\x00\x17\x00\x00'

Done.
