I am trying to convert a list of python strings to a 2D character array, and then pass it into a C function.

Python version: 3.6.4, Cython version: 0.28.3, OS Ubuntu 16.04

My first try looks like this:

def my_function(name_list):
    cdef char name_array[50][30]

    for i in range(len(name_list)):
        name_array[i] = name_list[i]

The code builds, but at runtime I get the following error:

Traceback (most recent call last):
  File "test.py", line 532, in test_my_function
    my_function(name_list)
  File "my_module.pyx", line 817, in my_module.my_function
  File "stringsource", line 93, in carray.from_py.__Pyx_carray_from_py_char
IndexError: not enough values found during array assignment, expected 25, got 2

I then tried to make sure that the string on the right-hand side of the assignment is exactly 30 characters by doing the following:

def my_function(name_list):
    cdef char name_array[50][30]

    for i in range(len(name_list)):
        name_array[i] = (name_list[i] + ' '*30)[:30]

This caused another error, as follows:

Traceback (most recent call last):
  File "test.py", line 532, in test_my_function
    my_function(name_list)
  File "my_module.pyx", line 818, in my_module.my_function
  File "stringsource", line 87, in carray.from_py.__Pyx_carray_from_py_char
TypeError: an integer is required

I will appreciate any help. Thanks.

  • it works if you do name_array[i] = bytearray((name_list[i]+'a'*30)[:30]). Somehow with str Cython decides that it needs an integer, not sure why though... Commented Jun 21, 2018 at 13:29
  • Not sure why you need char name_array[50][30], but I would not do it, if I don't absolutely have to. Commented Jun 21, 2018 at 13:32
  • @ead: This is enforced by the third-party C library I am calling. Unfortunately I don't have a say in the matter. Commented Jun 21, 2018 at 13:35
  • I would write the copying routine myself: it would be more efficient and clearer than this strange +30 business. You might also want to take \0-termination into consideration; Cython will not do it for you. Commented Jun 21, 2018 at 13:42
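
A likely explanation for the TypeError, assuming Cython fills a char array by iterating over the right-hand side and converting each item to a C integer: in Python 3, iterating over a str yields one-character str objects, whereas iterating over bytes or bytearray yields int values. The plain-Python sketch below only illustrates that difference; it does not reproduce Cython's internals, and the sample name is made up.

```python
name = "Bob"  # hypothetical sample value

# Pad to exactly 30 characters, as in the second attempt.
padded_str = (name + " " * 30)[:30]
padded_bytes = padded_str.encode("ascii")

# Iterating a str yields str items; iterating bytes yields ints.
str_items = list(padded_str)
byte_items = list(padded_bytes)

print(type(str_items[0]).__name__)   # str
print(type(byte_items[0]).__name__)  # int
print(byte_items[:4])                # [66, 111, 98, 32]
```

This would also be consistent with the bytearray workaround in the first comment, since a bytearray yields ints when iterated.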

2 Answers

I don't like this functionality of Cython; it seems not to be very well thought through:

  • It is convenient to use a char array and thus avoid the hassle of allocating and freeing dynamic memory. However, it is only natural for the buffer to be larger than the strings stored in it. Enforcing equal lengths doesn't make sense.
  • C strings are null-terminated. A \0 at the end is not always needed, but often it is, so additional steps are required to ensure it.

Thus, I would roll my own solution:

%%cython
from libc.string cimport memcpy

cdef int from_str_to_chararray(source, char *dest, size_t N, bint ensure_nullterm) except -1:
    # hold a reference to the underlying bytes object so as_ptr stays valid
    cdef bytes as_bytes = source.encode('ascii')
    cdef const char *as_ptr = <const char *>(as_bytes)
    # count the encoded bytes rather than the characters of the str
    cdef size_t source_len = len(as_bytes)
    if ensure_nullterm:
        source_len += 1    # also copy the trailing \0 of the bytes object
    if source_len > N:
        raise IndexError("destination array too small")
    memcpy(dest, as_ptr, source_len)
    return 0

and then use it as follows:

%%cython
def test(name):
    cdef char name_array[30]
    from_str_to_chararray(name, name_array, 30, 1)
    print("In array: ", name_array)

A quick test yields:

>>> test("A")
In array: A
>>> test("A"*29)
In array: AAAAAAAAAAAAAAAAAAAAAAAAAAAAA
>>> test("A"*30)
IndexError: destination array too small

Some additional remarks on the implementation:

  • it is necessary to hold a reference to the underlying bytes object to keep it alive; otherwise as_ptr becomes dangling as soon as it is created.
  • the internal representation of a bytes object has a trailing \0, so memcpy(dest, as_ptr, source_len) is safe even if source_len=len(source)+1.
  • except -1 in the signature is needed so that the exception is actually propagated to, and checked in, the calling Python code.

Obviously, not everything is perfect: one has to pass the size of the array manually, which will lead to errors in the long run; this is something Cython's version gets right automatically. But given the functionality Cython's version currently lacks, the hand-rolled version is the better option in my opinion.
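
The helper above handles a single row, while the question needs a char[50][30] array. As a rough illustration of the same checks applied row by row, here is a plain-Python sketch using ctypes instead of Cython. The function name fill_name_array and the sample names are hypothetical; the dimensions are taken from the question, and ASCII input is assumed.

```python
import ctypes

ROWS, WIDTH = 50, 30  # dimensions taken from the question


def fill_name_array(name_list, rows=ROWS, width=WIDTH):
    """Copy ASCII names into a zero-initialised char[rows][width] array."""
    if len(name_list) > rows:
        raise IndexError("too many names for destination array")
    array_type = (ctypes.c_char * width) * rows
    name_array = array_type()  # ctypes zero-fills newly created arrays
    for i, name in enumerate(name_list):
        as_bytes = name.encode("ascii")
        # Reserve one byte per row for the trailing \0.
        if len(as_bytes) + 1 > width:
            raise IndexError("destination array too small")
        ctypes.memmove(name_array[i], as_bytes, len(as_bytes))
    return name_array


names = fill_name_array(["alice", "bob"])
print(names[0].value)  # b'alice'
print(names[1].value)  # b'bob'
```

Because the array starts out zero-filled and every row keeps at least one spare byte, each row ends in \0 without copying the terminator explicitly.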

1 Comment

Thanks for the help. I really appreciate it.

Thanks to @ead for responding. It got me to something that works. I am not convinced that it is the best way, but for now it is OK.

I addressed null termination, as @ead suggested, by appending null characters.

I received a TypeError: string argument without an encoding error, and had to encode the string before converting it to a bytearray. That is what the added .encode('ASCII') bit is for.

Here is the working code:

def my_function(name_list):
    cdef char name_array[50][30]

    for i in range(len(name_list)):
        name_array[i] = bytearray((name_list[i] + '\0'*30)[:30].encode('ASCII'))
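
One caveat with the slicing approach: a name of 30 or more characters is silently truncated by [:30] and leaves no room for a terminating \0. A possible variant, sketched here in plain Python, pads the encoded bytes and rejects names that are too long. The helper name to_fixed_width is hypothetical, and it assumes the C library expects null-terminated rows.

```python
def to_fixed_width(name, width=30):
    """Encode name as ASCII and pad it with NUL bytes to exactly width bytes."""
    raw = name.encode("ascii")
    # Require at least one spare byte so the row always ends in a terminator.
    if len(raw) >= width:
        raise ValueError("name too long to leave room for the terminator")
    return raw.ljust(width, b"\0")


row = to_fixed_width("alice")
print(len(row))  # 30
print(row[:6])   # b'alice\x00'
```

Inside the loop, the assignment would then presumably become name_array[i] = to_fixed_width(name_list[i]).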

1 Comment

Sorry, I forgot you use Python 3. There is probably no need for bytearray(...); it should also work with the bytes object directly after encoding.
