Usually, when creating a NumPy array of strings, we can do something like this:
import numpy as np
np.array(["Hello world!", "good bye world!", "whatever world"])
>>> array(['Hello world!', 'good bye world!', 'whatever world'], dtype='<U15')
Now the question is, I am given a long bytearray from a foreign C function like this:
b'Hello world!\x00<some rubbish bytes>good bye world!\x00<some rubbish bytes>whatever world\x00<some rubbish bytes>'
It is guaranteed that every 32-byte block holds one null-terminated string (i.e., a \x00 byte follows the valid part of the string). I need to convert this long bytearray into something like array(['Hello world!', 'good bye world!', 'whatever world'], dtype='<U15'), preferably in place (i.e., without copying memory).
This is what I do now:
str_arr = [None] * str_count
for i in range(str_count):
    # Take the i-th 32-byte record, cut it at the first NUL, and decode it
    str_arr[i] = byte_arr[i * 32: (i+1) * 32].split(b'\x00')[0].decode('utf-8')
str_arr_np = np.array(str_arr)
It works, but it is awkward and not done in place (the bytes are copied at least once, if not twice). Is there a better approach?
Note: the rubbish bytes are there because the buffer is not memset()'ed to zero before use.
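For reference, here is a minimal sketch of one possible direction, not a definitive answer. It assumes the buffer is a writable bytearray whose length is a multiple of 32 and that every record really contains a \x00. The helper name decode_fixed_records and the '#' filler in the demo data are made up for illustration. The idea is to view the memory as an (n, 32) uint8 array without copying, zero out everything after the first NUL in each record, and then reinterpret the same memory as fixed-width byte strings (dtype 'S32'), which NumPy reads with trailing NULs stripped. A true '<U15' result cannot share memory with the original buffer, because NumPy stores unicode strings as 4 bytes per character, so that last conversion always copies.

```python
import numpy as np

RECORD_SIZE = 32  # fixed slot width stated in the question


def decode_fixed_records(buf, record_size=RECORD_SIZE):
    """Hypothetical helper: return an 'S<record_size>' view of buf, cleaned in place."""
    # Zero-copy view of the buffer; writable because buf is a bytearray.
    raw = np.frombuffer(buf, dtype=np.uint8).reshape(-1, record_size)
    # Position of the first NUL in each record (assumed to exist in every record).
    first_nul = (raw == 0).argmax(axis=1)
    # Overwrite the rubbish after the terminator with zeros, modifying buf itself.
    rubbish = np.arange(record_size) > first_nul[:, None]
    raw[rubbish] = 0
    # Reinterpret the same memory as one fixed-width byte string per record.
    return raw.view(f"S{record_size}").ravel()


def make_record(s, size=RECORD_SIZE):
    """Build demo data: string + NUL, then '#' filler standing in for rubbish bytes."""
    return (s + b"\x00").ljust(size, b"#")


byte_arr = bytearray(b"".join(
    make_record(s) for s in (b"Hello world!", b"good bye world!", b"whatever world")
))

bytes_view = decode_fixed_records(byte_arr)   # shares memory with byte_arr
str_arr_np = np.char.decode(bytes_view, "utf-8")  # this step copies
print(bytes_view)
print(str_arr_np)
```

If byte strings are acceptable downstream, bytes_view alone avoids any copy of the string data; only the temporary boolean masks are allocated. The decode step is needed only when a unicode array is required.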