I am experimenting with multiprocessing in Python; however, I am having trouble creating some shared memory. Based on the following answer (slightly different, as it uses a matrix of floats, but the same principle), I want to put a numpy matrix of strings into a shared memory space for processes to use. I have the following:
from ctypes import c_wchar_p
import numpy as np
from multiprocessing.sharedctypes import Array
input_array = np.array([['Red', 'Green', 'Blue', 'Yellow'],
                        ['Purple', 'Orange', 'Cyan', 'Pink']]).T
shared_memory = Array(c_wchar_p, input_array.size, lock=False) # Equivalent to just using a RawArray
np_wrapper = np.frombuffer(shared_memory, dtype='<U1').reshape(input_array.shape)
np.copyto(np_wrapper, input_array)
print(np_wrapper)
However, `np_wrapper` only contains the first character of each string:
[['R' 'P']
['G' 'O']
['B' 'C']
['Y' 'P']]
Things I have tried to rectify the problem:
- I tried changing the `dtype` in the `frombuffer` call from `<U1` to `<U6`, which is the `dtype` of `input_array`. However, it throws the following exception:
ValueError: buffer size must be a multiple of element size
- I tried using a `dtype` of `int64` with the `frombuffer` call, because my `shared_memory` array is of type `c_wchar_p` (i.e. string pointers) and I am on a 64-bit Windows 10 system. However, it throws the following exception:
ValueError: cannot reshape array of size 4 into shape (4,2)
I am extremely confused about why my typing is wrong here. Does anyone have any insight into how to fix this problem?
In `input_array`, the strings are represented as `<U6` (24-byte) items, all packed into the array's data buffer. They don't reference strings elsewhere in memory, as they would in a list (or an object-dtype array). Check `input_array.itemsize`.
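Given that, one way to make this work (a minimal sketch, assuming the goal is a shared, writable copy of the matrix) is to allocate the shared block as raw bytes sized by `input_array.nbytes`, then view those bytes with the array's own fixed-width string dtype, rather than allocating `c_wchar_p` pointers:

```python
import ctypes

import numpy as np
from multiprocessing.sharedctypes import RawArray

input_array = np.array([['Red', 'Green', 'Blue', 'Yellow'],
                        ['Purple', 'Orange', 'Cyan', 'Pink']]).T

# Each element is a fixed-width '<U6' item: 6 characters * 4 bytes each.
assert input_array.itemsize == 24

# Allocate the shared memory as raw bytes matching the array's total size,
# instead of as c_wchar_p string pointers (which would be process-local).
shared_memory = RawArray(ctypes.c_char, input_array.nbytes)

# View the shared bytes with the array's own dtype and shape, then copy in.
np_wrapper = np.frombuffer(shared_memory,
                           dtype=input_array.dtype).reshape(input_array.shape)
np.copyto(np_wrapper, input_array)

print(np_wrapper[0, 0])  # 'Red'
```

Because `np_wrapper` is backed by the `RawArray` buffer, writes through it land in the shared block; child processes can build the same view over the inherited buffer.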