For this question, I refer to the example in the Python docs demonstrating the "use of the SharedMemory class with NumPy arrays, accessing the same numpy.ndarray from two distinct Python shells".
A major change I'd like to make is to manipulate an array of class objects rather than integer values, as I demonstrate below.
import numpy as np
from multiprocessing import shared_memory
# a simplistic class example
class A:
    def __init__(self, x):
        self.x = x
# numpy array of class objects
a = np.array([A(1), A(2), A(3)])
# create a shared memory instance
shm = shared_memory.SharedMemory(create=True, size=a.nbytes, name='psm_test0')
# numpy array backed by shared memory
b = np.ndarray(a.shape, dtype=a.dtype, buffer=shm.buf)
# copy the original data into shared memory
b[:] = a[:]
print(b)
# array([<__main__.A object at 0x7fac56cd1190>,
#        <__main__.A object at 0x7fac56cd1970>,
#        <__main__.A object at 0x7fac56cd19a0>], dtype=object)
Now, in a different shell, we attach to the shared memory space and try to manipulate the contents of the array.
import numpy as np
from multiprocessing import shared_memory
# attach to the existing shared space
existing_shm = shared_memory.SharedMemory(name='psm_test0')
c = np.ndarray((3,), dtype=object, buffer=existing_shm.buf)
Even before we are able to manipulate c, merely printing it results in a segmentation fault. Admittedly, I cannot expect behaviour that was never written into the module, so my question is: what can I do to work with a shared array of objects?
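My understanding of why this fails (a sketch of my own, not from the docs): an object array doesn't store the instances themselves, only per-process pointers to them, so the bytes copied into shared memory are meaningless addresses in the second shell:

```python
import numpy as np

class A:
    def __init__(self, x):
        self.x = x

a = np.array([A(1), A(2), A(3)])
# each slot holds a pointer into this process's heap,
# not the data of the A instance itself
print(a.dtype)     # object
print(a.itemsize)  # pointer width: 8 on 64-bit builds, 4 on 32-bit
```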
I'm currently pickling the list, but the protected reads/writes add a fair bit of overhead. I've also tried using a Namespace, which was quite slow because indexed writes are not allowed. Another idea could be to use a shared ctypes Structure in a ShareableList, but I wouldn't know where to start with that.
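For reference, the pickling approach I'm using looks roughly like this (the segment name and the way the payload length reaches the reader are illustrative); the overhead comes from re-serializing the entire list on every read/write:

```python
import pickle
from multiprocessing import shared_memory

class A:
    def __init__(self, x):
        self.x = x

objs = [A(1), A(2), A(3)]
payload = pickle.dumps(objs)

# a real setup also needs to communicate len(payload) to the readers
shm = shared_memory.SharedMemory(create=True, size=len(payload),
                                 name='psm_pickle_demo')
shm.buf[:len(payload)] = payload

# in the other process: attach and deserialize the whole list on every read
existing = shared_memory.SharedMemory(name='psm_pickle_demo')
restored = pickle.loads(bytes(existing.buf[:len(payload)]))
print(restored[0].x, restored[2].x)  # 1 3

existing.close()
shm.close()
shm.unlink()
```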
In addition, there is a design aspect: it appears that there is an open bug in shared_memory that may affect my implementation, in which several processes work on different elements of the array.
Is there a more scalable way of sharing a large list of objects between several processes so that at any given time all running processes interact with a unique object/element in the list?
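To make the per-element requirement concrete, here is the access pattern I have in mind, sketched with a ShareableList of pickled blobs (my own experiment, not an endorsed solution; note that ShareableList fixes each slot's byte capacity at creation, so a re-pickled value must not grow larger than the original):

```python
import pickle
from multiprocessing import shared_memory

class A:
    def __init__(self, x):
        self.x = x

# one pickled blob per slot
sl = shared_memory.ShareableList([pickle.dumps(A(i)) for i in range(3)])

# another process could attach via
# shared_memory.ShareableList(name=sl.shm.name)
# and update a single element without touching the others
obj = pickle.loads(sl[1])
obj.x += 10
sl[1] = pickle.dumps(obj)  # must not exceed the slot's original size

updated = pickle.loads(sl[1]).x
print(updated)  # 11

sl.shm.close()
sl.shm.unlink()
```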
UPDATE: At this point, I will also accept partial answers that talk about whether this can be achieved with Python at all.
P.S. I'm aware that one should close() and unlink() the shared objects at the right time, but that was left out because it wasn't completely relevant.