
I would like to create, in Python, a process that runs constantly in parallel while the main execution of my code is running. It should provide a way to work around Python's sequential execution, which prevents me from doing asynchronous execution.

So I would like a function RunningFunc to run while my main code is doing some other operations.

I tried to use the threading module. However, the computation does not actually run in parallel: RunningFunc is a highly intensive computation and heavily slows down my main code.

I also tried the multiprocessing module, and I guess this should be my answer: using a multiprocessing.Manager(), doing some computation in a first process while accessing, via shared memory, the data computed over time. But I didn't figure out a way to do that.

For example, RunningFunc increments the Compteur variable:

def RunningFunc(x):
    boolean = True
    Compteur = 0
    while boolean:
        Compteur += 1

Meanwhile, in my main code, some computations are running, and I sometimes read (not necessarily on each while other_bool iteration) the Compteur variable from RunningFunc:

other_bool = True
Value = 0
while other_bool:
    ## MAKING SOME COMPUTATION
    Value = Compteur  # read the Compteur variable that is constantly being incremented
    ## MAKING SOME COMPUTATION
    It really depends on exactly what you want to share and how often and with what latency. The answer is probably different for a Numpy array, different for an atomic integer and different again for a list or dictionary. Commented Apr 11, 2022 at 22:36
  • I would like to share an array; it will be of size [4x128x128] exactly. Also, I didn't say that I'm working with Python 3.7, which prevents me from using the shareable object multiprocessing.shared_memory. Commented Apr 12, 2022 at 12:45

1 Answer


There are many ways to do processing in child processes. Which is best depends on questions such as the size of the data to be shared versus the time spent in the calculation. Following is an example much like your simple increment of a variable, but fleshed out to a slightly larger list of integers to highlight some of the issues you'll bump into.

A multiprocessing.Manager is a convenient way to share data among processes, but it's not particularly fast because it needs to synchronize data among its processes. If the data you want to share is fairly modest and doesn't change that often, it's a good choice. But I will just focus on shared memory here.
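
For reference, a minimal sketch of what the Manager approach could look like before moving on to shared memory (the running_func name and the "compteur" key just mirror the question; this is an illustration, not the shared-memory solution below):

import multiprocessing as mp
import time

def running_func(shared):
    # hypothetical worker mirroring the question's RunningFunc: keep incrementing a counter
    while True:
        shared["compteur"] += 1  # every access goes through the manager's server process

def main():
    with mp.Manager() as manager:
        shared = manager.dict({"compteur": 0})
        child = mp.Process(target=running_func, args=(shared,))
        child.start()
        try:
            for _ in range(5):
                time.sleep(1)
                print(shared["compteur"])  # read the latest value from the main process
        finally:
            child.terminate()
            child.join()

if __name__ == "__main__":
    main()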

Most Python objects cannot be created in shared memory. Things like the object header, reference count, or the memory heap are not shareable. Some objects, notably numpy arrays, can be shared, but that is a different answer.

What you can do is serialize and write/read to shared memory. This could be done with any serialization mechanism, but converting to fundamental types via struct is a good way to do it.
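
As a quick illustration of the struct part, using an ordinary bytearray as a stand-in for the shared-memory buffer:

import struct

fmt = struct.Struct("3Q")          # three unsigned 64-bit integers
buf = bytearray(fmt.size)          # stand-in for a shared-memory buffer

fmt.pack_into(buf, 0, 10, 20, 30)  # serialize three ints at offset 0
print(fmt.unpack_from(buf, 0))     # (10, 20, 30)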

That means that you have to write your code to save its data periodically. You also need to worry about synchronization if you are saving anything bigger than a single CPU-level word to memory. The parent could read while the child is writing, giving you inconsistent data.

The following example shows one way to handle shared memory:

import multiprocessing as mp
import multiprocessing.shared_memory
import time
import struct

data_format = struct.Struct("3Q")  # will share three unsigned 64-bit ints

def main():
    # lock keeps shared memory readers from getting intermediate data
    shared_lock = mp.Lock()
    shared = mp.shared_memory.SharedMemory(create=True, size=8*3)
    buf = shared.buf
    try:
        print(shared)
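        # pass the block's name so the child can attach to the same shared memory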
        child = mp.Process(target=running_func, args=(shared.name, shared_lock))
        child.start()
        try:
            print("read for 20 seconds")
            for i in range(20):
                with shared_lock:
                    my_list = data_format.unpack_from(buf, 0)
                print(my_list)
                time.sleep(1)
        finally:
            child.terminate()
            child.join()
    finally:
        shared.close()
        shared.unlink()


def running_func(shared_memory_name, lock):
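    # attach, by name, to the shared memory block created by the parent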
    shared = mp.shared_memory.SharedMemory(name=shared_memory_name)
    buf = shared.buf
    try:
        my_list = [1,2,3]
        while True:
            my_list = [val+1 for val in my_list]
            with lock:
                data_format.pack_into(buf, 0, *my_list)
    finally:
        shared.close()

if __name__ == "__main__":
    main()

2 Comments

Thanks for the answer. It's pretty close to what I'm trying to implement. However, I forgot to mention that I work on Python 3.7 for compatibility reasons between my packages, so I can't use multiprocessing.shared_memory. Should I replace multiprocessing.shared_memory with another shared object, 'multiprocessing.Array' for example? Thx
You can use the mmap module. If you pass -1 as a file descriptor, you'll get an "anonymous file", which is just a memory map without a backing file on the file system. You could also use a real file and still have the performance of a memory map if you want to save the state when you are done. Just use the mmap.MAP_SHARED option when opening.
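
Roughly, an mmap-based version of the same idea could look like this sketch (assuming the fork start method, the default on Linux, so the child simply inherits the anonymous mapping; on Windows you would need a tagname or a real backing file instead):

import mmap
import multiprocessing as mp
import struct
import time

data_format = struct.Struct("3Q")  # three unsigned 64-bit ints, as in the answer

def running_func(buf, lock):
    my_list = [1, 2, 3]
    while True:
        my_list = [val + 1 for val in my_list]
        with lock:
            data_format.pack_into(buf, 0, *my_list)

def main():
    # -1 means an anonymous mapping: shared memory with no backing file
    buf = mmap.mmap(-1, data_format.size)
    lock = mp.Lock()
    # under fork the child inherits the mapping and the lock directly
    child = mp.Process(target=running_func, args=(buf, lock))
    child.start()
    try:
        for _ in range(5):
            with lock:
                print(data_format.unpack_from(buf, 0))
            time.sleep(1)
    finally:
        child.terminate()
        child.join()

if __name__ == "__main__":
    main()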
