
I would like to create, in Python, a process that runs constantly in parallel while the main execution of my code is running. It should provide a way to work around Python's sequential execution, which prevents me from doing asynchronous execution.

So I would like a function RunningFunc to run while my main code is doing some other operations.

I tried to use the threading module. However, the computation does not actually run in parallel: RunningFunc is a highly intensive computation and heavily slows down my main code.

I also tried the multiprocessing module, and I guess this should be my answer: using a multiprocessing.Manager(), doing some computation in a first process while accessing, via shared memory, the data computed over time. But I didn't figure out a way to do that.

For example, RunningFunc increments the Compteur variable:

def RunningFunc(x):
    boolean = True
    Compteur = 0
    while boolean:
        Compteur += 1

Meanwhile, in my main code, some computations are running, and I sometimes read (not necessarily on each while other_bool iteration) the Compteur variable from RunningFunc:

other_bool = True
Value = 0
while other_bool:
    ## MAKING SOME COMPUTATION
    Value = Compteur  # read the Compteur variable that is constantly being incremented
    ## MAKING SOME COMPUTATION
    It really depends on exactly what you want to share and how often and with what latency. The answer is probably different for a Numpy array, different for an atomic integer and different again for a list or dictionary. Commented Apr 11, 2022 at 22:36
  • I would like to share an array; it will be of size [4x128x128] exactly. Also, I didn't say that I'm working with Python 3.7, which prevents me from using the shareable object multiprocessing.shared_memory. Commented Apr 12, 2022 at 12:45

1 Answer


There are many ways to do processing in child processes. Which is best depends on questions such as the size of the data to be shared versus the time spent in the calculation. Following is an example much like your simple increment of a variable, but fleshed out to a slightly larger list of integers to highlight some of the issues you'll bump into.

A multiprocessing.Manager is a convenient way to share data among processes, but it's not particularly fast because it needs to synchronize data among its processes. If the data you want to share is fairly modest and doesn't change that often, it's a good choice. But I will just focus on shared memory here.
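
For reference, a minimal sketch of what the Manager approach could look like before moving on to shared memory (the running_func name and the "compteur" key just mirror the question; this is an illustration, not the shared-memory solution below):

import multiprocessing as mp
import time

def running_func(shared):
    # hypothetical worker mirroring the question's RunningFunc: keep incrementing a counter
    while True:
        shared["compteur"] += 1  # every access goes through the manager's server process

def main():
    with mp.Manager() as manager:
        shared = manager.dict({"compteur": 0})
        child = mp.Process(target=running_func, args=(shared,))
        child.start()
        try:
            for _ in range(5):
                time.sleep(1)
                print(shared["compteur"])  # read the latest value from the main process
        finally:
            child.terminate()
            child.join()

if __name__ == "__main__":
    main()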

Most Python objects cannot be created in shared memory. Things like the object header, reference count, or the memory heap are not shareable. Some objects, notably numpy arrays, can be shared, but that is a different answer.

What you can do is serialize and write/read to shared memory. This could be done with any serialization mechanism, but converting to fundamental types via struct is a good way to do it.
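
As a quick illustration of the struct part, using an ordinary bytearray as a stand-in for the shared-memory buffer:

import struct

fmt = struct.Struct("3Q")          # three unsigned 64-bit integers
buf = bytearray(fmt.size)          # stand-in for a shared-memory buffer

fmt.pack_into(buf, 0, 10, 20, 30)  # serialize three ints at offset 0
print(fmt.unpack_from(buf, 0))     # (10, 20, 30)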

That means that you have to write your code to save its data periodically. You also need to worry about synchronization if you are saving anything bigger than a single CPU-level word to memory. The parent could read while the child is writing, giving you inconsistent data.

The following example shows one way to handle shared memory:

import multiprocessing as mp
import multiprocessing.shared_memory
import time
import struct

data_format = struct.Struct("3Q")  # will share three unsigned 64-bit ints

def main():
    # lock keeps shared memory readers from getting intermediate data
    shared_lock = mp.Lock()
    shared = mp.shared_memory.SharedMemory(create=True, size=8*3)
    buf = shared.buf
    try:
        print(shared)
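        # pass the block's name so the child can attach to the same shared memory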
        child = mp.Process(target=running_func, args=(shared.name, shared_lock))
        child.start()
        try:
            print("read for 20 seconds")
            for i in range(20):
                with shared_lock:
                    my_list = data_format.unpack_from(buf, 0)
                print(my_list)
                time.sleep(1)
        finally:
            child.terminate()
            child.join()
    finally:
        shared.close()
        shared.unlink()


def running_func(shared_memory_name, lock):
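    # attach, by name, to the shared memory block created by the parent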
    shared = mp.shared_memory.SharedMemory(name=shared_memory_name)
    buf = shared.buf
    try:
        my_list = [1,2,3]
        while True:
            my_list = [val+1 for val in my_list]
            with lock:
                data_format.pack_into(buf, 0, *my_list)
    finally:
        shared.close()

if __name__ == "__main__":
    main()

2 Comments

Thanks for the answer. It's pretty close to what I'm trying to implement. However, I forgot to mention that I work on Python 3.7 for compatibility reasons between my packages, so I can't use multiprocessing.shared_memory. Should I replace multiprocessing.shared_memory with another shared object, 'multiprocessing.Array' for example? Thx
You can use the mmap module. If you pass -1 as a file descriptor, you'll get an "anonymous file", which is just a memory map without a backing file on the file system. You could also use a real file and still have the performance of a memory map if you want to save the state when you are done. Just use the mmap.MAP_SHARED option when opening.
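
Roughly, an mmap-based version of the same idea could look like this sketch (assuming the fork start method, the default on Linux, so the child simply inherits the anonymous mapping; on Windows you would need a tagname or a real backing file instead):

import mmap
import multiprocessing as mp
import struct
import time

data_format = struct.Struct("3Q")  # three unsigned 64-bit ints, as in the answer

def running_func(buf, lock):
    my_list = [1, 2, 3]
    while True:
        my_list = [val + 1 for val in my_list]
        with lock:
            data_format.pack_into(buf, 0, *my_list)

def main():
    # -1 means an anonymous mapping: shared memory with no backing file
    buf = mmap.mmap(-1, data_format.size)
    lock = mp.Lock()
    # under fork the child inherits the mapping and the lock directly
    child = mp.Process(target=running_func, args=(buf, lock))
    child.start()
    try:
        for _ in range(5):
            with lock:
                print(data_format.unpack_from(buf, 0))
            time.sleep(1)
    finally:
        child.terminate()
        child.join()

if __name__ == "__main__":
    main()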
