0

I meet a practical problem in my current work. I need to share a complex python object such as following structure between multiprocesses:

import numpy as np
from typing import List, Tuple

class ComplexClass:
    def __init__(
            self,
            u: np.ndarray,
            v: Tuple[Tuple[int, float]],
            w: str,
            x: int,
            y: List[Tuple[int, np.ndarray]],  # all numpy array here with same shape
            z: List[List[int]],  # inner list with same length
    ):
        self.u = u
        self.v = v
        self.w = w
        self.x = x
        self.y = y
        self.z = z

My server's RAM is 128G, one object above is ~50G. We need to use >=4 processes to handle above object to meet timeout requirement. I found this object will have 4 copies in 4 processes which are out of memory. Even I apply to run it in server with larger RAM, the overhead of copy objects is huge. In my current usecase, this object is read-only in 4 processes, therefore there is no data races and I don't need any lock.

The whole structure of problem is somehow like following:

from multiprocessing import Pool
import numpy as np
from typing import List, Tuple


class ComplexClass:
    ... # defined before


def handle(instance: ComplexClass, other_params):
    # main logic, we only read the value in instance,
    # and will not modify the instance
    ...


def main():
    instance = ...  # get the instance of ComplexClass
    params_list = [(instance, 0), (instance, 1), (instance, 2), (instance, 3)]
    with Pool(4) as pool:
        res = pool.starmap(handle, params_list)


if __name__ == '__main__':
    main()

My questions:

  1. Is there any way I can share one copy of above object in all different processes? I've read multiple posts here. Many posts are outdated and I only found how to share single value (use multiprocessing.Value), single list multiprocessing.Array or single numpy.array. How to share above generic complex object?

PS: I cannot use multithreading in my example, since the codebase in my current company is CPython as interpreter and it has GIL. I cannot change to other interpreter.

2
  • If you really only found Value and Array these may be helpful: stackoverflow.com/questions/47837206/… stackoverflow.com/questions/10721915/… . multiprocessing also has shared_memory in python 3.8+ but I'm not really familiar with how it works. Commented Dec 6, 2021 at 3:23
  • @MateusReis I'm not advanced user. I'm new to python multiprocessing. From the doc, I can only use in simple case. I don't know how to generalize it to this complex case. Thank you. Commented Dec 6, 2021 at 4:05

0

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.