0

I'm trying to implement a function in my class that calculates information from array A and outputs the result in array B. Array A and array B are both variables of a class, as is the function. Something along these lines:

class MyClass:
    def __init__(self, A):
        self.A = A
        self.B = np.zeros((A.shape[0], A.shape[1])

    def my_function(self, i):
        self.B += self.A[i]

    def main(self):
        for i in range(A.shape[2]):
            my_function(i)

example = np.random.rand(256, 256, 1000)
my_class = MyClass(example)
my_result = my_class.B

Obviously this function is oversimplified but the question revolves about how to use multiprocess with variables self.A and self.B. I've tried something like this but it didn't work at all:

class MyClass:
    def __init__(self, A):
        self.A = A
        self.B = np.zeros((A.shape[0], A.shape[1])

    def my_function(self, i):
        return self.A[i]

    def main(self):
        with multiprocessing.Pool() as p:
            position = range(self.A.shape[2])
            for i, result in enumerate(p.map(my_function, position)) 
                self.B += result

1 Answer 1

1

You can get your example code to work doing something like...

class MyClass:
    def __init__(self, A):
        self.A = A
        self.B = np.zeros((A.shape[0], A.shape[1]))

    def my_function(self, i):
        return self.A[:,:,i]

    def run(self):
        with Pool() as p:
            position = range(self.A.shape[2])
            for result in p.imap(self.my_function, position, chunksize=self.A.shape[2]):     
                self.B += result

example = np.random.rand(256, 256, 1000)
my_class = MyClass(example)
st = time.time()
my_class.run()
print(time.time() - st)

The problem with multiprocessing is that it has to fork new processes and then serialize (via pickle) the data going into and out of them. For simple code like this, the overhead is much more than the actual function you're completing.

Setting chunksize to the size of your iterable is just a way to assure that python doesn't fork process pools more than once and thus reduce the overhead. For this example the multiprocessed code is still slower than doing it single process, however if you have a more complex function, the MP version could be faster.

As a general rule, I try to never put the multiprocessed function/data inside of the class. This leads to a lot of extra overhead in the fork/pickle/unpickle process. You can move your function outside with something like...

# Simple gobal data / function
data = None
def my_function(i):
    global data
    return data[:,:,i]

class MyClass:
    def __init__(self, A):
        global data
        data   = A   # move class data to global
        self.A = A
        self.B = np.zeros((A.shape[0], A.shape[1]))

    def run(self):
        with Pool() as p:
            position = range(self.A.shape[2])
            for result in p.imap(my_function, position, chunksize=self.A.shape[2]):
                self.B += result


example = np.random.rand(256, 256, 1000)
my_class = MyClass(example)
st = time.time()
my_class.run()
print(time.time() - st)

For a simple function like this multiprocessing will still be slower, but if your actual function has a lot of complexity this can speed things up.

Sign up to request clarification or add additional context in comments.

2 Comments

Thank you, this was really helpful! My function is quite complex so I hope it will help it run faster. Quick questions: 1) Will this call the same memory addresses for each thread or is it using multiple copies? I'm a bit memory-constrained. 2) Is the chunksize argument generally important? I'm quite new to multiprocessing so I'm not familiar with the performance impact different chuncksizes will have
Pool "forks", which creates a virtual memory copy for each pool worker. The virtual memory only consumes real memory page(s) when/if you write to them, so memory use depends on the code. If you have trouble with memory usage continually growing, try pool "maxtasksperchild". There's info on SO about using this param. Most of the time you can omit "chunksize" but it can be used to help optimize your code sometimes.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.