I have a problem similar to this:

import numpy as np

C = np.zeros((100,10))

for i in range(10):
    C_sub = get_sub_matrix_C(i, other_args) # shape 10x10
    C[i*10:(i+1)*10,:10] = C_sub

There is apparently no need to run this serially, since each submatrix can be calculated independently. I would like to use the multiprocessing module and create up to 4 processes for the for loop. I read some tutorials about multiprocessing, but could not figure out how to apply it to my problem.

Thanks for your help

  • In order for multiprocessing to yield a performance improvement, the computations must take significant time, because multiprocessing is going to serialize the data, send it to the subprocesses, deserialize it, perform the computations, serialize the result, send it back to the main process and finally deserialize it. Serialization and deserialization take quite some time, and inter-process communication isn't that fast either. If get_sub_matrix_C is literally just a few matrix accesses, you aren't going to obtain any speedup. Commented Mar 8, 2016 at 12:48
  • This is just for illustration purposes. In the end my matrix will have dimensions of about 100000 x 20000, but more importantly, get_sub_matrix_C is rather slow and I don't think I can make that function any faster. Commented Mar 8, 2016 at 12:52
  • Does get_sub_matrix_C need to access the whole matrix or just the submatrix? If it needs all of it, serializing one copy of the big matrix for each subprocess will be very time and memory consuming. Commented Mar 8, 2016 at 12:54
  • Actually, get_sub_matrix_C doesn't depend on any entries of C. It just gives the submatrix that I want to write in C, where i determines the "position". Commented Mar 8, 2016 at 12:57
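To put rough numbers on the overhead discussed in these comments, here is a minimal sketch. It assumes float64 entries (the NumPy default) and uses pickle, which is what multiprocessing relies on to move objects between processes. The variable names are only illustrative.

```python
import pickle
import numpy as np

# One 10x10 float64 submatrix, as in the toy example from the question.
block = np.zeros((10, 10))

# Round trip through pickle, roughly what happens to each result
# that a worker process sends back to the parent.
payload = pickle.dumps(block, protocol=pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(payload)

print(f"raw data: {block.nbytes} bytes, pickled: {len(payload)} bytes")

# Memory needed for the full matrix mentioned above, assuming float64.
rows, cols = 100_000, 20_000
full_bytes = rows * cols * np.dtype(np.float64).itemsize
print(f"full matrix: {full_bytes / 1e9:.0f} GB")
```

The pickled payload is slightly larger than the raw data, so the transfer cost per block grows with the block size. Whether that matters depends on how long get_sub_matrix_C takes compared to copying its result.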

2 Answers

A simple way to parallelize that code would be to use a Pool of processes:

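import multiprocessing

# Pool() starts one worker per CPU core; pass processes=4 to cap it at 4.
pool = multiprocessing.Pool()
results = pool.starmap(get_sub_matrix_C, ((i, other_args) for i in range(10)))

# starmap returns the results in input order, so i matches the block index.
for i, res in enumerate(results):
    C[i*10:(i+1)*10,:10] = res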

I've used starmap since the get_sub_matrix_C function has more than one argument (starmap(f, [(x1, ..., xN)]) calls f(x1, ..., xN)).
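For reference, a self-contained sketch of the same idea, capped at 4 processes as the question asks. The get_sub_matrix_C below is only a hypothetical stand-in (the real function is not shown in the question), and build_C is a name chosen here for illustration. The `if __name__ == "__main__":` guard is needed on platforms where worker processes are spawned rather than forked.

```python
import multiprocessing
import numpy as np

def get_sub_matrix_C(i, other_args):
    # Hypothetical stand-in: the real function is slow and application-specific.
    return np.full((10, 10), i + other_args, dtype=float)

def build_C(n_blocks=10, other_args=0, processes=4):
    C = np.zeros((n_blocks * 10, 10))
    # The context manager shuts the pool down once the results are collected.
    with multiprocessing.Pool(processes=processes) as pool:
        results = pool.starmap(get_sub_matrix_C,
                               [(i, other_args) for i in range(n_blocks)])
    # starmap preserves input order, so block i goes to rows i*10 .. (i+1)*10.
    for i, res in enumerate(results):
        C[i*10:(i+1)*10, :10] = res
    return C

if __name__ == "__main__":
    C = build_C()
    print(C.shape)
```

The worker function has to live at module level so that it can be pickled and looked up by the worker processes.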

Note however that serialization/deserialization may take significant time and space, so you may have to use a more low-level solution to avoid that overhead.
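One possible lower-level approach, sketched here under assumptions and not necessarily what was meant above, is to let the workers write directly into a shared buffer using multiprocessing.shared_memory (Python 3.8 or newer). Only the block index and the name of the shared segment are sent to the workers, so the submatrices are never pickled. get_sub_matrix_C is again a hypothetical stand-in, and fill_block and build_shared are illustrative names.

```python
import numpy as np
from multiprocessing import Pool, shared_memory

N_BLOCKS, BLOCK, COLS = 10, 10, 10
SHAPE = (N_BLOCKS * BLOCK, COLS)

def get_sub_matrix_C(i, other_args):
    # Hypothetical stand-in for the real, slow function.
    return np.full((BLOCK, COLS), float(i + other_args))

def fill_block(task):
    shm_name, i, other_args = task
    shm = shared_memory.SharedMemory(name=shm_name)  # attach, do not create
    C = np.ndarray(SHAPE, dtype=np.float64, buffer=shm.buf)
    C[i*BLOCK:(i+1)*BLOCK, :COLS] = get_sub_matrix_C(i, other_args)
    del C        # release the array view before closing the mapping
    shm.close()
    return i

def build_shared(other_args=0, processes=4):
    nbytes = int(np.prod(SHAPE)) * np.dtype(np.float64).itemsize
    shm = shared_memory.SharedMemory(create=True, size=nbytes)
    try:
        tasks = [(shm.name, i, other_args) for i in range(N_BLOCKS)]
        with Pool(processes=processes) as pool:
            pool.map(fill_block, tasks)
        # Copy out of shared memory so the result outlives the segment.
        result = np.ndarray(SHAPE, dtype=np.float64, buffer=shm.buf).copy()
    finally:
        shm.close()
        shm.unlink()
    return result

if __name__ == "__main__":
    print(build_shared().shape)
```

Each worker writes to a disjoint range of rows, so no locking is needed in this sketch. The final copy could be skipped by working on the shared array directly, at the price of managing the segment's lifetime by hand.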


It looks like you are running an older version of Python (starmap was added in Python 3.3). You can replace starmap with plain map, but then you have to provide a function that takes a single parameter:

def f(args):
    return get_sub_matrix_C(*args)

pool = multiprocessing.Pool()
results = pool.map(f, ((i, other_args) for i in range(10)))

for i, res in enumerate(results):
    C[i*10:(i+1)*10,:10] = res
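As a variation on the wrapper function, and assuming other_args is the same for every call, functools.partial can fix that argument so that plain map only has to supply i. This is a sketch with a hypothetical stand-in for get_sub_matrix_C; build_C_with_map is an illustrative name.

```python
import multiprocessing
from functools import partial
import numpy as np

def get_sub_matrix_C(i, other_args):
    # Hypothetical stand-in for the real, slow function.
    return np.full((10, 10), i + other_args, dtype=float)

def build_C_with_map(n_blocks=10, other_args=0, processes=4):
    C = np.zeros((n_blocks * 10, 10))
    # partial fixes other_args, leaving a one-argument callable for map.
    worker = partial(get_sub_matrix_C, other_args=other_args)
    pool = multiprocessing.Pool(processes=processes)
    try:
        results = pool.map(worker, range(n_blocks))  # results keep input order
    finally:
        pool.close()
        pool.join()
    for i, res in enumerate(results):
        C[i*10:(i+1)*10, :10] = res
    return C

if __name__ == "__main__":
    print(build_C_with_map().shape)
```

The explicit close and join calls are used instead of a with block because Pool only became a context manager in Python 3.3.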

3 Comments

Thanks for your answer. Unfortunately I can't test it, since I don't have starmap. Probably I'm using an outdated version of multiprocessing? Version: 0.70a1
@RoSt You can use map and modify the function to accept a single parameter. I've edited the answer to add this solution too.
Thanks for the easy and straightforward solution. It works fine. I would vote you up, but my own reputation is <15, sorry...

The following recipe might do the job. Feel free to ask if anything is unclear.

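import numpy as np
import multiprocessing

def own_process(i, other_args, out_queue):
    # Defined at module level so it can be pickled when processes are spawned.
    C_sub = get_sub_matrix_C(i, other_args)
    out_queue.put((i, C_sub))  # tag each result with its block index

def processParallel():
    sub_matrices_list = [None] * 10
    out_queue = multiprocessing.Queue()
    other_args = 0
    procs = []
    for i in range(10):
        p = multiprocessing.Process(
                            target=own_process,
                            args=(i, other_args, out_queue))
        procs.append(p)
        p.start()

    # Results arrive in completion order, so store each one by its index.
    for _ in range(10):
        i, C_sub = out_queue.get()
        sub_matrices_list[i] = C_sub

    # Join only after draining the queue, otherwise a child may block on put().
    for p in procs:
        p.join()

    return sub_matrices_list

if __name__ == "__main__":
    C = np.zeros((100,10))

    result = processParallel()

    for i in range(10):
        C[i*10:(i+1)*10,:10] = result[i]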

2 Comments

Thanks for your answer. I tried it, but I got confusing results. The same entries were repeated over and over again.
I just corrected the bug, sorry. Anyway, the other answer seems more succinct and practical. I will try it myself too! :)
