
I have a matrix of matrices that I need to do computations on, i.e. an x by y by z by z array: for each of the z by z submatrices I need to perform a computation, so x*y operations in total. Currently I am looping over the first two dimensions and performing the computations one at a time, but they are entirely independent. Is there a way to compute them in parallel, without knowing x and y in advance?

  • Yes, Python can do this. Commented Sep 15, 2021 at 1:02
  • A full example, including the mathematical operations and realistic inputs (random numbers are fine, as long as the shape is realistic), showing what you have tried always helps to get a good answer; e.g. it is not clear whether you have a list of 2d arrays or a simple 3d array. There are generally several ways (vectorized numpy, Numba, Cython, etc.) to implement such an algorithm efficiently. Commented Sep 15, 2021 at 8:36

3 Answers


Yes; see the multiprocessing module. Here's an example (tweaked from the one in the docs to suit your use case). (I assume z = 1 here for simplicity, so that f takes a scalar.)

from multiprocessing import Pool

# x = 2, y = 3, z = 1 - needn't be known in advance
matrix = [[1, 2, 3], [4, 5, 6]]

def f(x):
    # Your computation for each z-by-z submatrix goes here.
    return x**2

if __name__ == "__main__":  # guard needed with the "spawn" start method (Windows/macOS)
    with Pool() as p:
        flat_results = p.map(f, [x for row in matrix for x in row])
        # If you don't need to preserve the x*y shape of the input matrix,
        # you can use `flat_results` and skip the rest.
        x = len(matrix)
        y = len(matrix[0])
        results = [flat_results[i*y:(i+1)*y] for i in range(x)]

    # Now `results` contains `[[1, 4, 9], [16, 25, 36]]`

This will divide the x * y computations across several worker processes (one per CPU core by default; pass a number of processes to Pool() to change this).

Depending on what you're doing, consider trying vectorized operations (as opposed to for loops) with numpy first; you might find that it's fast enough to make multiprocessing unnecessary. (If matrix were a numpy array in this example, the code would just be results = matrix**2.)
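As a concrete illustration of the vectorized route: numpy's linalg routines operate on the trailing two dimensions and broadcast over the leading ones, so an (x, y, z, z) array can often be processed with no Python loop at all. Matrix inversion below is just a stand-in for whatever your per-submatrix computation is:

```python
import numpy as np

# x = 2, y = 3, z = 4 -- an (x, y, z, z) array of z-by-z submatrices.
rng = np.random.default_rng(0)
stack = rng.standard_normal((2, 3, 4, 4))

# np.linalg.inv broadcasts over the leading dimensions, so this inverts
# all x*y submatrices in one call -- no explicit loop over x and y.
inverses = np.linalg.inv(stack)

print(inverses.shape)  # (2, 3, 4, 4)
```

The same broadcasting applies to `@` (batched matrix multiplication), `np.linalg.solve`, `np.linalg.eigh`, and friends, and it works for any x and y, so the shape needn't be known in advance.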


The way I approach parallel processing in Python is to define a function that does what I want, then apply it in parallel using multiprocessing.Pool.starmap. It's hard to suggest concrete code for your problem without knowing what you are computing and how.

import multiprocessing as mp

import numpy as np

def my_function(pair, matrix_of_matrices):
    # `pair` is one (m1, m2) tuple of z-by-z matrices; `matrix_of_matrices`
    # is extra shared data passed to every call (unused in this placeholder).
    m1, m2 = pair
    return m1 - m2  # or whatever you want to do

# Example inputs -- random z-by-z matrices standing in for your real data.
rng = np.random.default_rng(0)
matrices_x = [rng.standard_normal((2, 2)) for _ in range(3)]
matrices_y = [rng.standard_normal((2, 2)) for _ in range(3)]
matrix_of_matrices = None  # whatever shared data each call needs

matrices_to_compare = list(zip(matrices_x, matrices_y))

if __name__ == "__main__":
    with mp.Pool(mp.cpu_count()) as pool:
        results = pool.starmap(my_function,
                               [(pair, matrix_of_matrices)
                                for pair in matrices_to_compare],
                               chunksize=1)


An alternative to the multiprocessing-pool approach proposed in the other answers: if you have a GPU available, possibly the most straightforward option is to use a tensor-algebra package that takes advantage of it, such as cupy or torch.

You could also get some further speedup by JIT-compiling your code (for the CPU) with cython or numba; for GPU programming there is also numba.cuda, which however requires some background to use.
