
Assume I have a Python list:

def func(arr, i):
    arr[i] = arr[i] + ' hello!'

xyz = ['a', 'b', 'c', 'd', 'e']

for i in range(len(xyz)):
    func(xyz, i)

for i in xyz:
    print(i)

and end up with:

a hello!
b hello!
c hello!
d hello!
e hello!

How do I update the elements of the list in parallel using multiple cores, since my list is very large?

I've searched all over and I can't seem to find the answer.

  • 3
    what is your expected output? Commented Jul 13, 2017 at 16:11
  • Your question is not clear. Commented Jul 13, 2017 at 16:15
  • Hi Haranadh, the expected output is an updated list: a hello! b hello! c hello! d hello! e hello! But I want to do it in parallel, since the list could contain thousands of elements. Commented Jul 13, 2017 at 16:16
  • You can look into using a multiprocessing.Pool Commented Jul 13, 2017 at 16:18
  • Why do you think updating each element in parallel would be more efficient? Leaving that for Python to decide would be better. Commented Jul 13, 2017 at 16:23

3 Answers


Thanks to @roganjosh's suggestion, I was able to find an answer:

from multiprocessing import Pool

arr = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

def edit_array(i):
    # Each worker reads the module-level list and returns a new value.
    return arr[i] + ' hello!'

if __name__ == '__main__':
    pool = Pool(processes=4)
    result = pool.map(edit_array, range(len(arr)))  # preserves input order
    pool.close()
    pool.join()
    print(result)



Here's one, relatively simple, way to do it using the multiprocessing module:

import functools
import multiprocessing

def func(arr, i):
    arr[i] = arr[i] + ' hello!'

if __name__ == '__main__':
    manager = multiprocessing.Manager()  # Create a manager to handle shared object(s).
    xyz = manager.list(['a','b','c','d','e'])  # Create a proxy for the shared list object.

    p = multiprocessing.Pool(processes=4)  # Create a pool of worker processes.

    # Create a single arg function with the first positional argument (arr) supplied.
    # (This is necessary because Pool.map() only works with functions of one argument.)
    mono_arg_func = functools.partial(func, xyz)

    p.map(mono_arg_func, range(len(xyz)))  # Run func in parallel until finished.
    p.close()
    p.join()

    for i in xyz:
        print(i)

Output:

a hello!
b hello!
c hello!
d hello!
e hello!

Note that this is not going to be very fast if the list is huge, because sharing access to large objects between separate processes (which run in different memory spaces) carries a lot of overhead.

A better approach would use a multiprocessing.Queue, which is implemented "using a pipe and a few locks/semaphores" according to the documentation (as opposed to a managed shared list, whose contents have to be pickled and unpickled on every access).

Comments


Reading the question as wanting to replace the current value of each item in the list with a new value:

for position, value in enumerate(xyz):
    xyz[position] = '%s hello!' % value

Gives: ['a hello!', 'b hello!', 'c hello!', 'd hello!', 'e hello!']

2 Comments

Hello Martin, thanks for your reply. But how do I do it in parallel?
Use either multithreading or multiprocessing to distribute the work, as roganjosh suggests in the comments above.
