I am very new to multi-threading and multi-processing and am trying to parallelize a for loop. I searched for similar questions and wrote the code below based on the multiprocessing module.

import timeit, multiprocessing

start_time = timeit.default_timer()

d1 = dict( (i,tuple([i*0.1,i*0.2,i*0.3])) for i in range(500000) )
d2={}

def fun1(gn):
    for i in gn:
        x,y,z = d1[i]
        d2.update({i:((x+y+z)/3)})


if __name__ == '__main__':
    gen1 = [x for x in d1.keys()]
    fun1(gen1)
    #p= multiprocessing.Pool(3)
    #p.map(fun1,gen1)

    print('Script finished')
    stop_time = timeit.default_timer()
    print(stop_time - start_time)

# Output:

Script finished
0.8113944193950299

If I change the code to:

#fun1(gen1)
p= multiprocessing.Pool(5)
p.map(fun1,gen1)

I get errors:

for i in gn:
TypeError: 'int' object is not iterable
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    raise self._value

Any ideas how to make this parallel? MATLAB has a parfor option for parallel loops. I am trying to parallelize the loop using the approach above, but it is not working. Also, what if the function returns a value: can I write something like a,b,c=p.map(fun1,gen1) if fun1() returns 3 values?

(Running Python 3.6 on Windows.)

2 Answers

As @Alex Hall mentioned, remove the iteration from fun1. Also, wait until all of the pool's workers have finished.

PEP 8 note: import timeit, multiprocessing is discouraged; split it into two lines.

import multiprocessing
import timeit


start_time = timeit.default_timer()

d1 = dict( (i,tuple([i*0.1,i*0.2,i*0.3])) for i in range(500000) )
d2 = {}

def fun1(gn):
    x,y,z = d1[gn]
    d2.update({gn: ((x+y+z)/3)})


if __name__ == '__main__':
    gen1 = [x for x in d1.keys()]

    # serial processing
    for gn in gen1:
        fun1(gn)

    # parallel processing
    p = multiprocessing.Pool(3)
    p.map(fun1, gen1)
    p.close()
    p.join()

    print('Script finished')
    stop_time = timeit.default_timer()
    print(stop_time - start_time)
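
One caveat with the code above: each worker process works on its own copy of d2, so the updates made inside fun1 do not reach the d2 in the parent process. A minimal sketch of one way around that, assuming it is acceptable to return the result from the worker and build the dictionary in the parent. The name mean_of and the smaller range(1000) are illustrative choices, not from the question.

```python
import multiprocessing

# Module-level so every worker process can see it
# (each worker gets its own copy; smaller than in the question to keep the demo quick).
d1 = {i: (i * 0.1, i * 0.2, i * 0.3) for i in range(1000)}


def mean_of(key):
    """Return (key, mean) instead of writing to a global dict (hypothetical helper)."""
    x, y, z = d1[key]
    return key, (x + y + z) / 3


if __name__ == '__main__':
    keys = list(d1)
    # map() blocks until every result is back; the context manager then shuts the pool down.
    with multiprocessing.Pool(3) as p:
        results = p.map(mean_of, keys)  # list of (key, mean) pairs, in input order
    d2 = dict(results)                  # built in the parent, so it is actually filled
    print(len(d2))
```

For work this small, the cost of sending data between processes may well outweigh the gain, so it is worth timing both versions.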

p.map does the looping for you, so remove the for i in gn:.

That is, p.map applies fun1 to each element of gen1, so gn is one of those elements.
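
On the question about a,b,c=p.map(fun1,gen1): p.map returns one list with one entry per input element, so that unpacking would only work if gen1 happened to have exactly three elements. A rough sketch of one way to get three separate sequences when the function returns three values. The names scaled and split_columns are made up for illustration.

```python
import multiprocessing


def scaled(i):
    """Hypothetical worker that returns three values for one input."""
    return i * 0.1, i * 0.2, i * 0.3


def split_columns(rows):
    """Turn a non-empty list of 3-tuples into three tuples, one per returned value."""
    a, b, c = zip(*rows)
    return a, b, c


if __name__ == '__main__':
    with multiprocessing.Pool(3) as p:
        rows = p.map(scaled, range(5))  # five 3-tuples, one per input element
    a, b, c = split_columns(rows)
    print(len(rows), len(a), len(b), len(c))
```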
