I tried to share data between processes using the multiprocessing module (Python 2.7, Linux), and I got different results from two slightly different pieces of code:

import os
import time
from multiprocessing import Process, Manager

def editDict(d):
    d[1] = 10
    d[2] = 20
    d[3] = 30


pnum = 3
m = Manager()

1st version:

mlist = m.list()
for i in xrange(pnum):
    mdict = m.dict()
    mlist.append(mdict)
    p = Process(target=editDict,args=(mdict,))
    p.start()

time.sleep(2)
print 'after process finished', mlist

This generates:

after process finished [{1: 10, 2: 20, 3: 30}, {1: 10, 2: 20, 3: 30}, {1: 10, 2: 20, 3: 30}]

2nd version:

mlist = m.list([m.dict() for i in xrange(pnum)]) # main difference to 1st version
for i in xrange(pnum):
    p = Process(target=editDict,args=(mlist[i],))
    p.start()
time.sleep(2)
print 'after process finished', mlist

This generates:

after process finished [{}, {}, {}]

I do not understand why the outcome is so different.

  • On which platform are you running this test? multiprocessing behaves differently on Linux and on Windows, and there are specific requirements on Windows that you currently do not meet. Commented Dec 12, 2011 at 15:12
  • Hi, Adrien: Linux 2.6.27.45-0.1-default x86_64 GNU/Linux Commented Dec 12, 2011 at 15:14

1 Answer

This happens because in the second version you fetch each dict through the list index (mlist[i]), whereas in the first version you pass the dict proxy itself (mdict). As stated in the multiprocessing docs:

Modifications to mutable values or items in dict and list proxies will not be propagated through the manager, because the proxy has no way of knowing when its values or items are modified.

This means that, to keep track of items that are changed within a container (dictionary or list), you must reassign them after each edit. Consider the following change (for explanatory purposes, I'm not claiming this to be clean code):

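def editDict(d, l, i):
    # d is the item fetched from the list, l is the list proxy, i is the item's index
    d[1] = 10
    d[2] = 20
    d[3] = 30
    l[i] = d  # reassign so the list proxy stores the updated item

mlist = m.list([m.dict() for i in xrange(pnum)])
for i in xrange(pnum):
    p = Process(target=editDict, args=(mlist[i], mlist, i))
    p.start()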

If you now print mlist, you'll see that it has the same output as your first attempt. The reassignment allows the container proxy to keep track of the updated item again.
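
The reassignment rule can also be seen without any worker processes. The following is a minimal sketch, not the code from the question: it stores a plain dict in a list proxy, edits the copy that indexing returns, and then writes that copy back. It is written for Python 3, and both the `fork` start method and the function name `show_reassignment` are assumptions made for illustration (the question runs on Linux, where `fork` is available).

```python
import multiprocessing


def show_reassignment():
    """Return the list contents before and after writing an edited item back."""
    # Assumption: the "fork" start method (available on Linux, as in the question).
    ctx = multiprocessing.get_context("fork")
    with ctx.Manager() as manager:
        shared = manager.list()
        shared.append({})          # a plain dict stored inside the list proxy

        item = shared[0]           # indexing returns a local copy of the dict
        item[1] = 10               # this edits only the local copy
        before = list(shared)      # the manager still holds the unedited dict

        shared[0] = item           # reassigning sends the edited copy back
        after = list(shared)
    return before, after


if __name__ == "__main__":
    before, after = show_reassignment()
    print("before reassignment:", before)
    print("after reassignment:", after)
```

The edit to `item` only becomes visible through `shared` once `shared[0] = item` has run, which is the same step that `l[i] = d` performs in the worker above.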

Your main issue in this case is that you have a dict (proxy) inside a list proxy: updates to the contained container won't be noticed by the manager, so the list does not show the changes you expected. Note that the dictionary itself is updated in the second example; you just don't see it because the manager didn't sync.
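
One way to avoid the nesting altogether, in line with the first version in the question, is to keep the dict proxies in an ordinary Python list in the parent and hand each proxy directly to its worker. This is a sketch under assumptions rather than a definitive fix: it targets Python 3, uses the `fork` start method, and the names `edit_dict` and `run_workers` are hypothetical. It also waits with `join()` instead of `time.sleep(2)`.

```python
import multiprocessing


def edit_dict(d):
    # Runs in a worker; each assignment goes through the dict proxy to the manager.
    d[1] = 10
    d[2] = 20
    d[3] = 30


def run_workers(pnum=3):
    """Start pnum workers, one dict proxy each, and return the final contents."""
    ctx = multiprocessing.get_context("fork")  # assumption: Linux-style fork
    with ctx.Manager() as manager:
        # An ordinary list holds the proxies, so none is fetched through a list proxy.
        proxies = [manager.dict() for _ in range(pnum)]
        workers = [ctx.Process(target=edit_dict, args=(p,)) for p in proxies]
        for w in workers:
            w.start()
        for w in workers:
            w.join()               # wait for completion instead of sleeping
        # Copy the contents into plain dicts before the manager shuts down.
        return [p.copy() for p in proxies]


if __name__ == "__main__":
    print("after process finished", run_workers())
```

Because each worker receives the proxy object itself, its writes reach the manager directly and no reassignment step is needed.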

2 Comments

@chance: I thought giving it a tick (accepting the answer) was more important than voting it up. But I guess the answer deserves both. Done.
@HongboZhu Well, to my understanding, an answer should be voted up if it is helpful or valuable, and it should be accepted when you think it is exactly what you wanted. Sometimes you need to do both :)
