5

Here I create a producer-customer program,the parent process(producer) create many child process(consumer),then parent process read file and pass data to child process.

but , here comes a performance problem,pass message between process cost too much time (I think).

for an example ,a 200MB original data ,parent process read and pretreat will cost less then 8 seconds , than just pass data to child process by multiprocess.pipe will cost another 8 seconds , and child processes do the remain work just cost another 3 ~ 4 seconds.

so ,a complete work flow cost less than 18 seconds ,and more than 40% time cost on communication between process , it is much bigger than I used think about ,and I tried multiprocess.Queue and Manager ,they are worse.

I works with windows7 / Python3.4. I had google for several days , and POSH maybe a good solution , but it can't build with python3.4

there I have 3 ways:

1.is there any way can share python object direct between process in Python3.4 ? as POSH

or

2.is it possable pass the "pointer" of an object to child process and child process can recovery the "pointer" to python object?

or

3.multiprocess.Array may be a valid solution , but if I want share complex data structure, such as list, how it works? should I make a new class base on it and provide interfaces as list?

Edit1: I tried the 3rd way,but it works worse.
I defined those value:

p_pos = multiprocessing.Value('i') #producer write position  
c_pos = multiprocessing.Value('i') #customer read position  
databuff = multiprocess.Array('c',buff_len) # shared buffer

and two function:

send_data(msg)  
get_data()

in send_data function(parent process),it copies msg to databuff , and send the start and end position (two integer)to child process via pipe.
than in get_data function (child process) ,it received the two position and copy the msg from databuff.

in final,it cost twice than just use pipe @_@

Edit 2:
Yes , I tried Cython ,and the result looks good.
I just changed my python script's suffix to .pyx and compile it ,and the program speed up for 15%.
No doubt , I met the " Unable to find vcvarsall.bat" and " The system cannot find the file specified" error , and I cost whole day for solved the first one , and blocked by the second one.
Finally , I found Cyther , and all troubles gone ^_^.

1
  • all of I want is a queue that parent put message in to and child process get message out from ,and with out too much move operation such as Pipe or Queue in multiprocess module Commented Sep 25, 2016 at 13:33

2 Answers 2

8

I was at your place five month ago. I looked around few times but my conclusion is multiprocessing with Python has exactly the problem you describe :

I solved this kind of problem by learning C++, but it's probably not what you want to read...

Sign up to request clarification or add additional context in comments.

4 Comments

Thanks . So this bottleneck is cased by python itself.And you said you "solved this kind of problem by learning C++" , it means that you rewrite all program with C++ ? or just the communication between process?
Usually I only rewrite what needs to be, so as you said the communication between process, and I get back my functions using boost.python or Cython. It is hard, but only the first time ;)
@ Jean-Baptiste F. Thanks a lot , I will try the Cython.
You are welcome, please tag the question as answered if you think it's the case!
0

To pass data (especially big numpy arrays) to a child process, I think mpi4py can be very efficient since I can work directly on buffer-like objects.

An example of using mpi4py to spawn processes and communicate (using also trio, but it is another story) can be found here.

1 Comment

thanks for your answer, it's a very old question, and the office in where I asked this question had been closed yet...but still thanks. I accessed the official page of mpi4py, it looks a bit like "fork", I will try it and measure the performance, hope it could work as good as shared memory.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.