What would be a good inter-process communication (IPC) framework/technique with the following requirements:

  • Transfer native Python objects between two Python processes
  • Efficient in time and CPU (RAM efficiency irrelevant)
  • Cross-platform Windows/Linux
  • Nice to have: works with PyPy

UPDATE 1: the processes are on the same host and use the same versions of Python and other modules

UPDATE 2: the processes are run independently by the user; none of them spawns the others

5 Answers

Native objects don't get shared between processes (due to reference counting).

Instead, you can pickle them and share them using Unix domain sockets, mmap, zeromq, or an intermediary such as sqlite3 that is designed for concurrent access.
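A minimal sketch of the pickle-over-a-Unix-domain-socket idea (the socket path and the length-prefix framing are illustrative choices, not a fixed protocol; AF_UNIX is Unix-only, so on Windows an AF_INET socket on localhost would be the substitute). The receiving process:

    import pickle
    import socket
    import struct

    SOCK_PATH = "/tmp/ipc_demo.sock"  # hypothetical path

    def recv_exact(sock, n):
        # read exactly n bytes from a stream socket
        buf = b""
        while len(buf) < n:
            chunk = sock.recv(n - len(buf))
            if not chunk:
                raise ConnectionError("socket closed early")
            buf += chunk
        return buf

    # Accept one connection and read one length-prefixed pickled object.
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(SOCK_PATH)  # remove any stale socket file from a previous run first
    server.listen(1)
    conn, _ = server.accept()
    (size,) = struct.unpack("!I", recv_exact(conn, 4))  # 4-byte length header
    obj = pickle.loads(recv_exact(conn, size))          # back to a native object
    print(obj)

The sending side, run as a separate process:

    import pickle
    import socket
    import struct

    SOCK_PATH = "/tmp/ipc_demo.sock"

    payload = pickle.dumps({"answer": 42}, protocol=pickle.HIGHEST_PROTOCOL)
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.connect(SOCK_PATH)
    client.sendall(struct.pack("!I", len(payload)) + payload)  # header, then body
    client.close()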

Comments

What do you think of XML-RPC?
I love XML-RPC, but the OP's question focused on CPU efficiency, so XML-RPC didn't make the cut.
pickling takes time and CPU but conserves RAM; my requirements are the exact opposite. Is there a way to communicate them without pickling them?
Was looking for a simple example of using mmap to share data between two independently run scripts, and finally found one here: Sharing Python data between processes using mmap | schmichael's blog. But it seems that you still have to open a file and store the data to be shared there; mmap (apparently) just provides a special interface for accessing that file (I was otherwise hoping mmap could use memory directly, bypassing temp files).
@sdaau About mmap being tied to temp files: not really. You can create what is called an anonymous mmap that doesn't rely on files, but the shared area is only available to threads in the same process (of course), or to child processes forked after the mmap has been created, so it is not useful for the requirements here (a minimal sketch follows).
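To illustrate the anonymous-mmap point from the last comment, a minimal sketch (Unix-only, since it relies on os.fork, which is exactly the parent/child limitation described above):

    import mmap
    import os

    # An anonymous mmap is backed by memory only -- no file involved.
    shared = mmap.mmap(-1, 1024)  # fileno -1 means anonymous; flags default to MAP_SHARED

    pid = os.fork()  # only children forked *after* the mmap exists can see it
    if pid == 0:
        shared.seek(0)
        shared.write(b"hello from the child")
        os._exit(0)

    os.waitpid(pid, 0)
    shared.seek(0)
    print(shared.read(20))  # b'hello from the child'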

Use multiprocessing to start with.

If you need multiple CPUs, look at Celery.
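In particular, the multiprocessing.connection submodule covers UPDATE 2: a Listener in one independently started process and a Client in another, with no parent/child relationship, working on both Windows and Linux. A minimal sketch (the port and authkey are illustrative; context-manager support requires Python 3.3+):

    from multiprocessing.connection import Listener

    # Process A, started on its own:
    with Listener(("localhost", 6000), authkey=b"secret") as listener:
        with listener.accept() as conn:
            print(conn.recv())  # a native Python object, pickled under the hood

and, in the other process:

    from multiprocessing.connection import Client

    # Process B, started independently of process A:
    with Client(("localhost", 6000), authkey=b"secret") as conn:
        conn.send({"answer": 42, "nested": [1, 2, 3]})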

Comments

Is multiprocessing relevant for processes that were run interdependently? (not spawned by each other)
@Jonathan: "interdependently"? The multi-processing package provides queues and pipes so that processes can synchronize with each other and pass objects around. Does that qualify as "interdependently"?
I meant independently of course...
@Jonathan: Is this a requirement? If so, please update the question to include all the facts. The package provides numerous features for building distributed servers using internet protocols to communicate. docs.python.org/library/…

After some tests, I found that the following approach works on Linux using mmap.

Linux has /dev/shm. If you create a shared memory object using POSIX shm_open, a new file is created in this folder.

Although Python's mmap module does not provide the shm_open function, we can use a normal open to create a file in /dev/shm; the effect is similar, and the file resides in memory. (Use os.unlink to remove it.)

Then, for IPC, we can use mmap to map that file into the different processes' virtual memory spaces. All the processes share that memory. Python can use the memory as a buffer and create objects such as bytes and numpy arrays on top of it, or we can access it through the ctypes interface.

Of course, process synchronization primitives are still needed to avoid race conditions.

See the mmap docs, the ctypes docs, and numpy.load, which has an mmap_mode option.
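A minimal sketch of both halves (the file name is illustrative, and /dev/shm is Linux-specific). The writer process:

    import mmap
    import os

    PATH = "/dev/shm/ipc_demo"   # hypothetical name; this "file" lives in RAM
    SIZE = 4096

    # Create and size the shared region, then map it into this process.
    fd = os.open(PATH, os.O_CREAT | os.O_RDWR)
    os.ftruncate(fd, SIZE)
    buf = mmap.mmap(fd, SIZE)
    buf[:5] = b"hello"

The reader, started independently:

    import mmap
    import os

    PATH = "/dev/shm/ipc_demo"
    SIZE = 4096

    # Map the same file; both mappings share the same physical pages.
    fd = os.open(PATH, os.O_RDWR)
    buf = mmap.mmap(fd, SIZE)
    print(buf[:5])               # b'hello'
    # os.unlink(PATH) once both sides are done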

Comments

I know this answer is quite old, but I'll give it a shot! Since it is possible to open a file in /dev/shm, what is the purpose of using mmap? Can't I just pass information back and forth between different applications by reading from and writing to files in /dev/shm? From my understanding, these do not get written to a hard drive?
Although I didn't test what you said, I feel it should also be fine. But it might be more convenient to map it so that you can use the memory like a variable instead of a file. Happy to see your updates on the experiment.

Both execnet and Pyro mention PyPy <-> CPython communication. Other packages from the Python Wiki's Parallel Processing page are probably suitable too.
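For a flavor of the Pyro route, a hedged Pyro4 sketch (the class and method names are made up for illustration; note that Pyro4's default serializer, serpent, covers common built-in types, and pickle can be enabled in Pyro4's config for arbitrary objects). The server process:

    import Pyro4

    @Pyro4.expose
    class ObjectService:
        def roundtrip(self, payload):
            # payload arrives as a deserialized native Python object
            return {"echo": payload}

    daemon = Pyro4.Daemon()                 # binds to a free port by default
    uri = daemon.register(ObjectService())  # returns a PYRO: URI for this object
    print("server URI:", uri)
    daemon.requestLoop()

The client, run separately:

    import Pyro4

    uri = input("server URI: ").strip()   # paste the URI the server printed
    proxy = Pyro4.Proxy(uri)
    print(proxy.roundtrip([1, 2, 3]))     # {'echo': [1, 2, 3]}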

Comments

AFAIK, execnet must set up its own processes.

Parallel Python might be worth a look; it works on Windows, OS X, and Linux (and I seem to recall using it on an UltraSPARC Solaris 10 machine a while back). I don't know if it works with PyPy, but it does seem to work with Psyco.
