
I have two python scripts that need to communicate large variables to one another: python1.py and python2.py.

Let's say python1.py is running and has created a list variable 'x' which is very large. At the moment, I am saving (pickling) 'x' to the hard drive and then using subprocess to run python2.py, which then loads 'x' from the hard drive (I need two different python files because I am trying to parallelize a computation).
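For reference, the disk-based handoff described above can be sketched like this (the file name and the split into a launcher and a loader are illustrative, not the asker's actual code):

```python
import pickle
import subprocess
import sys

def save_and_launch(x, path="x.pkl", script="python2.py"):
    """Pickle x to disk, then launch the second script with the path as an argument."""
    with open(path, "wb") as f:
        pickle.dump(x, f, protocol=pickle.HIGHEST_PROTOCOL)
    # The second script would receive the path in sys.argv[1]
    subprocess.run([sys.executable, script, path])

def load_input(path):
    """What python2.py would do on startup: read x back from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

The cost here is one full serialize/deserialize round trip through the filesystem per handoff, which is what the question is trying to avoid.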

Is there an alternative, where I can call python2.py with an argument which is a pointer to memory, and then have python2.py create 'x' based on directly looking it up in the memory?

  • Are you using the generator script (which creates your x list) to launch the second script? How large is "very large"? Is the order of the computation important? Commented Sep 20, 2014 at 15:30
  • mmap should be able to do this. This article might help: blog.schmichael.com/2011/05/15/… Commented Sep 20, 2014 at 15:34
  • If this is linux, you could import python2 and use the multiprocessing module to fork into the function you want to run there. In linux, when you fork, you have the same memory so you don't need to serialize it. On windows, mp serializes anyway so no real benefit. Commented Sep 20, 2014 at 15:39
  • would recommend numpy arrays for this task: docs.scipy.org/doc/numpy/reference/generated/numpy.memmap.html Commented Sep 20, 2014 at 15:56
  • @BurhanKhalid Yup, the generator script launches the 2nd script, order is not important, and large is about 50MB. I think mmap was what I was looking for, thanks Lanting. I am on windows and tried out multiprocessing but it keeps crashing for odd reasons. Numpy arrays is also another way to go, thanks marscher Commented Sep 20, 2014 at 17:01
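Since mmap came up in the comments and the asker settled on it, here is a minimal sketch of a file-backed memory map carrying a pickled object between processes. The function names are illustrative; the Windows-only named-map variant mentioned at the end is noted as a comment:

```python
import mmap
import pickle

def dump_for_mmap(x, path):
    """Serialize x to a file that the second process will map into memory."""
    data = pickle.dumps(x, protocol=pickle.HIGHEST_PROTOCOL)
    with open(path, "wb") as f:
        f.write(data)
    return len(data)

def load_via_mmap(path):
    """What the second process would do: map the file and unpickle from the mapping."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return pickle.loads(mm[:])

# On Windows only, an anonymous map can be shared by name instead of a file,
# avoiding the disk entirely (both processes pass the same tagname):
#   mm = mmap.mmap(-1, length, tagname="shared_x")
```

For the ~50 MB case mentioned above, the OS page cache means the mapped read typically avoids a second physical disk read even in the file-backed form.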

1 Answer


If you are looking to split computation across processes, I would strongly recommend reading up on the multiprocessing module, which provides process pools, managers, and the ability to share high-level data structures across process boundaries. For example, take a look at the "Sharing state between processes" section in the docs. From the docs:

from multiprocessing import Process, Array

def f(a):
    # Negate every element of the shared array in the child process
    for i in range(len(a)):
        a[i] = -a[i]

if __name__ == '__main__':
    # 'i' means a shared array of C signed ints, backed by shared memory
    arr = Array('i', range(10))

    p = Process(target=f, args=(arr,))
    p.start()
    p.join()

    print(arr[:])

# output: [0, -1, -2, -3, -4, -5, -6, -7, -8, -9]
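The answer also mentions managers. As a sketch (the function names here are illustrative, not from the docs), a Manager can share an ordinary Python list across processes, which is closer to the asker's list 'x' than the fixed-type Array above:

```python
from multiprocessing import Process, Manager

def square_in_place(shared):
    # Each read and assignment goes through the proxy to the manager process
    for i in range(len(shared)):
        shared[i] = shared[i] ** 2

def main():
    with Manager() as mgr:
        x = mgr.list(range(5))   # proxy to a list living in the manager process
        p = Process(target=square_in_place, args=(x,))
        p.start()
        p.join()
        return list(x)           # copy back to a plain local list

if __name__ == '__main__':
    print(main())   # [0, 1, 4, 9, 16]
```

Note the trade-off: a Manager list accepts arbitrary picklable elements, but every element access is a round trip to the manager process, so for a large numeric 'x' the Array (or a numpy memmap, as suggested in the comments) will be much faster.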