I have some pickled data stored on disk, about 100 MB in size.

When my Python program is executed, the pickled data is loaded using the cPickle module, and that all works fine.

If I execute the program multiple times (with python main.py, for example), each Python process loads its own copy of the same data, which is the expected behaviour.

How can I make all new Python processes share this data, so that it is only loaded into memory a single time?
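
For reference, the loading step is essentially the following (a minimal sketch; the file name data.pkl is a placeholder):

```python
import cPickle  # Python 2; on Python 3 this is the pickle module

# Load the ~100 MB pickled object from disk.
with open('data.pkl', 'rb') as f:
    data = cPickle.load(f)
```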

2 Answers

If you're on Unix, one possibility is to load the data into memory and then have the script use os.fork() to create a bunch of sub-processes. As long as the sub-processes don't attempt to modify the data, they will automatically share the parent's copy of it without using any additional memory.
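
A minimal sketch of that approach, assuming the pickled data lives in a file named data.pkl and the children only read it:

```python
import os
import cPickle  # Python 2; on Python 3 use pickle

# Load the large object once, in the parent process.
with open('data.pkl', 'rb') as f:
    data = cPickle.load(f)

def worker(n):
    # Read-only access: the parent's pages stay shared copy-on-write.
    print('worker %d sees %d items' % (n, len(data)))

children = []
for n in range(4):
    pid = os.fork()
    if pid == 0:        # child process
        worker(n)
        os._exit(0)     # leave the child without running the parent's cleanup
    children.append(pid)

for pid in children:
    os.waitpid(pid, 0)  # wait for every child to finish
```

The catch is that everything has to run as children of one long-lived parent script, rather than as independent python main.py invocations.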

Unfortunately, this won't work on Windows.

P.S. I once asked about placing Python objects into shared memory, but that didn't produce any easy solutions.

1 Comment

"automatically share the parent process's data, without using any additional memory" not 100% true. It will be copy-on-write, so it will copy and will use additional memory as soon as you're going to access this data for modification.

Depending on how seriously you need to solve this problem, you may want to look at memcached, if that is not overkill.
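
A rough sketch of that route, using the third-party python-memcached client (an assumption; no particular client is implied above) against a memcached server on 127.0.0.1:11211. Note that memcached's default item size limit is 1 MB, so a 100 MB value would require starting the server with a larger limit (the -I option) and raising the client-side limit to match:

```python
import cPickle   # Python 2; on Python 3 use pickle
import memcache  # third-party: pip install python-memcached

mc = memcache.Client(['127.0.0.1:11211'])  # assumes a running memcached server

data = mc.get('big_data')     # returns None on a cache miss
if data is None:
    # Only the first process pays the disk-load cost; later runs hit the cache.
    with open('data.pkl', 'rb') as f:
        data = cPickle.load(f)
    mc.set('big_data', data)  # python-memcached pickles the object itself
```

Bear in mind that each process still unpickles its own in-memory copy after the get(); this saves the repeated disk reads, not the per-process memory.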
