
I need to create a system to store Python data structures on a Linux system, with concurrent read and write access to the data from multiple programs/daemons/scripts. My first thought is to create a unix socket that listens for connections and serves up requested data as pickled Python data structures. Any writes by the clients would get synced to disk (maybe in batch, though I don't expect high throughput, so Linux vfs caching would likely be fine). This ensures only a single process ever reads and writes the data.
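
A minimal sketch of that first idea, assuming a hypothetical socket path and a naive one-recv()-per-request framing (a real version would need proper message framing):

    import os
    import pickle
    import socket

    SOCK_PATH = "/tmp/datastore.sock"  # hypothetical path

    # Clean up a stale socket file from a previous run, then listen.
    if os.path.exists(SOCK_PATH):
        os.unlink(SOCK_PATH)
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(SOCK_PATH)
    server.listen(5)

    store = {}  # the single in-memory copy this process owns

    while True:
        conn, _ = server.accept()
        try:
            # Naive framing: assumes the whole request fits in one recv().
            op = pickle.loads(conn.recv(4096))
            if op[0] == "get":
                conn.sendall(pickle.dumps(store.get(op[1])))
            elif op[0] == "set":
                store[op[1]] = op[2]
                conn.sendall(pickle.dumps(True))
        finally:
            conn.close()

A client connects to SOCK_PATH and sends, e.g., pickle.dumps(("set", "timeout", 30)).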

The other idea is to just keep the pickled data structure on disk and only allow a single process access at a time through a lockfile or token... This requires every accessing client to respect the locking mechanism / use the access module.
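
A sketch of the lockfile idea using fcntl.flock on a separate lock file (the paths are hypothetical). flock locks are advisory, which is exactly the caveat above: only clients that go through this module are kept out:

    import contextlib
    import fcntl
    import pickle

    DATA_PATH = "/var/lib/myapp/state.pickle"  # hypothetical paths
    LOCK_PATH = DATA_PATH + ".lock"

    @contextlib.contextmanager
    def locked(exclusive):
        # Advisory lock: only cooperating processes are excluded.
        with open(LOCK_PATH, "w") as lock:
            fcntl.flock(lock, fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH)
            try:
                yield
            finally:
                fcntl.flock(lock, fcntl.LOCK_UN)

    def read_state():
        with locked(exclusive=False):  # shared: many concurrent readers
            with open(DATA_PATH, "rb") as f:
                return pickle.load(f)

    def write_state(state):
        with locked(exclusive=True):   # exclusive: one writer at a time
            with open(DATA_PATH, "wb") as f:
                pickle.dump(state, f)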

What am I overlooking? SQLite is available, but I'd like to keep this as simple as possible.

What would you do?

  • What kind of data is it? Commented May 27, 2011 at 2:30
  • configuration data; settings for all kinds of system stuff. Commented May 27, 2011 at 2:31
  • Yeah, it looks like SQLite is going to be the easiest way, and it's definitely more reliable than most homebrew stuff. Commented May 27, 2011 at 7:34

5 Answers


I would just use SQLite if it's available.

See this FAQ: http://www.sqlite.org/faq.html#q5 -- SQLite (with pysqlite [0]) should be able to handle your concurrency elegantly.

You can keep the data as simple key-value pairs if you like; there's no need to go all BNF on your data.

[0] http://trac.edgewall.org/wiki/PySqlite
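
A rough sketch of that key-value approach with Python 3's standard sqlite3 module (the path and table name are hypothetical); SQLite serializes concurrent writers on its own:

    import pickle
    import sqlite3

    conn = sqlite3.connect("/var/lib/myapp/settings.db")  # hypothetical path
    conn.execute("CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value BLOB)")

    def set_value(key, obj):
        with conn:  # implicit transaction; SQLite handles inter-process locking
            conn.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)",
                         (key, pickle.dumps(obj)))

    def get_value(key):
        row = conn.execute("SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
        return pickle.loads(row[0]) if row else None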

If you want to just store name/value pairs (e.g. filename to pickled data) you can always use Berkeley DB (http://code.activestate.com/recipes/189060-using-berkeley-db-database/). If your data is numbers-oriented, you might want to check out PyTables (http://www.pytables.org/moin). If you really want to use sockets (I would generally try to avoid that, since there are a lot of minutiae you have to worry about) you may want to look at Twisted Python (good for handling multiple connections in Python with no threading required).
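
For the plain name/value case, the dbm-family modules in the standard library give the same style of store with almost no code; a sketch assuming Python 3's dbm module and a hypothetical filename:

    import dbm     # Python 3; "anydbm" on Python 2
    import pickle

    # dbm stores byte-string keys and values, so pickled objects
    # drop straight in as values.
    with dbm.open("settings.db", "c") as db:  # "c": create if missing
        db["timeout"] = pickle.dumps(30)
        print(pickle.loads(db["timeout"]))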

I'd use a database. A real one. This is why they exist (well, one of the reasons). Don't reinvent the wheel if you don't have to.

1 Comment

This is for an embedded system; SQLite is the only option. I'm trying to avoid having to convert my objects to a relational schema if possible.

Leaving backend storage aside (plenty of options here, including ConfigParser, shelve, sqlite and anydbm), the idea of a single process handling storage while others connect to it may be usable. My first thought for doing that is Pyro (Python remote objects). Sockets, while always available, can get tricky.
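
A minimal sketch of that idea with the Pyro4 package (the class and its methods are made up for illustration):

    import Pyro4

    @Pyro4.expose
    class ConfigStore(object):
        """Single process that owns the data; clients call it remotely."""
        def __init__(self):
            self._data = {}

        def get(self, key):
            return self._data.get(key)

        def set(self, key, value):
            self._data[key] = value

    daemon = Pyro4.Daemon()               # network server for remote calls
    uri = daemon.register(ConfigStore())  # clients connect using this URI
    print(uri)
    daemon.requestLoop()

A client is then just store = Pyro4.Proxy(uri) followed by ordinary-looking method calls; Pyro handles the sockets and serialization.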

1 Comment

This looks exciting! I'll have to look into it. Is it pure Python? (Cross-compiling C-based Python modules is a royal pain.)

You could serialize the data structures and store them as values using ConfigParser. If you created your own lib/module to access the data, you could do the serialization in the lib, so the client code would just send and receive Python objects. You could also handle concurrency in the lib.
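
A sketch of such an access lib (the file path and function names are hypothetical); values are pickled and then base64-encoded so they survive ConfigParser's text-only format:

    import base64
    import configparser  # "ConfigParser" on Python 2
    import pickle

    CONFIG_PATH = "settings.ini"  # hypothetical path

    def save(section, key, obj):
        cp = configparser.ConfigParser()
        cp.read(CONFIG_PATH)
        if not cp.has_section(section):
            cp.add_section(section)
        # Pickle, then base64-encode so the value is plain text.
        cp.set(section, key, base64.b64encode(pickle.dumps(obj)).decode("ascii"))
        with open(CONFIG_PATH, "w") as f:
            cp.write(f)

    def load(section, key):
        cp = configparser.ConfigParser()
        cp.read(CONFIG_PATH)
        return pickle.loads(base64.b64decode(cp.get(section, key)))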
