7

What is the best way to use an embedded database, say sqlite in Python:

  1. Should have a small footprint. I only need a few thousand records per table, and just a handful of tables per database.
  2. If it's one provided by the default Python installation, then great. Must be open source and available on Windows and Linux.
  3. Better if SQL is not written directly, but a full ORM is not needed. Something that will shield me from the actual database without being a huge library. Something similar to ADO would be great.
  4. Mostly will be used through code, but if there is a GUI front end, then that is great
  5. Need just a few pages to get started with. I don't want to go through pages reading what a table is and how a Select statement works. I know all of that.
  6. Support for Python 3 is preferred, but 2.x is okay too.

The usage is not a web app. It's a small database to hold at most 5 tables. The data in each table is just a few string columns. Think something just larger than a pickled dictionary

Update: Many thanks for the great suggestions.
The use-case I'm talking about is fairly simple. One you'd probably do in a day or two.
It's a 100ish line Python script that gathers data about a relatively large number of files (say 10k), creates a metadata file for each of them, and then creates one large metadata file for the whole file tree. I just need to avoid re-processing the files already processed, create the metadata for the updated files, and update the main metadata file. In a way, cache the processed data, and only update it on file updates. If the cache is corrupt / unavailable, then simply process the whole tree. It might take 20 minutes, but that's okay.

Note that all processing is done in-memory.

I would like to avoid any external dependencies, so that the script can easily be put on any system with just a Python installation on it. On Windows, it is sometimes hard to get all the components installed. So, in my opinion, even a database might be overkill.

You probably wouldn't fire up Office Word/Writer to write a small Post-it type note; similarly, I am reluctant to use something like Django for this use case.

Where to start?

3
  • Is this a web or desktop application? Commented Sep 10, 2009 at 19:43
  • not web application. Barely a desktop app. For this particular case, I need to store some metadata about many files and some of their contents. CRUD and UI is not really needed except for debugging. Commented Sep 10, 2009 at 20:02
  • I don't understand this 'lightweight' requirement and how ORM doesn't fit. By 'heavyweight' do you mean: 1. too complicated to figure out 2. memory / disk requirements ? 3. too much functionality ? Commented Sep 11, 2009 at 16:04

9 Answers 9

6

I highly recommend the use of a good ORM. When you can work with Python objects to manage your database rows, life is so much easier.

I am a fan of the ORM in Django. But that was already recommended and you said that is too heavyweight.

This leaves me with exactly one ORM to recommend: Autumn. Very lightweight, works great with SQLite. If your embedded application will be multithreaded, then you absolutely want Autumn; it has extensions to support multithreaded SQLite. (Full disclosure: I wrote those extensions and contributed them. I wrote them while working for RealNetworks, and my bosses permitted me to donate them, so a public thank-you to RealNetworks.)

Autumn is written in pure Python. For SQLite, it uses Python's official SQLite module to do the actual SQL work. The memory footprint of Autumn itself is tiny.

I do not recommend APSW. In my humble opinion, it doesn't really do very much to help you; it just provides a way to execute SQL statements, and leaves you to master the SQL way of doing things. Also, it supports every single feature of SQLite, even the ones you rarely use, and as a result it actually has a larger memory footprint than Autumn, while not being as easy to use.

3

I started here:

http://www.devshed.com/c/a/Python/Using-SQLite-in-Python

It's 5 (short) pages with just the essentials, and it got me going right away.
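
For a feel of how little code the standard-library `sqlite3` module needs, here is a minimal sketch of the caching use case from the question. The table name `file_meta`, its columns, and the helper names `needs_processing` and `remember` are made up for illustration; they are not from the tutorial or the question.

```python
import sqlite3

# Hypothetical schema: one row per processed file.
# ":memory:" keeps the demo self-contained; pass a filename such as
# "cache.db" instead to persist the cache between runs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS file_meta ("
    " path TEXT PRIMARY KEY,"
    " mtime REAL NOT NULL,"
    " summary TEXT)"
)

def needs_processing(conn, path, mtime):
    """Return True if the file is new or its mtime differs from the cached one."""
    row = conn.execute(
        "SELECT mtime FROM file_meta WHERE path = ?", (path,)
    ).fetchone()
    return row is None or row[0] != mtime

def remember(conn, path, mtime, summary):
    """Insert or overwrite the cached row for one file."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT OR REPLACE INTO file_meta (path, mtime, summary)"
            " VALUES (?, ?, ?)",
            (path, mtime, summary),
        )

remember(conn, "docs/a.txt", 1252612980.0, "first pass")
print(needs_processing(conn, "docs/a.txt", 1252612980.0))  # unchanged file
print(needs_processing(conn, "docs/a.txt", 1252613000.0))  # modified file
print(needs_processing(conn, "docs/b.txt", 1252612980.0))  # never seen
```

In a real script the `mtime` value would probably come from `os.path.getmtime(path)`; the `?` placeholders let `sqlite3` handle quoting of the values.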

4 Comments

Considering the date of that post, note that pysqlite has been available in the standard library as sqlite3 since Python 2.5: docs.python.org/library/sqlite3.html
I think an ORM such as Autumn is a much easier way to work with a database. SQL can be tricky; it's really nice to have an ORM hide the messy details, and just focus on your data and what you want to do with it.
@steveha: That depends on what you are doing, and your understanding of SQL. If your need is simply to store objects in a database, then an ORM is acceptable, but sometimes your needs might not fit the capabilities of an ORM.
Going this route, you'll hack up something that sort of mimics what Django offers.
3

What you're looking for is SQLAlchemy, which is fast becoming the de facto standard Python data access layer. To make your first experiences with SQLAlchemy even easier, check out Elixir, which is a thin ActiveRecord-style wrapper around SQLAlchemy.

Update: Reread the question and saw the bit about not needing a full ORM. I'd still suggest going the SQLAlchemy route, just because it gives you a ridiculously easy way to work with databases in Python that you can reuse for any kind of database. Time spent working directly with SQLite is wasted once you need to connect to Oracle or something.

3 Comments

I have worked with SQLAlchemy and it is pretty nice. However, it has a much larger memory footprint than Autumn, and it has features that an embedded app is unlikely to need. Our project at RealNetworks started out using SQLAlchemy, flirted with APSW, and then switched to Autumn. (I agree that he shouldn't try to work directly with SQLite, but I do think it is unlikely that he will ever try to embed Oracle in an embedded app.)
@steveha, I'm not suggesting he'll need to embed Oracle in a similar small program. I'm suggesting that at some point he will need to connect to a different type of database, and that learning one powerful and versatile way of doing so will make his life easier down the road.
@Kevin: fair enough. And you did clearly say "connect to Oracle" rather than "embed Oracle"... sorry about that.
2

Start with Django

http://www.djangoproject.com/

ORM is the way to go here. You won't regret it. The tutorial here http://docs.djangoproject.com/en/dev/intro/tutorial01/ is fairly gentle.

Why Django/ORM? Django will have you up and running in about half an hour, and will manage your database connections, data management interfaces, etc. Django works with SQLite: you won't need to manage a MySQL/PostgreSQL instance.

EDIT1: You don't need to use the web-app portion of Django for this. You can use the model classes (django.db.models.Model) to manipulate your data directly. Whatever standalone app/script you come up with, you can just use the Django data-model layer. And when you decide you want a web front end, or at least would like to edit your data via the admin console, you can post back here and thank me (or everyone that said use an ORM) :)

2 Comments

Django, even though it is great, is too "heavy" for my needs. Something much smaller footprint is needed.
what makes it "heavy" ? what "footprint" are you talking about ?
1

This is an aggregate of answers, in no particular order:

Everybody is recommending an ORM layer. Which makes perfect sense, if you really need a database. Well, that was sort of requested in the title :-)

  1. SQLAlchemy
  2. Autumn
  3. Django ORM
  4. Use the official SQLite support (pysqlite, i.e. the sqlite3 module)
  5. Storm
  6. Elixir
  7. Just use Python's own Pickle

But I'm starting to think that if an in-memory database is sufficient, and this will be used in scripts only, not a web app or even a desktop GUI, then option 7 is also perfectly valid, provided no transaction support is needed and "database" integrity is not an issue.
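
As a rough sketch of what option 7 could look like for the caching use case: load a pickled dictionary, fall back to an empty one if the file is missing or corrupt (which simply means re-processing the whole tree), and save it again at the end. The helper names `load_cache` and `save_cache`, the file name, and the record layout are assumptions for illustration only.

```python
import os
import pickle
import tempfile

def load_cache(cache_path):
    """Return the cached dict, or {} if the cache is missing or unreadable."""
    try:
        with open(cache_path, "rb") as fh:
            data = pickle.load(fh)
    except Exception:
        # Unpickling damaged data can raise several different exception
        # types, so treat any failure as "no cache available".
        return {}
    return data if isinstance(data, dict) else {}

def save_cache(cache_path, cache):
    """Write to a temporary file, then rename it over the old cache."""
    tmp_path = cache_path + ".tmp"
    with open(tmp_path, "wb") as fh:
        pickle.dump(cache, fh)
    os.replace(tmp_path, cache_path)  # requires Python 3.3 or later

# Demo in a throwaway directory.
cache_file = os.path.join(tempfile.mkdtemp(), "meta.pickle")
cache = load_cache(cache_file)  # no file yet, so this is {}
cache["docs/a.txt"] = {"mtime": 1252612980.0, "summary": "first pass"}
save_cache(cache_file, cache)
print(load_cache(cache_file) == cache)
```

Pickle files should only be loaded from a trusted location, since unpickling can execute arbitrary code. For a cache the script writes itself, that is usually acceptable.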

2 Comments

if you do decide to use SQLite, then SQLiteSpy (yunqa.de/delphi/doku.php/products/sqlitespy/index) would be the GUI front-end. I personally would use SQLAlchemy with Elixir, but I have to admit I never used Autumn (taking a look now)
@Ayman: SQLite can store in memory, and might be faster for some operations (aggregation, for example) than Python's operations on objects, as it stores values, not objects. Should profile and see...
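
A small illustration of the in-memory mode mentioned in the comment above, using the standard `sqlite3` module. The `files` table and its sample rows are invented for the example, and it says nothing about relative speed, which would still need profiling.

```python
import sqlite3

# ":memory:" creates a database that exists only for this connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (ext TEXT, size INTEGER)")
conn.executemany(
    "INSERT INTO files VALUES (?, ?)",
    [(".py", 1200), (".py", 800), (".txt", 300)],
)

# Let SQLite do the aggregation: file count and total size per extension.
totals = conn.execute(
    "SELECT ext, COUNT(*), SUM(size) FROM files GROUP BY ext ORDER BY ext"
).fetchall()
print(totals)  # [('.py', 2, 2000), ('.txt', 1, 300)]
```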
0

Django is perfect for this but the poster is not clear if he needs to actually make a compiled EXE or a web app. Django is only for web apps.

I'm not sure where you really get "heavy" from. Django is grossly smaller in terms of lines of code than any other major web app framework.

2 Comments

@Stephen's answer is pretty close to what I want. I'm not creating a web app, so I don't think I need Django.
@Ayman: you can use Django's ORM stand-alone. It might not meet your small-footprint requirements though.
0

Another option to add to the other good suggestions: Elixir. It provides a simplified declarative layer on top of SQLAlchemy, so it should be easier to dive into, but it also allows you to call upon the full power of SQLAlchemy if and when you need it.

0

There's an easy-to-use Python module that meets all the stated objectives:

http://yserial.sourceforge.net/

Serialization + persistence: in a few lines of code, compress and annotate Python objects into SQLite; then later retrieve them chronologically by keywords without any SQL. Most useful "standard" module for a database to store schema-less data.

Surprisingly, there is not much difference between in-memory and persistent solutions for most practical purposes.

As for "shield me from the actual database", with y_serial, one cannot even tell that SQLite is behind it all. If you construct your records as Python dictionaries, you can concentrate on just writing code (not stored procedures).

0

If you don't want to use an ORM, you can give python-sql a try for building your SQL queries.
