
I'm working with a somewhat large set (~30000 records) of data that my Django app needs to retrieve on a regular basis. This data doesn't really change often (maybe once a month or so), and the changes that are made are done in a batch, so the DB solution I'm trying to arrive at is pretty much read-only.

The total size of this dataset is about 20 MB, and my first thought is that I can load it into memory (possibly as a singleton on an object) and access it very fast that way, though I'm wondering if there are other, more efficient ways of decreasing the fetch time by avoiding disk I/O. Would memcached be the best solution here? Or would loading it into an in-memory SQLite DB be better? Or simply loading it into an in-memory variable on app startup?

2 Answers


The simplest solution, I think, is to load all the objects into memory with

cached_records = Record.objects.all()
list(cached_records)  # list() forces Django to evaluate the QuerySet and cache its results in memory

You are then free to use cached_records throughout your app, and you can still call QuerySet methods such as filter() on it — but note that filter() builds a new QuerySet, so it will trigger a fresh DB query rather than filtering the cached results.

If you will be querying these records under varying conditions, adding a cache layer would be a good idea.
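Since filter() on the cached QuerySet hits the database again, one way to filter purely in memory is plain Python over the materialized list. A minimal sketch, using a plain data class as a stand-in for the Django model (the Record fields and the filter_cached helper here are illustrative, not part of Django):

```python
from dataclasses import dataclass

@dataclass
class Record:
    id: int
    category: str

# Stand-in for list(Record.objects.all()) -- already materialized in memory.
cached_records = [
    Record(1, "a"),
    Record(2, "b"),
    Record(3, "a"),
]

def filter_cached(records, **conditions):
    # Filter in memory, mimicking QuerySet.filter(**conditions) without a DB query.
    return [
        r for r in records
        if all(getattr(r, field) == value for field, value in conditions.items())
    ]

matches = filter_cached(cached_records, category="a")  # records 1 and 3
```

For a read-mostly 20 MB dataset, a linear scan like this is usually fast enough; if not, the list can be pre-indexed into a dict keyed on the fields you query most.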


1 Comment

The problem I'm running into with this method is that list() evaluates when the module is imported. That makes all my tests fail, because the test database doesn't exist yet at import time, so the query crashes. I have this same problem, but this solution is falling short...
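One way around the import-time evaluation the commenter describes is to defer the load until first access, e.g. behind a memoized function. A hedged sketch — get_cached_records and load_records are illustrative names; in the Django app the body would be list(Record.objects.all()):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def get_cached_records():
    # In the Django app this would be: return list(Record.objects.all())
    # Nothing runs at import time; the query fires on the first call only,
    # and lru_cache returns the same list object on every later call.
    return load_records()

def load_records():
    # Hypothetical loader standing in for the real DB query.
    return ["record-1", "record-2"]
```

Because nothing executes at import time, the test database can be created before the first call, and tests no longer crash on module import.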

Is disk I/O really the bottleneck of your application's performance, and does it affect your user experience? If not, I don't think this kind of optimization is necessary.

Operating systems and RDBMSs (e.g. MySQL, PostgreSQL) are really smart nowadays. Data on disk is cached in memory by the RDBMS and the OS automatically.

1 Comment

I am not absolutely sure how much of a bottleneck this will create, but since it's only 20 MB of data, I figured it would be best to load it into memory for quick access. I'll look into those default caching strategies with the DB, though I might go with a cache layer I can control more, such as memcached.
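For a cache layer you control, the interface memcached gives you is essentially get/set with an expiry. A toy in-process sketch of that pattern (the class and method names are illustrative, not the memcached client API):

```python
import time

class SimpleTTLCache:
    """Minimal get/set-with-expiry cache, mimicking the memcached pattern in-process."""

    def __init__(self):
        self._store = {}  # key -> (value, expires_at)

    def set(self, key, value, ttl=300):
        # Store the value along with its absolute expiry time.
        self._store[key] = (value, time.monotonic() + ttl)

    def get(self, key):
        # Return the value if present and unexpired; evict and miss otherwise.
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() > expires_at:
            del self._store[key]
            return None
        return value

cache = SimpleTTLCache()
cache.set("records", ["r1", "r2"], ttl=3600)
```

With a monthly batch update, a long TTL (or an explicit cache flush after each batch load) keeps the cached copy consistent with the DB.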
