I have a huge SQLite3 database with 41 million rows in a single table. However, it takes around 14 seconds to execute a single query. I need to significantly improve the access time! Is this a hard disk hardware limit or a processor limit? If it is a processor limit, then I think I can use the 8 processors I have to parallelize the query. However, I have never found a way to parallelize queries in SQLite from Python. Is there any way to do this? Can I have a coding example? Or are other database programs more efficient at this kind of access? If so, which ones?
1 Answer
Firstly, make sure the relevant indexes are in place to support your queries -- which may or may not be enough on its own...
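As a minimal sketch (the table and column names here are made up, since you haven't shown your schema), this is how you'd add an index from Python's built-in `sqlite3` module and use `EXPLAIN QUERY PLAN` to check whether SQLite actually uses it:

```python
import sqlite3

# Hypothetical schema: a "measurements" table filtered by a "sensor_id" column.
conn = sqlite3.connect("big.db")
cur = conn.cursor()

# One-time cost: build an index on the column used in the WHERE clause.
cur.execute(
    "CREATE INDEX IF NOT EXISTS idx_measurements_sensor "
    "ON measurements(sensor_id)"
)
conn.commit()

# EXPLAIN QUERY PLAN reports whether the query will use the index
# ("SEARCH ... USING INDEX ...") or fall back to a full table scan ("SCAN ...").
for row in cur.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM measurements WHERE sensor_id = ?", (42,)
):
    print(row)

conn.close()
```

If the plan still shows a full scan over 41 million rows, the query itself (e.g. functions applied to the indexed column, or `LIKE '%...'` patterns) is likely preventing index use.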
Other than that, SQLite is meant to be a lightweight embedded SQL database engine (as the name suggests) - 41 million rows is probably pushing it, depending on the number and size of the columns, etc...
You could take your DB and import it into PostgreSQL or MySQL, which are both open-source RDBMSs with Python bindings and extensive feature sets. They'll be able to handle queries, indexing, caching, memory management, etc. on large data effectively. (Or at least, since they're designed for that purpose, more effectively than SQLite, which wasn't...)
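A rough sketch of one way to do the migration from Python, assuming the same hypothetical `measurements` table as above and the `psycopg2` driver for PostgreSQL (connection details and column types are placeholders -- for 41 million rows, a CSV dump plus PostgreSQL's `COPY` would be faster still):

```python
import sqlite3

import psycopg2
import psycopg2.extras

# Source: the existing SQLite file; destination: a local PostgreSQL database.
src = sqlite3.connect("big.db")
dst = psycopg2.connect("dbname=bigdata user=me password=secret host=localhost")

with dst, dst.cursor() as cur:
    # Recreate the table in PostgreSQL (types here are assumptions).
    cur.execute("""
        CREATE TABLE IF NOT EXISTS measurements (
            sensor_id INTEGER,
            ts        TIMESTAMP,
            value     DOUBLE PRECISION
        )
    """)

    # Stream rows out of SQLite and insert them in batches.
    rows = src.execute("SELECT sensor_id, ts, value FROM measurements")
    psycopg2.extras.execute_values(
        cur,
        "INSERT INTO measurements (sensor_id, ts, value) VALUES %s",
        rows,
        page_size=10_000,
    )

    # Index the query column on the PostgreSQL side as well.
    cur.execute(
        "CREATE INDEX IF NOT EXISTS idx_measurements_sensor "
        "ON measurements (sensor_id)"
    )

src.close()
dst.close()
```

After that, your existing queries should run largely unchanged through `psycopg2` (or an ORM such as SQLAlchemy), and PostgreSQL can use multiple cores and a proper buffer cache on the server side without you having to parallelize anything in your own code.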