I'm looking for a SQL-relational-table-like data structure in python, or some hints for implementing one if none already exist. Conceptually, the data structure is a set of objects (any objects), which supports efficient lookups/filtering (possibly using SQL-like indexing).
For example, lets say my objects all have properties A, B, and C, which I need to filter by, hence I define the data should be indexed by them. The objects may contain lots of other members, which are not used for filtering. The data structure should support operations equivalent to SELECT <obj> from <DATASTRUCTURE> where A=100 (same for B and C). It should also be possible to filter by more than one field (where A=100 and B='bar').
The requirements are:
- Should support a large number of items (~200K). The items must be the objects themselves, and not some flattened version of them (which rules out
sqliteand likelypandas). - Insertion should be fast, should avoid reallocation of memory (which pretty much rules out
pandas) - Should support simple filtering (like the example above), which must be more efficient than
O(len(DATA)), i.e. avoid "full table scans".
Does such data structure exist?
Please don't suggest using sqlite. I'd need to repeatedly convert object->row and row->object, which is time consuming and cumbersome since my objects are not necessarily flat-ish.
Also, please don't suggest using pandas because repeated insertions of rows is too slow as it may requires frequent reallocation.