This table started out as short-term storage for meter data before it was validated and added to some long-term storage tables.

It turns out the client wants to keep this data for a long time now that we have saved it, and it is growing fast.

create table metering_meterreading
(
    id              bigserial                not null,  -- primary key
    created_at      timestamp with time zone not null,
    updated_at      timestamp with time zone not null,
    timestamp       timestamp with time zone not null,  -- BTREE index
    value           numeric(15, 3)           not null,
    meter_device_id uuid                     not null,  -- FK to meter_device, BTREE index
    series_id       uuid                     not null,  -- FK to series, BTREE index
    organization_id uuid                     not null   -- FK to org, BTREE index
);

I am planning on dropping the primary key, since (organization_id, meter_device_id, series_id, timestamp) already makes each row unique. The id column was just added by my ORM (Django), and I didn't care when we started.

But since I pretty much always filter on organization, meter_device, and series to get a range of time series data, I am wondering if it would be more efficient to have a multicolumn index on (organization_id, meter_device_id, series_id, timestamp) instead of the separate indexes.

I read somewhere that the column filtered by a range should be the rightmost one in the index.
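
A minimal sketch of the multicolumn index described above, with the three equality columns first and the range column last. The index name, the UUIDs, and the dates are placeholders chosen for illustration, not values from the real schema.

```sql
-- Sketch only: the index name is hypothetical.
create index meterreading_org_device_series_ts_idx
    on metering_meterreading (organization_id, meter_device_id, series_id, timestamp);

-- Typical lookup: equality on the three leading columns, range on the last.
-- The UUIDs and dates are placeholders.
explain
select r.timestamp, r.value
from metering_meterreading as r
where r.organization_id = '00000000-0000-0000-0000-000000000001'
  and r.meter_device_id = '00000000-0000-0000-0000-000000000002'
  and r.series_id       = '00000000-0000-0000-0000-000000000003'
  and r.timestamp >= '2024-01-01 00:00:00+00'
  and r.timestamp <  '2024-02-01 00:00:00+00'
order by r.timestamp;
```

Running the query under explain shows whether the planner actually picks the composite index for this access pattern.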

This is still not a super efficient table for time series data, since it will grow large, but I am planning on fixing that by partitioning on range, or maybe even using Timescale. But before partitioning I would like lookups in it to be as efficient as possible.
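
For reference, range partitioning on the timestamp could look roughly like the sketch below, using PostgreSQL's declarative partitioning. The table and partition names and the monthly boundaries are assumptions. Note that on a partitioned table any primary key or unique constraint has to include the partition key column, which the four-column key does.

```sql
-- Sketch only: table and partition names are hypothetical.
create table metering_meterreading_partitioned
(
    organization_id uuid                     not null,
    meter_device_id uuid                     not null,
    series_id       uuid                     not null,
    timestamp       timestamp with time zone not null,
    value           numeric(15, 3)           not null,
    -- The key includes the partition column, as partitioned tables require.
    primary key (organization_id, meter_device_id, series_id, timestamp)
) partition by range (timestamp);

-- One partition per month; the boundaries are illustrative.
create table metering_meterreading_2024_01
    partition of metering_meterreading_partitioned
    for values from ('2024-01-01 00:00:00+00') to ('2024-02-01 00:00:00+00');
```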

I also saw an example somewhere that used a separate table to identify the metric:

create table metric
(
    id              bigserial primary key,
    organization_id uuid not null,
    meter_device_id uuid not null,
    series_id       uuid not null,
    unique (organization_id, meter_device_id, series_id)
);

create table metering_meterreading
(
    metric_id       bigint                   not null,  -- FK to metric, BTREE index
    timestamp       timestamp with time zone not null,  -- BTREE index
    value           numeric(15, 3)           not null,
    created_at      timestamp with time zone not null,
    updated_at      timestamp with time zone not null
);

But I am not sure if that is actually better than just putting them all in one table. It might reduce the ingestion rate, since another table is now involved.
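
To make the extra ingestion step concrete, here is a rough sketch of writing one reading in the two-table design. It assumes the metric table has the unique constraint shown above and an auto-generated id; all values are placeholders. The no-op update is a common trick to make the statement return the id whether or not the metric row already existed.

```sql
-- Sketch only: all literal values are placeholders.
with m as (
    insert into metric (organization_id, meter_device_id, series_id)
    values ('00000000-0000-0000-0000-000000000001',
            '00000000-0000-0000-0000-000000000002',
            '00000000-0000-0000-0000-000000000003')
    on conflict (organization_id, meter_device_id, series_id)
        -- No-op update so that "returning" yields the existing row's id too.
        do update set organization_id = excluded.organization_id
    returning id
)
insert into metering_meterreading (metric_id, timestamp, value, created_at, updated_at)
select m.id, '2024-01-01 00:15:00+00', 12.345, now(), now()
from m;
```

In practice an ingester would probably cache the metric ids, so the lookup cost would mostly be paid the first time a metric is seen.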

1 Answer

If (org_id, meter_device_id, series_id, timestamp) uniquely determine a table row, you need to use a multi-column primary key over all of them. So you automatically have a 4-column index on these columns. Just make sure that timestamp is last in the list, then that index will support your query ideally.
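
As a sketch, switching to the four-column primary key could look like this. The constraint name metering_meterreading_pkey is an assumption based on PostgreSQL's default naming; check the actual name (for example with \d metering_meterreading in psql) before running anything like it.

```sql
begin;

-- Assumed default constraint name; verify it first.
alter table metering_meterreading
    drop constraint metering_meterreading_pkey;

-- timestamp goes last so the backing index suits equality + range lookups.
alter table metering_meterreading
    add primary key (organization_id, meter_device_id, series_id, timestamp);

commit;
```

Adding the primary key builds a unique index over the whole table, so on a large table this can take a while and blocks writes in the meantime.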

5 Comments

Yeah, that would be nice. But the problem is that the Django ORM doesn't support composite primary keys, so I would probably have to keep the id field as the primary key for now, so that I can still get and update model instances, and add a unique constraint on the filtering fields. If I then use Timescale, this seems to become a problem: I have seen people set the timestamp as the primary key and then manually drop the constraint that Django generates in the migration file. blog.ashtonhudson.com/adding-timescale-to-django.html
Another Django Timescale implementation just drops the constraint on the id field, which is Django's standard pk field, automatically when it creates the migration. github.com/schlunsen/django-timescaledb/blob/main/timescale/db/…
I'd say that you should pick a better ORM. It is sad if you are limited by your tools and have to choose a worse solution because of that.
OK. It is too good to throw out just because of that, and the cost to the project would be insane. So I guess the best thing would be to set up a separate unique constraint on (org_id, meter_device_id, series_id, timestamp), which would give me the same index.
Right. I understand that carrying around an extra index may not be painful enough to reconsider a fundamental tool choice.
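
The compromise reached in the comments, keeping id as the primary key for the ORM and adding a separate unique constraint, might be sketched as follows. The constraint name is hypothetical.

```sql
-- Sketch only: the constraint name is hypothetical.
-- A unique constraint is backed by a unique B-tree index on the same columns,
-- so it serves the equality + range lookups as well as enforcing uniqueness.
alter table metering_meterreading
    add constraint metering_meterreading_reading_uniq
    unique (organization_id, meter_device_id, series_id, timestamp);
```

Because organization_id is the leading column of this index, the separate single-column index on organization_id is likely redundant afterwards; whether the other single-column indexes are still worth keeping depends on the remaining queries.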
