Search efficiency of datetime vs date + indexed smallint (PostgreSQL)

Question

Context

We are storing weather forecasts as multiband rasters (using Postgres with PostGIS). Each forecast is identified by its date and time (either 0:00, 6:00, or 12:00). We are looking at about 20 years of data. Besides analyses conducted for research purposes, for which speed isn't of primary importance, there will also be an online dashboard that will display the latest forecast(s) on a map.

The table could have the following columns: datetime, metadata_id, raster_data

I foresee two main use cases: (1) either getting the most recent forecast, or (2) selecting a single past forecast. The queries could look something like this:

SELECT datetime, metadata_id, raster_data FROM myTable ORDER BY datetime DESC LIMIT 1;
SELECT datetime, metadata_id, raster_data FROM myTable WHERE datetime= '2015-10-12 12:00:00'::timestamp;

Broader Context

The above situation actually applies to 6 different tables that only differ in that their data comes from different sources, and different physical variables are represented. I am therefore considering a single table with the following columns: datetime, dataSource_physVar, metadata_id, raster_data

The different sources do have different raster extents, so I am consider using child tables.

Question 1

Given the limited use cases, is it better to use a single datetime field, or rather a date field with an indexed smallint field for the hour?

Question 2

Given the quantity of data (365*3*20 years = 22k lines), is it even worth worrying about efficiency?

Ste Bov · Accepted Answer · 2015-10-13 13:06:47Z

I'm putting down my recommendations for your two questions and additional considerations for your index.

1

I'd keep the timestamp as one field, breaking it into two separate fields for date and a smallint for the hour will reduce the space by 2 bytes per row (aka 43 MB with your 22k row estimation) having to compare two fields for date /time doesn't seem worth the small space gain.

2

22k rows may not be many in comparison to some databases, if you're picking out individual occurrences though then you still want it as fast as possible. finding a specific value from a heap requires a full table scan which can take time, if it's indexed then you do an index seek which means you can find records a lot faster.

Additional

If the data is broken up to different rows by location (eg, '2015/10/13 00:00', 'London', 'sunny') then I'd advise a composite key between the timestamp and location

I don't believe that is the case, it sounds like the data will be stored in one long line, so (datetime timestamp, London (data), Paris(data)), for this just index the timestamp field, as an index would suffice

Hope that helps

Stack Exchange Network

Search efficiency of datetime vs date + indexed smallint (PostgreSQL)

1 Answer 1

Your Answer

Hot Network Questions

Search efficiency of datetime vs date + indexed smallint (PostgreSQL)

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Related

Hot Network Questions