I have a table with a timestamp field. The table has millions of rows spanning several years, and I need to query it by date.

It was not indexed by date, so I did it like this:

CREATE INDEX dt_crea_day_idx
ON my_table (date(dt_crea at TIME ZONE 'UTC'));

After that, the following query still takes as long as it did before I created the index:

SELECT dt_crea::date as dt_custom, field1, field2
FROM my_table 
WHERE field1='some_value' 
AND    dt_crea::date = '2020-04-23'
ORDER BY dt_custom desc, field2

What can I do to improve the performance of a query like this by date?

Edit: the EXPLAIN ANALYZE output that pifor asked for:

Sort  (cost=9616582.37..9616585.03 rows=1064 width=27) (actual time=290355.874..290355.906 rows=670 loops=1)
  Sort Key: field2
  Sort Method: quicksort  Memory: 77kB
  ->  Seq Scan on my_table  (cost=0.00..9616528.88 rows=1064 width=27) (actual time=72308.452..290355.232 rows=670 loops=1)
        Filter: (((field1)::text = 'some_value'::text) AND ((dt_crea)::date = '2020-04-23'::date))
        Rows Removed by Filter: 255195339
Planning time: 0.086 ms
Execution time: 290355.951 ms
  • What is the exact data type of dt_crea? Did you run ANALYZE my_table? Please post the output of EXPLAIN ANALYZE <your query>. Commented Apr 23, 2020 at 19:02
  • Your WHERE condition must use exactly the same expression as the index, e.g. AND date(dt_crea at TIME ZONE 'UTC') = ... Commented Apr 23, 2020 at 19:06
  • @a_horse_with_no_name that worked! Maybe I misunderstood some concept: I hoped that, having indexed the field as a date, it would not matter how I queried the date. Originally I tried to index ((dt_crea::date)), but it threw an error: ERROR: functions in index expression must be marked IMMUTABLE. Do you think this is the best approach, then: indexing by (date(dt_crea at TIME ZONE 'UTC')) and searching by (date(dt_crea at TIME ZONE 'UTC'))? Commented Apr 24, 2020 at 7:57

2 Answers

3

I would create a regular index on the column:

CREATE INDEX dt_crea_day_idx ON my_table (dt_crea);

That index will be more versatile, but you will need to change your query slightly so that Postgres can make use of it:

SELECT dt_crea::date as dt_custom, field1, field2
FROM my_table 
WHERE field1='some_value' 
  AND dt_crea >= date '2020-04-23' --<< the day you are looking for
  AND dt_crea < date '2020-04-24' --<< next day
ORDER BY dt_custom desc, field2;

That index will also be suitable when looking for other ranges e.g. a specific month (which your index wouldn't support):

WHERE dt_crea >= date '2020-04-01'
  AND dt_crea < date '2020-05-01'
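
One thing to watch, assuming dt_crea is a timestamp with time zone (the question doesn't state the type, though the IMMUTABLE error mentioned in the comments suggests it): a bare date literal is converted using the session's TimeZone setting, so bounds like the ones above mean local midnight, not UTC midnight. If you want the same UTC day that the original expression index was built on, a sketch with explicit offsets could look like this:

```sql
-- Assumption: dt_crea is timestamptz. The +00 offsets pin both bounds to UTC,
-- so the rows returned do not depend on the session's TimeZone setting.
SELECT date(dt_crea AT TIME ZONE 'UTC') AS dt_custom, field1, field2
FROM my_table
WHERE field1 = 'some_value'
  AND dt_crea >= timestamptz '2020-04-23 00:00:00+00'  -- start of the UTC day
  AND dt_crea <  timestamptz '2020-04-24 00:00:00+00'  -- start of the next UTC day
ORDER BY dt_custom DESC, field2;
```

The comparison is still made against the bare column, so a regular index on dt_crea remains usable.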

For that specific query (with an = condition for the column field1) an index on both columns might be better:

-- A different name avoids a clash with an existing dt_crea_day_idx
CREATE INDEX my_table_field1_dt_crea_idx ON my_table (field1, dt_crea);
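
To check whether the planner actually picks the new index, it may help to refresh the statistics and look at the plan again. A minimal sketch, reusing the range query from above:

```sql
-- Refresh planner statistics after creating the index.
ANALYZE my_table;

-- BUFFERS is optional; it adds page read counts to the plan output.
EXPLAIN (ANALYZE, BUFFERS)
SELECT dt_crea::date AS dt_custom, field1, field2
FROM my_table
WHERE field1 = 'some_value'
  AND dt_crea >= date '2020-04-23'
  AND dt_crea <  date '2020-04-24'
ORDER BY dt_custom DESC, field2;
```

If the index is used, the Seq Scan node from the plan in the question should give way to an Index Scan or a Bitmap Index Scan.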

1 Comment

Thanks. I marked the first answer as accepted because it answered my specific question, but I think your answer gives me a better solution to my original problem. Thanks a lot.
1

Your query must match your index, and yours does not. You can't query the index with ::date for the same reason you weren't allowed to create the index with that expression.
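
A small sketch of why the plain cast is rejected, assuming dt_crea is a timestamp with time zone: the cast to date depends on the session's TimeZone setting, so the same stored instant can land on different days. The literal below is made up for illustration.

```sql
SET TimeZone = 'UTC';
SELECT ('2020-04-23 23:30:00+00'::timestamptz)::date;   -- 2020-04-23

SET TimeZone = 'Asia/Tokyo';
SELECT ('2020-04-23 23:30:00+00'::timestamptz)::date;   -- 2020-04-24

-- Naming the zone explicitly removes the dependency on the session setting,
-- which is why this form is accepted in an index expression:
SELECT date('2020-04-23 23:30:00+00'::timestamptz AT TIME ZONE 'UTC');  -- 2020-04-23
```

Since the planner only matches an expression index against the same expression, a filter on dt_crea::date cannot use an index built on date(dt_crea at TIME ZONE 'UTC').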

If you wish to use this strategy, you would need to change your query to match the index, something like:

SELECT dt_crea::date as dt_custom, field1, field2
FROM my_table 
WHERE field1='some_value' 
AND    date(dt_crea at TIME ZONE 'UTC') = '2020-04-23'
ORDER BY dt_custom desc, field2
