
I'm trying to store some measurement data in my PostgreSQL database using Python/Django. So far so good: I've made one Docker container with Django and another with the PostgreSQL server. However, I'm getting close to 2M rows in my measurement table, and queries are starting to get really slow. I'm not sure why, since I'm not running very intense queries.

This query

SELECT ••• FROM "measurement" WHERE "measurement"."device_id" = 26 ORDER BY "measurement"."measure_timestamp" DESC LIMIT 20

for example takes between 3 and 5 seconds to run, depending on which device I query.

I would expect this to run a lot faster, since I'm not doing anything fancy. The measurement table:

id INTEGER
measure_timestamp TIMESTAMP WITH TIME ZONE
sensor_height INTEGER
device_id INTEGER

with indices on id and measure_timestamp. The server doesn't look too busy: even though it only has 512 MB of memory, I have plenty left during queries.

I configured the PostgreSQL server with shared_buffers=256MB and work_mem=128MB. The total database is just under 100 MB, so it should easily fit in memory. When I run the query in pgAdmin, I see a lot of block I/O, so I suspect it has to read from disk, which is obviously slow.

Could anyone give me a few pointers in the right direction for finding the issue?

EDIT: Added the output of EXPLAIN ANALYZE for the query: https://pastebin.com/H30JSuWa. I have now added an index on device_id, which helped a lot, but I would still expect quicker query times.
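For reference, the single-column index mentioned in the edit can be created with DDL like the following (the index name is illustrative; in Django this corresponds to setting db_index=True on the field):

```sql
-- Hypothetical index name; matches the schema shown above.
CREATE INDEX measurement_device_id_idx
    ON measurement (device_id);
```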

  • Run EXPLAIN (ANALYZE, BUFFERS) on the query and add the results to your question. That will help give an answer that is not based only on guesses. Commented Apr 25, 2017 at 9:47
  • Also: add the table definition to your question, including PKs, FKs, and indexes, plus some description of the data, such as cardinality. Commented Apr 25, 2017 at 10:11
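The diagnostic suggested in the first comment can be run like this (the SELECT list is abbreviated in the question, so the columns here are taken from the schema shown above):

```sql
-- ANALYZE executes the query and reports per-node timings;
-- BUFFERS reports shared-buffer hits vs. reads, which shows
-- whether rows came from cache or had to come from disk.
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, measure_timestamp, sensor_height, device_id
  FROM measurement
 WHERE device_id = 26
 ORDER BY measure_timestamp DESC
 LIMIT 20;
```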

2 Answers


Do you have indexes on measure_timestamp and device_id? If your queries always take that form, you might also benefit from a multi-column index.
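A multi-column index matching the query's filter and sort could look like this (index name is illustrative; in Django 1.11+ the equivalent would be an Index(fields=[...]) entry in Meta.indexes):

```sql
-- Equality column first, then the sort column, so the planner can
-- locate the device_id = ? rows already ordered by measure_timestamp.
CREATE INDEX measurement_device_ts_idx
    ON measurement (device_id, measure_timestamp);
```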


4 Comments

Well, I thought I did, but apparently not on device_id, so I fixed that. It helped a lot, but I'm not there yet; I think it can get a lot faster.
Your query does an ORDER BY ... DESC. Would you want to try creating the index with the same ordering? postgresql.org/docs/current/static/indexes-ordering.html
That did the trick, together with the other things mentioned in the answers. I wasn't aware that indexes are "one-way" and that I could create them descending. Query time went from 5 seconds to 22 ms. Thanks!
Good to know that. You could upvote my comment if it helped. Thanks.
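The descending variant discussed in the comments above would be written as follows (note that PostgreSQL can also scan a plain B-tree index backwards, so the composite index itself is likely the main win; the DESC syntax is shown for completeness):

```sql
-- DESC stores entries newest-first, matching ORDER BY ... DESC directly.
CREATE INDEX measurement_device_ts_desc_idx
    ON measurement (device_id, measure_timestamp DESC);
```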

Please look at the distribution key of your table. It is possible that the data is sparsely distributed, which hurts performance. Selecting a proper distribution key is very important when you have 2M records. For more details, read this on why the distribution key is important.

