
I have a simple select query:

SELECT * FROM entities WHERE entity_type_id = 1 ORDER BY entity_id

Then I want to get the first 100 results, so I use this:

SELECT * FROM entities WHERE entity_type_id = 1 ORDER BY entity_id LIMIT 100

The problem is that the second query runs much slower than the first one. The first query takes less than a second to execute; the second takes more than a minute.

These are execution plans for the queries:

without limit:

Sort  (cost=26201.43..26231.42 rows=11994 width=72)
  Sort Key: entity_id
  ->  Index Scan using entity_type_id_idx on entities  (cost=0.00..24895.34 rows=11994 width=72)
        Index Cond: (entity_type_id = 1)

with limit:

Limit  (cost=0.00..8134.39 rows=100 width=72)
  ->  Index Scan using xpkentities on entities  (cost=0.00..975638.85 rows=11994 width=72)
        Filter: (entity_type_id = 1)

I don't understand why these two plans are so different and why the performance decreases so much. How should I tweak the second query to make it work faster?

I use PostgreSQL 9.2.

  • Is entity_id primary key or does it have any index? Commented Nov 13, 2013 at 7:36
  • entity_id is a primary key and there's an index on entity_type_id Commented Nov 13, 2013 at 7:50
  • Then you don't need order by as it is already sorted by entity_id ascending as default. Commented Nov 13, 2013 at 7:52
  • I get different results if I don't use order by. Commented Nov 13, 2013 at 8:01
  • @Kuzgun - don't be ridiculous. Nothing is sorted until you specify an "order by". Commented Nov 13, 2013 at 14:12

1 Answer


You want the 100 smallest entity_id values matching your condition. Now - if those were numbers 1..100 then clearly using the entity_id index is the best way to handle this - everything is pre-sorted. In fact, if the 100 you wanted were in the range 1..200 then it would still make sense. Probably 1..1000 would too.

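So - PostgreSQL thinks it will find lots of entity_type_id=1 values at the "start" of the table. It estimates a cost of 8134 for walking the primary-key index in entity_id order and stopping after 100 matches, vs 26231 to filter by type and then sort. In your case that estimate is wrong.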
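The 8134 figure can be reproduced from the plan itself. As far as I understand the cost model, the planner assumes the matching rows are spread evenly through the primary-key scan, so the Limit node is charged only the fraction of the full scan cost needed to produce 100 of the estimated 11994 rows. A quick check of that arithmetic, using only the numbers from the plans in the question:

```sql
-- Fraction of the full xpkentities scan the planner expects to need:
-- 100 wanted rows out of an estimated 11994 matching rows,
-- applied to the full scan cost of 975638.85 (startup cost is 0.00).
SELECT round(975638.85 * 100 / 11994, 2) AS estimated_limit_cost;
-- estimated_limit_cost = 8134.39, the upper cost shown on the Limit node
```

If the type-1 rows actually sit at the high end of the entity_id range, the scan has to read most of the index before it finds 100 matches, and the real cost is much closer to the 975638 figure than to 8134.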

Now - either there is some correlation which isn't obvious (that's bad - we can't tell the planner about that at present), or we don't have up-to-date or sufficient stats.
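One way to check for such a correlation is to compare where the type-1 rows fall within the overall entity_id range. The sketch below uses only the table and column names from the question. The two workarounds at the end are common techniques rather than anything confirmed for this data set: the index name is hypothetical, and the `+ 0` trick assumes entity_id is a numeric type.

```sql
-- Where do the type-1 rows fall within the overall entity_id range?
SELECT min(entity_id) AS lowest_id,
       max(entity_id) AS highest_id,
       count(*)       AS row_count
FROM entities
WHERE entity_type_id = 1;

-- Compare against the whole table.
SELECT min(entity_id) AS lowest_id,
       max(entity_id) AS highest_id,
       count(*)       AS row_count
FROM entities;

-- Workaround A (assumption: an extra index is acceptable).
-- A composite index lets the planner read type-1 rows already in
-- entity_id order and stop after 100 of them.
CREATE INDEX entities_type_id_entity_id_idx
    ON entities (entity_type_id, entity_id);

-- Workaround B (assumption: entity_id is numeric).
-- Ordering by an expression means the primary-key index no longer
-- provides the requested order, so the planner filters by type first
-- and then sorts.
SELECT *
FROM entities
WHERE entity_type_id = 1
ORDER BY entity_id + 0
LIMIT 100;
```

If the lowest type-1 entity_id is far above the table's minimum, the primary-key scan has to skip a long run of non-matching rows first, which would explain the slow LIMIT query.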

Does running ANALYZE entities make any difference? The planner-stats page in the manual explains how to see which values the planner knows about.
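A minimal sketch of that check, assuming the table and column names from the question. `pg_stats` is the standard view over the collected statistics; the statistics target of 1000 is only an illustrative value, not a recommendation.

```sql
-- Refresh the planner statistics for the table.
ANALYZE entities;

-- Inspect what the planner now knows about the two columns involved.
SELECT attname,
       n_distinct,
       most_common_vals,
       most_common_freqs,
       correlation
FROM pg_stats
WHERE tablename = 'entities'
  AND attname IN ('entity_type_id', 'entity_id');

-- If the row estimate for entity_type_id = 1 still looks off,
-- collect a larger sample for that column and analyze again.
ALTER TABLE entities ALTER COLUMN entity_type_id SET STATISTICS 1000;
ANALYZE entities;

-- Re-check the plan; EXPLAIN ANALYZE runs the query and reports
-- actual row counts and timings next to the estimates.
EXPLAIN ANALYZE
SELECT *
FROM entities
WHERE entity_type_id = 1
ORDER BY entity_id
LIMIT 100;
```

Comparing the estimated and actual row counts in the EXPLAIN ANALYZE output should show whether the planner's guess about the type-1 rows has improved.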
