3

I have a table that contains an 'id' column of type BIGSERIAL. I also have an index for this one column (sort order descending, BTREE, unique).

I often need to retrieve the last 10, 20, 30 entries from a table of millions of entries, like this:

SELECT * FROM table ORDER BY id DESC LIMIT 10

I would have thought it's a pretty clear case: there's an index for this particular field, sort order matches, and I need only 10 entries compared to millions in the whole table, this query definitely uses an index scan.

But it doesn't it does a sequential scan over the whole table.

I try to dig deeper, didn't find anything unusual. The Postgres doc at https://www.postgresql.org/docs/9.6/static/indexes-ordering.html says:

An important special case is ORDER BY in combination with LIMIT n: an explicit sort will have to process all the data to identify the first n rows, but if there is an index matching the ORDER BY, the first n rows can be retrieved directly, without scanning the remainder at all.

But it still doesn't work. Does anybody have any pointers for me? Maybe I'm just not seeing the forrest for the trees anymore... :-(

3
  • 2
    Please edit your question and add the create table statements for the tables in question (including all indexes) and the execution plan generated using explain (analyze, buffers). (not just a simple explain) Formatted text please, no screen shots Commented May 30, 2018 at 6:54
  • Did you try select * from table where id in (select id from table order by id desc limit 10) Commented May 30, 2018 at 6:55
  • @a_horse_with_no_name: I found my problem and posted an answer below. Thanks for the tips with the additional information, I will post those next time I have a question related to his. DB stuff was outside my comfort zone so I wasn't even sure what information to post here :-) Commented May 30, 2018 at 7:17

2 Answers 2

3

Ok, saying it out loud and trying to gather more information to put into my question apparently made me see the forrest again, I found the actual problem. Further down in the doc I mentioned above is this sentence:

An index stored in ascending order with nulls first can satisfy either ORDER BY x ASC NULLS FIRST or ORDER BY x DESC NULLS LAST depending on which direction it is scanned in.

This was the problem. I specified the sort order in the index but I ignored the NULLS FIRST vs. LAST.

Postgres default is NULLS FIRST if you don't mention it explicitly in your query. So what Postgres found was the combination ORDER BY DESC NULLS FIRST which wasn't covered by my index. The combination of both SORT ORDER and NULLS is what matters.

The 2 possible solutions:

  • Either mention NULLS FIRST/LAST accordingly in the query so that it matches the index
  • ...or change the index to NULLS FIRST (which is what I did)

Now Postgres is doing a proper index scan and only touches 10 elements during the query, not all of them.

Sign up to request clarification or add additional context in comments.

2 Comments

I have the same issue, but my serial column has a NOT NULL requirement. You haven't mentioned it, but can you let me know if you have nulls in your serial col? Or does deleting a row inherently leave a null there?
@LAZ I had nulls there. If you have a NOT NULL requirement, this doesn't seem to be the right issue for you. This answer here deals with the situation that you have Nulls in fields that you have an index on. Or in other words: when there are normal, non-null values, it's easy for Postgres to sort and search. It knows that a 5 is larger than a 1 and that a Z is "larger" than an A etc.. But once you throw Nulls into this and you try to compare a Null with, let's say, a number 5, it's not that clear anymore what the result should be. There's different options depending on your data and use case.
-1

If you need to get last 10 entries in table you can use this:

SELECT * 
FROM table 
WHERE id >= (SELECT MAX(id) FROM table) - 10 
ORDER BY id DESC

And similarly for 20 and 30 entries. This looks not so clear, but works fast as long as you have index for 'id' column.

2 Comments

This only works if you do not have gaps in your id values
That's true. SERIAL / BIGSERIAL (which is what I used as the ID column) are strictly sequential, but not necessarily without gaps.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.