My use case: I need to do a text search on one field and then order by another column that is unrelated to the text search, but I can't seem to create an index that handles both.

Create table:

create table file (
id bigint,
path character varying(2048),
peers bigint,
text_search tsvector
);

Some indices to test:

create index idx_file_text_search_1 on file using gin (text_search);
create index idx_file_text_search_2 on file using gin (peers, text_search);
create index idx_file_peers on file using btree (peers desc);

Here is my main query:

explain analyze 
select * 
from file_fast 
where text_search @@ to_tsquery('whatever') 
order by peers desc 
limit 10;

Yet it's only using the peers index:

 Limit  (cost=0.43..20870.27 rows=10 width=316) (actual time=2507.304..9016.220 rows=10 loops=1)
   ->  Index Scan using idx_file_peers on file  (cost=0.43..18286146.09 rows=8762 width=316) (actual time=2507.301..9016.205 rows=10 loops=1)
         Filter: (text_search @@ to_tsquery('ole'::text))
         Rows Removed by Filter: 497504
 Planning time: 0.399 ms
 Execution time: 9016.265 ms
(6 rows)

And when I try it without the order by, it uses the text search index:

-------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=104.15..143.54 rows=10 width=316) (actual time=76.949..76.977 rows=10 loops=1)
   ->  Bitmap Heap Scan on file  (cost=104.15..34612.36 rows=8762 width=316) (actual time=76.946..76.970 rows=10 loops=1)
         Recheck Cond: (text_search @@ to_tsquery('ole'::text))
         Heap Blocks: exact=10
         ->  Bitmap Index Scan on idx_file_text_search_1 (cost=0.00..101.96 rows=8762 width=0) (actual time=76.802..76.802 rows=515 loops=1)
               Index Cond: (text_search @@ to_tsquery('ole'::text))
 Planning time: 0.376 ms
 Execution time: 175.775 ms
(8 rows)

Does Postgres really lack an index that can handle both a text search and a sort on another field?

1 Answer

I don't know whether you can improve the index, but if the second query is the faster one, maybe you can split the query:

with cte as (
    select * 
    from file_fast 
    where text_search @@ to_tsquery('whatever') 
)
SELECT *
FROM cte
order by peers desc 
limit 10;
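
One caveat worth hedging: this trick relies on the CTE being evaluated on its own before the outer sort. As far as I know, PostgreSQL 12 and later may inline a plain CTE like this one into the outer query, which could bring back the original plan. A minimal sketch, assuming PostgreSQL 12 or later and the same table and column names as the query above, that asks for the old behaviour explicitly:

```sql
-- Assumes PostgreSQL 12+, where AS MATERIALIZED is accepted.
-- MATERIALIZED forces the CTE to be computed separately, so the text search
-- runs first and the ORDER BY is applied only to the matching rows.
with cte as materialized (
    select *
    from file_fast
    where text_search @@ to_tsquery('whatever')
)
select *
from cte
order by peers desc
limit 10;
```

Running this under explain analyze should show whether the GIN index is used for the inner query. On versions before 12 the keyword is not accepted, and the plain CTE above already acts as a fence.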

4 Comments

This is faster, but I'm confused as to why. This one does use the correct text search index and then does an in-memory sort (not using my peers index), but it's much faster! Why wouldn't Postgres know to use the text search index first?
It's like an assembly line. The CTE is executed first, so the only index needed is the text_search one. Once you have a result set (smaller than the original table), it is held in memory and the second query is executed on it using an in-memory sort.
Thanks! This still seems like something Postgres should be smart enough to do. As soon as I add the order by, Postgres ignores my GIN index.
I'm not an expert on text search, but my guess is that you can't have both in one ordered index because they aren't related. In a standard composite index like (id1, id2), once you find id1 you just jump to a subtree where id2 is already sorted, so it can be very efficient. For text search you can't get that ordering unless the index stored every combination of words.
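
To make the composite index point concrete, here is a small sketch. The table t, its columns id1, id2 and payload, and the index name are all hypothetical and are not part of the question's schema:

```sql
-- Hypothetical table, for illustration only.
create table t (
    id1 bigint,
    id2 bigint,
    payload text
);

-- B-tree on (id1, id2): entries sharing the same id1 are stored sorted by id2.
create index idx_t_id1_id2 on t using btree (id1, id2);

-- Equality on id1 plus ORDER BY id2 can be served by one index scan that
-- reads rows already in the requested order, so no separate sort step is needed.
select *
from t
where id1 = 42
order by id2
limit 10;
```

A GIN index on a tsvector has no comparable ordering by peers within each matching word, which is why the planner has to choose between the two indexes in the question.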
