4

Let's say we have the following 2 tables:

purchases
-> id
-> classic_id(indexed TEXT)
-> other columns

purchase_items_2(a temporary table)
-> id
-> order_id(indexed TEXT)
-> other columns  

I want to do a SQL join between the 2 tables like so:

Select pi.id, pi.order_id, p.id
from purchase_items_2 pi
INNER JOIN purchases p ON pi.order_id = p.classic.id

This thing should use the indexes no? It is not.

Any clue why?

This is the explanation of the query

INNER JOIN purchases ON #{@table_name}.order_id = purchases.classic_id")
                                   QUERY PLAN
---------------------------------------------------------------------------------
Hash Join  (cost=433.80..744.69 rows=5848 width=216)
   Hash Cond: ((purchase_items_2.order_id)::text = (purchases.classic_id)::text)
   ->  Seq Scan on purchase_items_2  (cost=0.00..230.48 rows=5848 width=208)
   ->  Hash (cost=282.80..282.80 rows=12080 width=16)
         ->  Seq Scan on purchases  (cost=0.00..282.80 rows=12080 width=16)
(5 rows)

When I do a where query

Select pi.id
from purchase_items_2 pi
where pi.order_id = 'gigel'

It uses the index

                                            QUERY PLAN
--------------------------------------------------------------------------------------------------
Bitmap Heap Scan on purchase_items_2  (cost=4.51..80.78 rows=29 width=208)
   Recheck Cond: ((order_id)::text = 'gigel'::text)
   ->  Bitmap Index Scan on index_purchase_items_2_on_order_id  (cost=0.00..4.50 rows=29 width=0)
       Index Cond: ((order_id)::text = 'gigel'::text)\n(4 rows)
3
  • Perhaps the optimizer expects more or less all the records from products to be matched. What version are you using? Commented Jan 16, 2020 at 17:52
  • The version is 12 Commented Jan 16, 2020 at 18:56
  • Here is some information on the query planner: postgresql.org/docs/12/planner-optimizer.html . Commented Jan 17, 2020 at 8:34

1 Answer 1

7

Since you have no WHERE condition, the query has to read all rows of both tables anyway. And since the hash table built by the hash join fits in work_mem, a hash join (that has to perform a sequential scan on both tables) is the most efficient join strategy.

PostgreSQL doesn't use the indexes because it is faster without them in this specific query.

Sign up to request clarification or add additional context in comments.

1 Comment

I understand, thought of that but wasn't finding anything to confirm it to me. Awesome! Thx!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.