2

I'm testing how join works with hash index in PostgreSQL 16.2. Here is a test table. Just 2 columns with numbers in text format.

create table join_test (
    pk varchar(20), 
    fk varchar(20));

insert into join_test(pk, fk) 
select s::varchar(20), 
       (10000001 - s)::varchar(20) 
from generate_series(1, 10000000) as s;

Then I do simple join.

explain analyze select * 
from join_test t1 
join join_test t2 on t1.pk = t2.fk;
Hash Join  (cost=327879.85..878413.95 rows=9999860 width=28) (actual time=5181.056..18337.596 rows=10000000 loops=1)
  Hash Cond: ((t1.pk)::text = (t2.fk)::text)
  ->  Seq Scan on join_test t1  (cost=0.00..154053.60 rows=9999860 width=14) (actual time=0.070..1643.618 rows=10000000 loops=1)
  ->  Hash  (cost=154053.60..154053.60 rows=9999860 width=14) (actual time=5147.801..5147.803 rows=10000000 loops=1)
        Buckets: 262144  Batches: 128  Memory Usage: 5691kB
        ->  Seq Scan on join_test t2  (cost=0.00..154053.60 rows=9999860 width=14) (actual time=0.024..2163.714 rows=10000000 loops=1)
Planning Time: 0.172 ms
Execution Time: 18718.586 ms

No surprises here, without indexes there is a Hash Join with hash table construction on the fly.

Then I add a hash index on "pk" column. And do same join again.

Nested Loop  (cost=0.00..776349.75 rows=9999860 width=28) (actual time=0.107..85991.520 rows=10000000 loops=1)
  ->  Seq Scan on join_test t2  (cost=0.00..154053.60 rows=9999860 width=14) (actual time=0.062..1399.400 rows=10000000 loops=1)
  ->  Index Scan using join_test_pk_idx on join_test t1  (cost=0.00..0.05 rows=1 width=14) (actual time=0.008..0.008 rows=1 loops=10000000)
        Index Cond: ((pk)::text = (t2.fk)::text)
        Rows Removed by Index Recheck: 0
Planning Time: 0.195 ms
Execution Time: 86490.687 ms

As I understand it, in this particular case, Nested Loop is not really a "nested loop" and rather algorithmically the same as Hash Join, except that it uses an already constructed hash table from the index.

  1. In theory, a query with an index should work faster than without. But in reality it works much worse. I wonder why?
  2. Does it even make sense to use hash index for join, or I should always use btree?

1 Answer 1

3

So what did you do to make this happen? In my hands, the nested loop is understand to be far slower than the hash join. Did you do something silly, like set random_page_cost to zero?

As I understand it, in this particular case, Nested Loop is not really a "nested loop" and rather algorithmically the same as Hash Join

You can play endless word games where everything is "really" something else, but this quickly becomes meaningless where nothing means anything.

The hash join uses sequential scans to read the data and builds a private hash table which is designed to spill to disk efficiently if need be. So it is efficient in both IO and locking. The nested loop uses a shared disk-base structure in which every access might trigger inefficient random IO (which may or may not be fulfilled by cache) and definitely involves some locking and buffer management overhead. If you really want to know the performance details, you could dig into it with perf, for example.

In my hands once I force everything to stay in main memory, much of the slowness of the nested loop (which is only actually 2x slower for me, though the planner thinks it will be 6 times slower) seems to be due to CPU cache-misses. I think the CPU cache gets some read-ahead benefit, at least on my VM VirtualBox machine, and the hash join is better at achieving this than the disk-based index is. But this might be a quirk of my "hardware". I've found that this type of analysis doesn't transfer well to other hardware. This is based on the perf annotation showing the slowest instruction being something which makes no sense other than as due to a CPU stall.

There is little point in using hash indexes over btree indexes here. And even less point in tricking the planner into thinking the nested loop is actually a good idea for this case.

Sign up to request clarification or add additional context in comments.

4 Comments

I don't think that the question is meaningless. After all, a hash index is just a hash table on disk, so why should it perform so much different from a hash join, particularly if the hash uses several batches. Perhaps the overhead of pinning buffers, or the implementation of hash indexes is not so great.
I didn't do anything special, all the settings are "out of the box". I checked the buffers, and yes, Index Scan in Nested Loop does 5 times more I/O than Hash Join. It seems that with this kind of performance hash index is completely useless for join. Moreover, with both hash index and btree indexes (on both columns), the planner still chooses Nested Loop. Despite the fact that Merge Join works much faster here.
@AntonIvanov When you say "out of the box", which box is that? What is the OS and which installation method? Could you give us the parameters that differ from the compiled-in defaults?
@LaurenzAlbe I don't think the question is meaningless (I did answer it, after all) , just the phrasing. If a nested loop is really a hash join, and a hash join is really a nested loop (after all, the source code will reveal many loops) then everything is everything else and words won't stand still while we try to use them.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.