Merge join in PostgreSQL performs sort on indexed column

Question

I am trying to optimize the following query in PostgreSQL:

    SELECT ci.country_id, ci.ci_id, ci.name
    FROM customer c
        INNER JOIN address a ON c.a_id = a.a_id
        INNER JOIN city ci ON ci.ci_id = a.ci_id

The columns customer.a_id, address.a_id, city.ci_id and address.ci_id all have a btree index. I wanted to use a merge join instead of a hash join, as I read that a hash join does not really use indexes, so I turned off hash joins with SET enable_hashjoin = off.

According to the query plan, my query now uses a merge join, but it always performs a quicksort before the merge join. I know that the inputs to a merge join need to be sorted on the join columns, but they should already be sorted through the index. Is there a way to force Postgres to use the index and not perform the sort?

Picture of the query plan

Why do you think the merge join would be more efficient? You are reading all rows from all tables, which isn't a situation where an index would help.
– user330315, Commented Nov 1, 2022 at 17:53
Could you please share the results from explain (analyze, verbose, buffers, costs) for this query?
– Frank Heikens, Commented Nov 1, 2022 at 17:53
The tables are all fairly small, so I was wondering whether building the hash tables for the hash join takes more time than a merge join would. Even if the merge join is not faster, I am still interested in why it is not using the sorted indexes. I have added a picture of the query plan to the question.
– user20390376, Commented Nov 1, 2022 at 18:07
Execution plans are better shared as formatted text. To make sure you preserve the indentation of the plan, edit your question, paste the text, then put ``` on the line before the plan and on the line after the plan. Please share the plan using the merge join and the plan using the hash join.
– user330315, Commented Nov 1, 2022 at 18:38
The query plan is hard to read, but what I do see is the time spent: about 2 milliseconds. What kind of performance are you looking for when 2 milliseconds is already an issue for you?
– Frank Heikens, Commented Nov 1, 2022 at 19:00

jjanes · Accepted Answer · 2022-11-01 23:16:02Z

You are joining three tables. It is using two merge joins to do that, with the output of one merge join being one input of the other. The intermediate table is joined using two different columns, but it can't be ordered on two different columns simultaneously, so if you are only going to use merge joins, you need at least one sort.
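
One way to check this against an actual plan, offered as a sketch rather than a definitive recipe: run the query from the question under EXPLAIN with hash joins discouraged, and look at the "Sort Key" line of each Sort node. The first merge join emits rows ordered on a_id, while the second needs its input ordered on ci_id, so a sort on one side is expected. Which side gets sorted depends on the planner's choices and your data. The table and column names are taken from the question; the RESET at the end is an addition that restores the session default.

```sql
-- Session-local setting: this discourages hash joins rather than
-- strictly forbidding them, so the planner falls back to other join types.
SET enable_hashjoin = off;

-- ANALYZE executes the query and reports actual timings;
-- BUFFERS adds buffer usage to each plan node.
EXPLAIN (ANALYZE, BUFFERS)
SELECT ci.country_id, ci.ci_id, ci.name
FROM customer c
    INNER JOIN address a ON c.a_id = a.a_id
    INNER JOIN city ci ON ci.ci_id = a.ci_id;

-- Restore the default so later queries in this session are unaffected.
RESET enable_hashjoin;
```

Running the same EXPLAIN once more after the RESET gives the hash join plan, so the two execution times can be compared directly.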

This whole thing seems pointless, as the query is already very fast, and why do you care if it uses a hash join or not?

