I am trying to optimize the following query in PostgreSQL:

    SELECT ci.country_id, ci.ci_id, ci.name
    FROM customer c
        INNER JOIN address a ON c.a_id = a.a_id
        INNER JOIN city ci ON ci.ci_id = a.ci_id

The columns customer.a_id, address.a_id, city.ci_id and address.ci_id all have a btree index. I wanted to use a merge join instead of a hash join because I read that a hash join does not really use indexes, so I turned off hash joins with SET enable_hashjoin = off.

According to the query plan, my query now uses a merge join, but it always performs a quicksort before the merge join. I know that a merge join needs its inputs sorted on the join columns, but they should already be sorted through the index. Is there a way to force Postgres to use the index and skip the sort?

Picture of the query plan

  • Why do you think the merge join would be more efficient? You are reading all rows from all tables, which isn't a situation where an index would help. Commented Nov 1, 2022 at 17:53
  • Could you please share the results from explain (analyze, verbose, buffers, costs) for this query? Commented Nov 1, 2022 at 17:53
  • None of the tables is very large, so I was wondering whether building the hash tables for the hash join takes more time than using a merge join. Even if the merge join is not faster, I am still interested in why it is not using the sorted indexes. I have added a picture of the query plan to the question. Commented Nov 1, 2022 at 18:07
  • Execution plans are better shared as formatted text. To make sure you preserve the indentation of the plan, edit your question, paste the text, then put ``` on the line before the plan and on a line after the plan. Please share the plan using the merge join and the plan using the hash join. Commented Nov 1, 2022 at 18:38
  • The query plan is hard to read, but what I do see is the time spent: about 2 milliseconds. What kind of performance are you looking for when 2 milliseconds is already an issue for you? Commented Nov 1, 2022 at 19:00

1 Answer

You are joining three tables. The plan uses two merge joins to do that, with the output of one merge join serving as one input of the other. The intermediate result (address, possibly already joined to another table) takes part in both joins, once on a_id and once on ci_id, but it can't be ordered on two different columns simultaneously. So if you are only going to use merge joins, you need at least one sort.
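One way to see this, as a rough sketch rather than a guaranteed outcome, is to plan the first join on its own. With only two tables, both inputs are matched on `a_id`, so each can in principle be read in `a_id` order from its index and no sort is required. The planner may still prefer a sequential scan plus sort, or a nested loop, on small tables. Either way, the rows coming out of a merge join on `a_id` are ordered by `a_id`, while the join to `city` needs them ordered by `ci_id`.

```sql
-- Session-level setting taken from the question; it only affects this connection.
SET enable_hashjoin = off;

-- First join alone: both sides are matched on a_id.
-- a.ci_id is selected because it is the column the next join would need sorted.
EXPLAIN
SELECT a.ci_id
FROM customer c
    INNER JOIN address a ON c.a_id = a.a_id;

-- Restore the default planner behaviour.
RESET enable_hashjoin;
```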

This whole exercise seems pointless, as the query is already very fast. Why do you care whether it uses a hash join or not?
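To check whether the hash join actually costs anything here, both plans can be captured as text, as the comments request. This is a minimal sketch using the query from the question; timings vary between runs and machines, so it is worth running each statement a few times.

```sql
-- Make sure the session starts from the default planner settings.
RESET enable_hashjoin;

-- Plan and actual timings with hash joins allowed.
EXPLAIN (ANALYZE, VERBOSE, BUFFERS, COSTS)
SELECT ci.country_id, ci.ci_id, ci.name
FROM customer c
    INNER JOIN address a ON c.a_id = a.a_id
    INNER JOIN city ci ON ci.ci_id = a.ci_id;

-- Same query with hash joins disabled for this session.
SET enable_hashjoin = off;

EXPLAIN (ANALYZE, VERBOSE, BUFFERS, COSTS)
SELECT ci.country_id, ci.ci_id, ci.name
FROM customer c
    INNER JOIN address a ON c.a_id = a.a_id
    INNER JOIN city ci ON ci.ci_id = a.ci_id;

-- Restore the default planner behaviour.
RESET enable_hashjoin;
```

Comparing the "Execution Time" lines of the two outputs shows which plan is faster for this data set.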
