I have the following SQL Statement:

SELECT p.name
FROM person p
WHERE EXISTS (
    SELECT 1
    FROM task t
    WHERE t.person_id = p.id
)
LIMIT 100

The task table contains millions of rows, but Postgres somehow decides it is smart to execute the inner select first. As a result, the query runs for several minutes.

If I change the SELECT 1 to SELECT COUNT(1), I can trick Postgres into estimating that the inner select is more expensive. This results in the query being completed in less than a second.
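For reference, a sketch of the variant described above, assuming only the select list of the subquery changes. One caveat worth checking before relying on it: an aggregate without GROUP BY always returns exactly one row, so EXISTS over it is true for every person, and the two queries are not equivalent.

```sql
-- Variant with COUNT(1) in the subquery (reconstructed from the description above).
-- Note: COUNT(1) without GROUP BY yields exactly one row even when no task matches,
-- so this EXISTS condition holds for every person, including those without tasks.
SELECT p.name
FROM person p
WHERE EXISTS (
    SELECT COUNT(1)
    FROM task t
    WHERE t.person_id = p.id
)
LIMIT 100;
```

If the result set of this variant includes persons without tasks, the speed-up may come from the filter no longer restricting anything rather than from a better cost estimate.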

How can I optimize the execution plan of Postgres without changing the above query?

LIMIT-ing without ORDER-ing rarely makes sense. Commented Jan 8, 2021 at 19:53

1 Answer

Do you have an index on task(person_id)?

Without such an index, you might find that a join is a better choice for the query.
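A minimal sketch of how to check for such an index and create one if it is missing. The index name `task_person_id_idx` is a hypothetical choice, and the sketch assumes the table lives in a schema on the current search path.

```sql
-- List the indexes that currently exist on the task table.
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'task';

-- Create the index if none covers person_id.
-- "task_person_id_idx" is an illustrative name, not one taken from the question.
CREATE INDEX IF NOT EXISTS task_person_id_idx
    ON task (person_id);
```

On a table with millions of rows, building the index takes a while and blocks writes to the table by default, so it may be worth scheduling it for a quiet period.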


4 Comments

The index will most likely fix the performance. On the other hand the join may return a different result.
Even worse: the JOIN could be a terrible choice.
I do have an index on task(person_id)
@Nibor . . . Interesting. I would expect Postgres to choose a better execution plan then. Perhaps the statistics are not up-to-date.
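To follow up on the point about statistics, a sketch of how they could be refreshed and the resulting plan inspected. It assumes the table names from the question. EXPLAIN with the ANALYZE option actually executes the query, so it can take as long as the slow query itself.

```sql
-- Refresh the planner statistics for both tables.
ANALYZE person;
ANALYZE task;

-- Show the chosen plan together with actual row counts and timings.
-- Caution: the ANALYZE option runs the query for real.
EXPLAIN (ANALYZE, BUFFERS)
SELECT p.name
FROM person p
WHERE EXISTS (
    SELECT 1
    FROM task t
    WHERE t.person_id = p.id
)
LIMIT 100;
```

Comparing the estimated and actual row counts in the output is usually the quickest way to see whether the planner is working from stale or misleading statistics.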
