PostgreSQL query plans when using unnest() in SELECT list vs in subquery

I have two queries that produce equivalent results (though the output of the subquery version is sorted). However, the one that uses a subquery seems to be faster, and I'm not sure why. Could someone explain this? I'm using PostgreSQL 14.

Query 1:

SELECT unnest(articles.tags) AS unnested_tags
FROM articles
WHERE articles.user_id = '81c96625-3761-4cdd-a7eb-c9d752c7ed12'
GROUP BY unnested_tags;

Analyze result:

 HashAggregate  (cost=304859.08..305073.95 rows=17190 width=32) (actual time=340.426..340.570 rows=120 loops=1)
   Group Key: unnest(tags)
   Batches: 1  Memory Usage: 793kB
   ->  ProjectSet  (cost=0.56..302452.08 rows=962800 width=32) (actual time=6.938..212.755 rows=814335 loops=1)
         ->  Index Scan using ix_user_article on articles  (cost=0.56..296915.98 rows=96280 width=116) (actual time=0.021..88.745 rows=101759 loops=1)
               Index Cond: (user_id = '81c96625-3761-4cdd-a7eb-c9d752c7ed12'::uuid)
 Planning Time: 0.290 ms
 JIT:
   Functions: 9
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 0.607 ms, Inlining 0.000 ms, Optimization 0.443 ms, Emission 6.493 ms, Total 7.543 ms
 Execution Time: 341.468 ms

Query 2:

SELECT certain.unnested_tags
FROM  (
   SELECT unnest(articles.tags) AS unnested_tags
   FROM articles
   WHERE articles.user_id = '81c96625-3761-4cdd-a7eb-c9d752c7ed12'
   ) AS certain
GROUP BY certain.unnested_tags;

Analyze result:

 Group  (cost=311705.74..311753.41 rows=200 width=32) (actual time=224.268..235.861 rows=120 loops=1)
   Group Key: (unnest(articles.tags))
   ->  Gather Merge  (cost=311705.74..311752.41 rows=400 width=32) (actual time=224.252..235.754 rows=318 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Sort  (cost=310705.72..310706.22 rows=200 width=32) (actual time=193.068..193.077 rows=106 loops=3)
               Sort Key: (unnest(articles.tags))
               Sort Method: quicksort  Memory: 30kB
               Worker 0:  Sort Method: quicksort  Memory: 30kB
               Worker 1:  Sort Method: quicksort  Memory: 30kB
               ->  Partial HashAggregate  (cost=310696.07..310698.07 rows=200 width=32) (actual time=192.832..192.860 rows=106 loops=3)
                     Group Key: unnest(articles.tags)
                     Batches: 1  Memory Usage: 40kB
                     Worker 0:  Batches: 1  Memory Usage: 40kB
                     Worker 1:  Batches: 1  Memory Usage: 40kB
                     ->  ProjectSet  (cost=0.56..298661.07 rows=401170 width=32) (actual time=10.415..119.371 rows=271445 loops=3)
                           ->  Parallel Index Scan using ix_user_article on articles  (cost=0.56..296354.35 rows=40117 width=116) (actual time=0.038..47.808 rows=33920 loops=3)
                                 Index Cond: (user_id = '81c96625-3761-4cdd-a7eb-c9d752c7ed12'::uuid)
 Planning Time: 0.224 ms
 JIT:
   Functions: 29
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 3.358 ms, Inlining 0.000 ms, Optimization 1.723 ms, Emission 29.476 ms, Total 34.557 ms
 Execution Time: 236.990 ms

It seems that the second query runs its operations in parallel, which I presume is why it is faster, but I don't know why the first query is not parallelized as well.
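One way to test the parallelism hypothesis, as a minimal sketch: disable parallel workers for the current session and re-run Query 2 under EXPLAIN. If the plan then becomes serial and the timing moves close to that of Query 1, parallelism would account for the gap. `max_parallel_workers_per_gather` is a standard PostgreSQL setting; the session setup is an assumption, and no outcome is claimed here.

```sql
-- Sketch only: assumes an interactive session on the same database.
-- The setting below affects the current session only.
SET max_parallel_workers_per_gather = 0;  -- planner may not use Gather / Gather Merge

EXPLAIN (ANALYZE, BUFFERS)
SELECT certain.unnested_tags
FROM  (
   SELECT unnest(articles.tags) AS unnested_tags
   FROM articles
   WHERE articles.user_id = '81c96625-3761-4cdd-a7eb-c9d752c7ed12'
   ) AS certain
GROUP BY certain.unnested_tags;

RESET max_parallel_workers_per_gather;    -- restore the configured value
```

Comparing this output with the two plans above would show whether the remaining difference comes from the parallel workers or from something else, such as the aggregation strategy.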

edited Feb 11, 2024 at 6:59

asked Feb 11, 2024 at 2:44

Jacky Boen

Obviously, the Postgres version in use must be declared with any such question.
– Erwin Brandstetter, Commented Feb 11, 2024 at 2:56

Do you really have a million tags (rows=962800) for the articles of one single user?
– The Impaler, Commented Feb 11, 2024 at 4:06

@ErwinBrandstetter Sorry, I added it now.
– Jacky Boen, Commented Feb 11, 2024 at 7:00

@TheImpaler I actually don’t, I just renamed the tables and indexes hoping it to be easier to think about 😅
– Jacky Boen, Commented Feb 11, 2024 at 7:03

Why is this something other than a triviality? Is this a simplification of some other case where the difference is larger?
– jjanes, Commented Feb 11, 2024 at 18:33
