I have a query that takes several seconds in my staging environment but a few minutes in my production environment. The query is the same, but production has many more rows in each table (roughly 10x).

The goal of the query is to find all distinct software records which are installed on at least one of the assets in the search results (NOTE: the search criteria can vary across dozens of fields).

The tables:

  • Assets: dozens of fields on which a user can search - production has several million records
  • InstalledSoftwares: asset_id (reference), software_id (reference) - each asset has 10-100 installed software records, so there are tens of millions of records in production.
  • Softwares: the results. Production has fewer than 4,000 unique software records. (A rough schema sketch follows this list.)
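
For context, here is a minimal schema sketch matching the description above. The table and column names come from the query below; the data types, keys, and constraints are assumptions, not the actual production DDL:

CREATE TABLE softwares (
    id bigint PRIMARY KEY
    -- plus descriptive columns not relevant to this query
);

CREATE TABLE assets (
    id              bigint PRIMARY KEY,
    assettype_id    integer,
    archive_number  text,
    expired         boolean,
    local_status_id integer,
    scorable        boolean
    -- plus dozens of other searchable columns
);

CREATE TABLE installed_softwares (
    asset_id    bigint REFERENCES assets (id),
    software_id bigint REFERENCES softwares (id)
);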

I've removed duplicate WHERE clauses based on the suggestions from Laurenz Albe. How can I make this more efficient?

The Query:

SELECT DISTINCT "softwares"."id" FROM "softwares" 
INNER JOIN "installed_softwares" ON "installed_softwares"."software_id" = "softwares"."id" 
WHERE "installed_softwares"."asset_id" IN 
  (SELECT "assets"."id" FROM "assets" 
   WHERE "assets"."assettype_id" = 3 
   AND "assets"."archive_number" IS NULL
   AND "assets"."expired" = FALSE 
   AND "assets"."local_status_id" != 4 
   AND "assets"."scorable" = TRUE)

Here is the EXPLAIN (analyze, buffers) for the query:

Unique  (cost=153558.09..153588.92 rows=5524 width=8) (actual time=4224.203..5872.293 rows=3525 loops=1)
  Buffers: shared hit=112428 read=187, temp read=3145 written=3164
  I/O Timings: read=2.916
  ->  Sort  (cost=153558.09..153573.51 rows=6165 width=8) (actual time=4224.200..5065.158 rows=1087807 loops=1)
        Sort Key: softwares.id
        Sort Method: external merge  Disk: 19240kB
        Buffers: shared hit=112428 read=187, temp read=3145 written=3164
        I/O Timings: read=2.916
        ->  Hash Join  (cost=119348.05..153170.01 rows=6165 width=8) (actual time=342.860..3159.458 rows=1087807 loops=1)
              Hash Cond: (installed_softwares.software_id = softwares.id)
              Buffers: shared hit=112428 read=187
              I/O Timings: read=2.916
              ->  Gather  (cost=119119.76..152925.53 rows=6165 width=8) (actual time=333.981..1320.277 rows=1087807 loops=1)
                    Workers Planned: 2
                    Workers Launched: 2
                    Buffers: shared hit=112324 read=187
                    I/O Timings: read=2.916
                    ->  Parallel Hash Join  (cost=118119.76..151309.03 rows=2569 width=8) (actual time=331.836..2010.171 rows=362602 loops=3)
                          Hash Cond: (installed_softwares.asset_id = assets.id)
                          Buffers: shared hit=112324 read=187
                          I/O Timings: read=2.916
                          ->  Parallel Seq Scan on installed_softwares  (cost=0.00..30518.88 rows=1017288 width=16) (actual time=0.007..667.564 rows=813396 loops=3)
                                Buffers: shared hit=20159 read=187
                                I/O Timings: read=2.916
                          ->  Parallel Hash  (cost=118065.04..118065.04 rows=4378 width=4) (actual time=331.648..331.651 rows=23407 loops=3)
                                Buckets: 131072 (originally 16384)  Batches: 1 (originally 1)  Memory Usage: 4672kB
                                Buffers: shared hit=92058
                                ->  Parallel Seq Scan on assets  (cost=0.00..118065.04 rows=4378 width=4) (actual time=0.012..302.134 rows=23407 loops=3)
                                      Filter: ((archive_number IS NULL) AND (NOT expired) AND scorable AND (local_status_id <> 4) AND (assettype_id = 3))
                                      Rows Removed by Filter: 1363624
                                      Buffers: shared hit=92058
              ->  Hash  (cost=159.24..159.24 rows=5524 width=8) (actual time=8.859..8.862 rows=5546 loops=1)
                    Buckets: 8192  Batches: 1  Memory Usage: 281kB
                    Buffers: shared hit=104
                    ->  Seq Scan on softwares  (cost=0.00..159.24 rows=5524 width=8) (actual time=0.006..4.426 rows=5546 loops=1)
                          Buffers: shared hit=104
Planning Time: 0.534 ms
Execution Time: 5878.476 ms

5 Comments
  • VACUUM ANALYZE "assets" and see if anything changes. There is no point in digging deeper until we know that that doesn't fix it. Commented Aug 30, 2022 at 22:05
  • What is the plan on the staging server? Commented Aug 30, 2022 at 22:05
  • The slowest part is the sort on disk. What happens when you change the setting for work_mem to, say, one GB? SET work_mem TO '1GB'; and then run your query. But the biggest problem is the plan itself. Commented Aug 31, 2022 at 6:22
  • running VACUUM ANALYZE assets had no effect on the performance. Also, this plan is for the staging server. Commented Aug 31, 2022 at 14:41
  • Since this is running in AWS on a shared database, it's difficult to adjust the work_mem. I could get it done if there is a general agreement that this would be a huge fix. Commented Aug 31, 2022 at 14:50

1 Answer

The problem is the bad row count estimate for the scan on assets. The estimate is so bad (48 rows instead of 23407) because the conditions are somewhat redundant:

archived_at IS NULL AND archive_number IS NULL AND
NOT expired AND
scorable AND
local_status_id <> 4 AND local_status_id <> 4 AND
assettype_id = 3

PostgreSQL treats all these conditions as statistically independent, which leads it astray. One of the conditions (local_status_id <> 4) is present twice; remove one copy. The first two conditions seem somewhat redundant too; perhaps one of them can be omitted.
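
For illustration, the filter with the redundancy removed might look like this (which of the two archive checks to keep is an assumption; only the asker knows whether archived_at IS NULL and archive_number IS NULL are truly equivalent):

archive_number IS NULL AND
NOT expired AND
scorable AND
local_status_id <> 4 AND
assettype_id = 3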

Perhaps that is enough to improve the estimate so that PostgreSQL does not choose the slow nested loop join.
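
If some of the correlated conditions have to stay, another option (beyond rewriting the WHERE clause) is to give the planner multivariate statistics on those columns so it no longer assumes independence. A sketch, using the column names from the filter above; note that extended statistics mainly help equality-style and IS NULL clauses, so how much they improve the estimate for the <> and boolean tests here is not guaranteed:

-- statistics object name is arbitrary; columns taken from the asset filter
CREATE STATISTICS assets_filter_stats (dependencies, mcv)
    ON assettype_id, archive_number, expired, local_status_id, scorable
    FROM assets;

-- rebuild statistics so the planner can use them
ANALYZE assets;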

2 Comments

This change does result in a 25% performance improvement in staging. I'll update my question to show the new plan.
When making this change in production, the results come back in under 10 seconds where it was originally taking several minutes. This has solved it!
