I'm having issues with the following query in PostgreSQL 10.5.
Using pg_activity while the query runs show that only ~400 MB of RAM is used.
I also noticed that the planner estimates seem to be pretty far off.
This is the query:
explain analyze SELECT
Event.data AS "Event_data", Event_entity_identifiers.identifier AS "Event_rootIdentifier",
expeditions.expedition_code AS "expeditionCode", expeditions.project_id AS "projectId",
Sample.data AS "Sample_data", Sample_entity_identifiers.identifier AS "Sample_rootIdentifier",
fastqMetadata.data AS "fastqMetadata_data", fastqMetadata_entity_identifiers.identifier AS "fastqMetadata_rootIdentifier"
FROM network_1.Event AS Event
JOIN expeditions ON expeditions.id = Event.expedition_id
LEFT JOIN network_1.Sample AS Sample ON Sample.parent_identifier = Event.local_identifier and Sample.expedition_id = Event.expedition_id
LEFT JOIN network_1.Tissue AS Tissue ON Tissue.parent_identifier = Sample.local_identifier and Tissue.expedition_id = Sample.expedition_id
LEFT JOIN network_1.fastqMetadata AS fastqMetadata ON fastqMetadata.parent_identifier = Tissue.local_identifier and fastqMetadata.expedition_id = Tissue.expedition_id
LEFT JOIN entity_identifiers AS Event_entity_identifiers ON Event_entity_identifiers.expedition_id = Event.expedition_id and Event_entity_identifiers.concept_alias = 'Event'
LEFT JOIN entity_identifiers AS Sample_entity_identifiers ON Sample_entity_identifiers.expedition_id = Sample.expedition_id and Sample_entity_identifiers.concept_alias = 'Sample'
LEFT JOIN entity_identifiers AS fastqMetadata_entity_identifiers ON fastqMetadata_entity_identifiers.expedition_id = fastqMetadata.expedition_id and fastqMetadata_entity_identifiers.concept_alias = 'fastqMetadata'
WHERE (expeditions.project_id = 2 AND Event.data ? 'urn:decimalLatitude' AND Event.data ? 'urn:decimalLongitude') AND expeditions.public = true
ORDER BY Event.local_identifier, Event.expedition_id
OFFSET 0 LIMIT 10000;
Explain analyze results:
Limit (cost=3158.61..3158.64 rows=12 width=1658) (actual time=104656.385..104692.387 rows=10000 loops=1)
-> Sort (cost=3158.61..3158.64 rows=12 width=1658) (actual time=104656.383..104682.873 rows=10000 loops=1)
Sort Key: event.local_identifier, event.expedition_id
Sort Method: external merge Disk: 40616kB
-> Nested Loop Left Join (cost=1627.71..3158.39 rows=12 width=1658) (actual time=6.389..104128.516 rows=38826 loops=1)
-> Nested Loop Left Join (cost=20.42..40.63 rows=1 width=625) (actual time=1.329..105.572 rows=5650 loops=1)
-> Nested Loop (cost=20.15..32.26 rows=1 width=607) (actual time=1.300..61.179 rows=5650 loops=1)
-> Bitmap Heap Scan on event (cost=20.00..24.02 rows=1 width=588) (actual time=1.286..17.407 rows=6025 loops=1)
Recheck Cond: ((data ? 'urn:decimalLatitude'::text) AND (data ? 'urn:decimalLongitude'::text))
Heap Blocks: exact=793
-> Bitmap Index Scan on idx_network_1_event_data (cost=0.00..20.00 rows=1 width=0) (actual time=1.168..1.169 rows=6025 loops=1)
Index Cond: ((data ? 'urn:decimalLatitude'::text) AND (data ? 'urn:decimalLongitude'::text))
-> Index Scan using expeditions_pkey on expeditions (cost=0.14..8.16 rows=1 width=23) (actual time=0.004..0.004 rows=1 loops=6025)
Index Cond: (id = event.expedition_id)
Filter: (public AND (project_id = 2))
Rows Removed by Filter: 0
-> Index Scan using entitiy_identifiers_expediton_id_concept_alias_uniq on entity_identifiers event_entity_identifiers (cost=0.28..8.30 rows=1 width=22) (actual time=0.004..0.004 rows=1 loops=5650)
Index Cond: ((expedition_id = event.expedition_id) AND (concept_alias = 'Event'::text))
-> Nested Loop Left Join (cost=1607.29..3117.75 rows=1 width=1060) (actual time=3.372..18.399 rows=7 loops=5650)
-> Nested Loop Left Join (cost=1607.01..3117.36 rows=1 width=1046) (actual time=3.369..18.374 rows=7 loops=5650)
-> Nested Loop Left Join (cost=1606.74..3117.06 rows=1 width=1028) (actual time=3.360..18.316 rows=7 loops=5650)
-> Nested Loop Left Join (cost=1606.46..3109.90 rows=1 width=480) (actual time=3.325..18.079 rows=7 loops=5650)
-> Bitmap Heap Scan on sample (cost=1606.05..2651.62 rows=1 width=476) (actual time=2.389..2.498 rows=7 loops=5650)
Recheck Cond: (expedition_id = event.expedition_id)
Filter: (parent_identifier = event.local_identifier)
Rows Removed by Filter: 430
Heap Blocks: exact=392809
-> Bitmap Index Scan on idx_network_1_sample_local_identifier_expedition_id (cost=0.00..1606.05 rows=312 width=0) (actual time=2.334..2.334 rows=437 loops=5650)
Index Cond: (expedition_id = event.expedition_id)
-> Index Scan using idx_network_1_tissue_local_identifier_expedition_id on tissue (cost=0.41..458.27 rows=1 width=22) (actual time=1.067..2.262 rows=1 loops=38826)
Index Cond: (expedition_id = sample.expedition_id)
Filter: (parent_identifier = sample.local_identifier)
Rows Removed by Filter: 484
-> Index Scan using idx_network_1_fastqmetadata_local_identifier_expedition_id on fastqmetadata (cost=0.28..7.15 rows=1 width=572) (actual time=0.031..0.031 rows=0 loops=38826)
Index Cond: (expedition_id = tissue.expedition_id)
Filter: (parent_identifier = tissue.local_identifier)
Rows Removed by Filter: 3
-> Index Scan using entitiy_identifiers_expediton_id_concept_alias_uniq on entity_identifiers sample_entity_identifiers (cost=0.28..0.30 rows=1 width=22) (actual time=0.005..0.005 rows=1 loops=38826)
Index Cond: ((expedition_id = sample.expedition_id) AND (concept_alias = 'Sample'::text))
-> Index Scan using entitiy_identifiers_expediton_id_concept_alias_uniq on entity_identifiers fastqmetadata_entity_identifiers (cost=0.28..0.39 rows=1 width=22) (actual time=0.001..0.001 rows=0 loops=38826)
Index Cond: ((expedition_id = fastqmetadata.expedition_id) AND (concept_alias = 'fastqMetadata'::text))
Planning time: 6.090 ms
Execution time: 104704.794 ms
Edit:
CREATE TABLE public.expeditions (
id integer NOT NULL,
project_id integer NOT NULL,
expedition_code text NOT NULL,
public boolean DEFAULT true NOT NULL,
);
Indexes:
"expeditions_pkey" PRIMARY KEY, btree (id)
"expeditions_code_project_id_uniq" UNIQUE CONSTRAINT, btree (expedition_code, project_id)
"expeditions_project_id_idx" btree (project_id)
CREATE TABLE public.entity_identifiers (
id integer NOT NULL,
expedition_id integer NOT NULL,
concept_alias text NOT NULL,
identifier text NOT NULL
);
Indexes:
"entitiy_identifiers_expediton_id_concept_alias_uniq" UNIQUE CONSTRAINT, btree (expedition_id, concept_alias)
"entity_identifiers_expedition_id" btree (expedition_id)
CREATE TABLE network_1.event (
id integer NOT NULL,
local_identifier text NOT NULL,
expedition_id integer NOT NULL,
data jsonb NOT NULL,
);
Indexes:
"event_pkey" PRIMARY KEY, btree (id)
"idx_network_1_event_local_identifier_expedition_id" UNIQUE CONSTRAINT, btree (local_identifier, expedition_id)
"idx_network_1_event_data" gin (data)
Foreign-key constraints:
"event_expedition_id_fkey" FOREIGN KEY (expedition_id) REFERENCES expeditions(id) ON DELETE CASCADE
CREATE TABLE network_1.sample (
id integer NOT NULL,
local_identifier text NOT NULL,
expedition_id integer NOT NULL,
data jsonb NOT NULL,
parent_identifier text NOT NULL
);
Indexes:
"sample_pkey" PRIMARY KEY, btree (id)
"idx_network_1_sample_local_identifier_expedition_id" UNIQUE CONSTRAINT, btree (local_identifier, expedition_id)
"idx_network_1_sample_data" gin (data)
Foreign-key constraints:
"network_1_sample_parent_fkey" FOREIGN KEY (parent_identifier, expedition_id) REFERENCES network_1.event(local_identifier, expedition_id) ON DELETE CASCADE
"sample_expedition_id_fkey" FOREIGN KEY (expedition_id) REFERENCES expeditions(id) ON DELETE CASCADE
CREATE TABLE network_1.tissue (
id integer NOT NULL,
local_identifier text NOT NULL,
expedition_id integer NOT NULL,
data jsonb NOT NULL,
parent_identifier text NOT NULL
);
Indexes:
"tissue_pkey" PRIMARY KEY, btree (id)
"idx_network_1_tissue_local_identifier_expedition_id" UNIQUE CONSTRAINT, btree (local_identifier, expedition_id)
"idx_network_1_tissue_data" gin (data)
Foreign-key constraints:
"network_1_tissue_parent_fkey" FOREIGN KEY (parent_identifier, expedition_id) REFERENCES network_1.sample(local_identifier, expedition_id) ON DELETE CASCADE
"tissue_expedition_id_fkey" FOREIGN KEY (expedition_id) REFERENCES expeditions(id) ON DELETE CASCADE
CREATE TABLE network_1.fastqmetadata (
id integer NOT NULL,
local_identifier text NOT NULL,
expedition_id integer NOT NULL,
data jsonb NOT NULL,
parent_identifier text NOT NULL
);
Indexes:
"fastqmetadata_pkey" PRIMARY KEY, btree (id)
"idx_network_1_fastqmetadata_local_identifier_expedition_id" UNIQUE CONSTRAINT, btree (local_identifier, expedition_id)
"idx_network_1_fastqmetadata_data" gin (data)
Foreign-key constraints:
"fastqmetadata_expedition_id_fkey" FOREIGN KEY (expedition_id) REFERENCES expeditions(id) ON DELETE CASCADE
"network_1_fastqmetadata_parent_fkey" FOREIGN KEY (parent_identifier, expedition_id) REFERENCES network_1.tissue(local_identifier, expedition_id) ON DELETE CASCADE
analyze network_1.Event;change anything? What is your value forwork_mem- increasing that shouldn't hurt. For one to get rid of the disk sort and maybe the nested loops are then changed to more efficient hash joinswork_memis currently set at 4MBset work_mem='32MB';orset work_mem='64MB';and see if that improves anything.64MBchange to an in memory sort, but only speed up the overall time by1s. New sort line isSort Method: top-N heapsort Memory: 24818kB