I have two simple tables, node and node_ip, linked by a foreign key:
CREATE TABLE node_ip (
    id serial NOT NULL,
    node_id int4 NOT NULL,
    ip inet NULL
);
CREATE TABLE node (
    id serial NOT NULL,
    mac macaddr NULL,
    is_local bool,
    CONSTRAINT node_pkey PRIMARY KEY (id)
);
ALTER TABLE node_ip ADD CONSTRAINT node_const
    FOREIGN KEY (node_id) REFERENCES node(id);
and the following indexes:
CREATE INDEX idx_node_ip_1 ON node_ip USING btree (ip);
CREATE INDEX idx_node_1 ON node USING btree (id) WHERE ((NOT is_local) AND ((mac)::text !~~ '02:00:00%'::text));
I am trying to optimize the following query:
select * from node_ip
where ip = '192.168.1.6'
  and node_id in (select id from node
                  where is_local = false
                    and mac::text not like '02:00:00%');
However, this is the best plan I can get:
Gather  (cost=1352.74..29923.00 rows=13921 width=46) (actual time=1.905..32.612 rows=14656 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  ->  Nested Loop Semi Join  (cost=352.74..27530.90 rows=5800 width=46) (actual time=0.694..20.534 rows=4885 loops=3)
        ->  Parallel Bitmap Heap Scan on node_ip  (cost=352.32..22986.04 rows=5800 width=46) (actual time=0.638..3.547 rows=4892 loops=3)
              Recheck Cond: (ip = '192.168.1.6'::inet)
              Heap Blocks: exact=491
              ->  Bitmap Index Scan on idx_node_ip_1  (cost=0.00..348.84 rows=13921 width=0) (actual time=1.381..1.381 rows=14676 loops=1)
                    Index Cond: (ip = '192.168.1.6'::inet)
        ->  Index Only Scan using idx_node_1 on node  (cost=0.42..0.77 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=14676)
              Index Cond: (id = node_ip.node_id)
              Heap Fetches: 4328
Planning Time: 0.616 ms
Execution Time: 33.310 ms
Info about the tables:
select count(*) from node ; -- 500000
select count(*) from node_ip; -- 2500000
select count(*) from node where is_local = false and mac::text not like '02:00:00%'; -- 300000
From the plan, it looks like most of the time is spent in the Nested Loop Semi Join. Is there any way to speed this up?
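One detail from the plan above that may be worth testing: the Index Only Scan on idx_node_1 reports Heap Fetches: 4328, so a share of the probes still visited the heap to check row visibility. A possible experiment (not a guaranteed fix) is to vacuum node so its visibility map is up to date, then compare the plan again:

```sql
-- Refresh the visibility map and planner statistics for node.
VACUUM (ANALYZE) node;

-- Re-run the original query and compare timings and Heap Fetches.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM node_ip
WHERE ip = '192.168.1.6'
  AND node_id IN (SELECT id FROM node
                  WHERE is_local = false
                    AND mac::text NOT LIKE '02:00:00%');
```

If Heap Fetches drops toward zero, the inner index-only scan is doing less work per probe. Whether that changes the total noticeably depends on the data.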
Related question: what is the best index for the macaddr type when most of my queries filter with LIKE '02:00:00%'?
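For the prefix case, two options could be sketched as follows. Both index names are hypothetical. The first assumes the prefix of interest is always exactly the first three octets, so the built-in trunc(macaddr) function (which zeroes the last three bytes) can replace the text comparison. The second keeps the LIKE form and relies on the text_pattern_ops operator class for prefix matching.

```sql
-- Option 1 (assumes a fixed three-octet prefix): index the truncated address.
CREATE INDEX idx_node_mac_prefix ON node (trunc(mac));

-- Matches the same rows as mac::text LIKE '02:00:00%':
SELECT id, mac
FROM node
WHERE trunc(mac) = '02:00:00:00:00:00'::macaddr;

-- Option 2: expression index on the text form, usable by LIKE 'prefix%'.
CREATE INDEX idx_node_mac_text ON node ((mac::text) text_pattern_ops);

SELECT id, mac
FROM node
WHERE mac::text LIKE '02:00:00%';
```

Neither index is likely to help the NOT LIKE filter in the query above, since that predicate keeps the majority of rows. They target the queries that select the '02:00:00' prefix.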
Note: I am using PostgreSQL 11.
select distinct node_ip.* from node join node_ip on node_ip.node_id = node.id where <your clause>
EXISTS condition instead of IN?
set enable_nestloop TO off?
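The last two suggestions could be tried roughly as below. This is only a sketch using the tables and filters from the question. PostgreSQL often plans IN (subquery) and EXISTS as the same semi join, so the EXISTS form may produce an identical plan. Disabling nested loops is meant as a session-local experiment, not a permanent setting.

```sql
-- EXISTS form of the original IN query.
EXPLAIN (ANALYZE, BUFFERS)
SELECT nip.*
FROM node_ip AS nip
WHERE nip.ip = '192.168.1.6'
  AND EXISTS (
        SELECT 1
        FROM node AS n
        WHERE n.id = nip.node_id
          AND n.is_local = false
          AND n.mac::text NOT LIKE '02:00:00%'
      );

-- Discourage nested loops for this session only, then re-run the EXPLAIN.
SET enable_nestloop TO off;
-- ... run the EXPLAIN (ANALYZE, BUFFERS) statement again here ...
RESET enable_nestloop;
```

Comparing the two EXPLAIN outputs shows whether a hash or merge semi join is actually faster for this data, or whether the planner's nested loop was already the better choice.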