I have two simple tables, node and node_ip, linked by a foreign key:

CREATE TABLE node_ip (
    id serial NOT NULL,
    node_id int4 NOT NULL,
    ip inet NULL
);

CREATE TABLE node (
  id serial NOT NULL,
  mac macaddr NULL,
  is_local bool,
  CONSTRAINT node_pkey PRIMARY KEY (id)
);

ALTER TABLE node_ip ADD CONSTRAINT node_const
   FOREIGN KEY (node_id) REFERENCES node(id);

and the following indexes:

CREATE INDEX idx_node_ip_1 ON node_ip USING btree (ip);
CREATE INDEX idx_node_1    ON node    USING btree (id) WHERE ((NOT is_local) AND ((mac)::text !~~ '02:00:00%'::text));

I am trying to optimize the following query:

select * from node_ip
where ip = '192.168.1.6'
  and node_id in (select id from node
                  where is_local = false
                  and mac::text not like '02:00:00%');

However, this is the best plan I can get:

Gather  (cost=1352.74..29923.00 rows=13921 width=46) (actual time=1.905..32.612 rows=14656 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  ->  Nested Loop Semi Join  (cost=352.74..27530.90 rows=5800 width=46) (actual time=0.694..20.534 rows=4885 loops=3)
        ->  Parallel Bitmap Heap Scan on node_ip  (cost=352.32..22986.04 rows=5800 width=46) (actual time=0.638..3.547 rows=4892 loops=3)
              Recheck Cond: (ip = '192.168.1.6'::inet)
              Heap Blocks: exact=491
              ->  Bitmap Index Scan on idx_node_ip_1  (cost=0.00..348.84 rows=13921 width=0) (actual time=1.381..1.381 rows=14676 loops=1)
                    Index Cond: (ip = '192.168.1.6'::inet)
        ->  Index Only Scan using idx_node_1 on node  (cost=0.42..0.77 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=14676)
              Index Cond: (id = node_ip.node_id)
              Heap Fetches: 4328
Planning Time: 0.616 ms
Execution Time: 33.310 ms

Info about the tables:

select count(*) from node ;   --  500000
select count(*) from node_ip; -- 2500000
select count(*) from node where is_local = false and mac::text not like '02:00:00%'; -- 300000

From the plan, it looks like most of the time is spent in the Nested Loop Semi Join. Is there any way to speed this up?

Related question: what is the best index for the macaddr type, given that most of my queries filter with LIKE '02:00:00%'?

Note: I am using PostgreSQL 11.

  • I don't know enough about Postgres to be sure, but I would test the query using select distinct node_ip.* from node join node_ip on node_ip.node_id = node.id where <your clause> Commented Jun 15, 2020 at 5:30
  • 33 milliseconds doesn't look too bad. How fast do you need that to be? Did you try an EXISTS condition instead of IN? Commented Jun 15, 2020 at 5:51
  • What plan and timing do you get if you first set enable_nestloop TO off? Commented Jun 15, 2020 at 14:00
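
The EXISTS rewrite and the enable_nestloop test suggested in these comments could be tried roughly as follows. This is only a sketch that reuses the table and column names from the question. Since node.id is a non-null primary key, the EXISTS form returns the same rows as the IN form, and the planner may well choose the same semi join for both. enable_nestloop is changed for the current session only, purely to compare plans.

```sql
-- EXISTS variant of the original query
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM node_ip ni
WHERE ni.ip = '192.168.1.6'
  AND EXISTS (
        SELECT 1
        FROM node n
        WHERE n.id = ni.node_id
          AND n.is_local = false
          AND n.mac::text NOT LIKE '02:00:00%'
      );

-- Diagnostic only: see which plan is chosen when nested loops are discouraged
SET enable_nestloop TO off;

EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM node_ip
WHERE ip = '192.168.1.6'
  AND node_id IN (SELECT id
                  FROM node
                  WHERE is_local = false
                    AND mac::text NOT LIKE '02:00:00%');

-- Restore the default planner behaviour for this session
RESET enable_nestloop;
```

Comparing the "actual time" and buffer counts of the resulting plans with the plan in the question shows whether either alternative is worth pursuing.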

1 Answer

You should

VACUUM node;

to get rid of the 4328 heap fetches, which are caused by a visibility map that is not up to date: the index-only scan has to visit the heap for every page that is not marked all-visible. If that helps, as it should, consider lowering autovacuum_vacuum_scale_factor for this table so that it gets vacuumed more often.
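
A minimal sketch of these steps, assuming the table name from the question. The scale factor 0.01 is an illustrative value, not a recommendation; choose it based on how quickly the table changes.

```sql
-- Refresh the visibility map (and planner statistics) for node
VACUUM (ANALYZE) node;

-- Rough check: relallvisible close to relpages means most pages are
-- marked all-visible (both values are estimates kept in pg_class)
SELECT relpages, relallvisible
FROM pg_class
WHERE relname = 'node';

-- Per-table autovacuum tuning; 0.01 is a hypothetical example value
-- (the default autovacuum_vacuum_scale_factor is 0.2)
ALTER TABLE node SET (autovacuum_vacuum_scale_factor = 0.01);
```

After the VACUUM, re-running the query with EXPLAIN (ANALYZE) should show a much lower "Heap Fetches" count on the index-only scan if the visibility map was the cause.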
