Bypass NULL <> NULL for an integer column

Question

I have a table like so in a Postgres DB -

id   dataset_id   entity_id   county_state_id   data
34   31           33413       341               JSONB object
35   31           33413       342               JSONB object
36   31           33413       NULL              JSONB object

I want to insert into or update this table depending on whether a record already exists. I have written the following query to do so -

INSERT INTO entity (id, dataset_id, entity_id, county_state_id, data)
SELECT
    nextval('id_seq'),
    (SELECT id FROM dataset WHERE name = 'Payer'),
    e.id,
    NULL,  -- county_state_id
    jsonb_build_object(
        'a', a,
        'b', b,
        'c', c
    )
FROM entity e
JOIN payer p
    ON p.id = e.id
ON CONFLICT (dataset_id, entity_id, data, county_state_id)
DO NOTHING;

I insert the following input into the table -

id   dataset_id   entity_id   county_state_id   data
37   31           33413       NULL              JSONB object

I would expect the above query not to insert anything, because this record already exists in the table, but it does insert a record. I suspect this happens because NULL <> NULL and I am inserting a NULL into the county_state_id column. That is an integer column, so I cannot insert an empty string into it, and I do not know how to get Postgres to recognize that the record already exists.
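
The behavior can be reproduced in isolation. The following is a minimal sketch, assuming a hypothetical scratch table named demo with an ordinary unique index; the table name, index name, and column types are illustrative and not taken from the real schema.

```sql
-- Hypothetical scratch table; names and types are assumptions for illustration.
CREATE TABLE demo (
    dataset_id      integer NOT NULL,
    entity_id       integer NOT NULL,
    county_state_id integer            -- nullable, as in the question
);

CREATE UNIQUE INDEX demo_uq ON demo (dataset_id, entity_id, county_state_id);

INSERT INTO demo VALUES (31, 33413, NULL);

-- By default a unique index treats NULLs as distinct from each other,
-- so no conflict is detected and the second row is inserted as well.
INSERT INTO demo VALUES (31, 33413, NULL)
ON CONFLICT (dataset_id, entity_id, county_state_id) DO NOTHING;

SELECT count(*) FROM demo;  -- 2
```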

@AdrianKlaver Yeah, but I don't want to insert a zero because there is no county_state_id that is 0.
– Aaron, Commented Dec 22, 2020 at 1:12
Then create a dummy record that has a county_state_id of 0, e.g. 'no_county_state_id'.
– Adrian Klaver, Commented Dec 22, 2020 at 1:18

Gordon Linoff · Accepted Answer · 2020-12-22 01:20:21Z

1

If you want to prevent duplicates, you need a unique index or constraint. For this purpose, you need two of them:

-- handle not-NULL case
alter table t add constraint unqc_entity_4 unique (dataset_id, entity_id, data, county_state_id);

-- handle NULL case; a unique constraint cannot contain an expression, so this one is a unique index
create unique index unqc2_entity_4 on t (dataset_id, entity_id, data, (case when county_state_id is null then -1 else id end));

Happily, ON CONFLICT DO NOTHING applies to every unique constraint and index when no conflict target is specified, so you can phrase the insert as:

INSERT . . .
ON CONFLICT DO NOTHING;

Here is a little db<>fiddle illustrating the concept.
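
For readers without the fiddle at hand, here is a rough end-to-end sketch of the same idea. The table definition and the sample JSON value are assumptions made up for illustration; only the column names come from the question.

```sql
-- Hypothetical stand-in for the question's table; column types are assumed.
CREATE TABLE t (
    id              integer PRIMARY KEY,
    dataset_id      integer NOT NULL,
    entity_id       integer NOT NULL,
    county_state_id integer,
    data            jsonb NOT NULL
);

-- Non-NULL case: an ordinary unique constraint.
ALTER TABLE t ADD CONSTRAINT unqc_entity_4
    UNIQUE (dataset_id, entity_id, data, county_state_id);

-- NULL case: rows with a NULL county_state_id all map to the sentinel -1,
-- while every other row falls back to its own unique id and never collides here.
CREATE UNIQUE INDEX unqc2_entity_4 ON t
    (dataset_id, entity_id, data,
     (CASE WHEN county_state_id IS NULL THEN -1 ELSE id END));

INSERT INTO t VALUES (36, 31, 33413, NULL, '{"a": 1}');

-- Same key with a NULL county_state_id: the expression index reports a
-- conflict, and the untargeted DO NOTHING silently skips the row.
INSERT INTO t VALUES (37, 31, 33413, NULL, '{"a": 1}')
ON CONFLICT DO NOTHING;

SELECT count(*) FROM t;  -- 1
```

The sentinel -1 only lives inside the index expression; the stored column value stays NULL.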

edited Dec 22, 2020 at 1:20

answered Dec 22, 2020 at 1:07


3 Comments

Aaron Over a year ago

I have a unique index; that is why I am able to use the ON CONFLICT clause.

Gordon Linoff Over a year ago

@Aaron . . . NULLs are tricky in this context but there is a pretty simple work-around.

Vladimir Baranov Over a year ago

@GordonLinoff, I prefer to avoid using special values like -1 instead of NULL. Postgres supports filtered / partial indexes, which I think fit here pretty well. See my answer.

Vladimir Baranov · Accepted Answer · 2020-12-22 02:02:17Z

0

It looks like a filtered / partial unique index would be suitable here.

Actually, two indexes.

-- this takes care of non-null duplicates
CREATE UNIQUE INDEX IX_entity_NON_NULL ON entity 
(dataset_id, entity_id, county_state_id);


-- this prevents duplicates when county_state_id IS NULL
CREATE UNIQUE INDEX IX_entity_NULL ON entity 
(dataset_id, entity_id)
WHERE (county_state_id IS NULL);

With this approach you don't need to use some special values, like 0 or -1 instead of NULL values.

It is not clear to me from the question whether the data field should be included in the indexes; include it if necessary.
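
A rough sketch of how an insert could interact with these two indexes follows. The table entity_demo, its index names, and its column types are hypothetical, and the data column is left out for brevity. One point to keep in mind: when ON CONFLICT names a conflict target, only the index inferred from that target is handled, and a violation of any other unique index still raises an error. The sketch therefore shows two forms that should both skip the duplicate NULL row: no conflict target at all, and a target that repeats the partial index's columns and predicate.

```sql
-- Hypothetical scratch copy of the table; column types are assumed.
CREATE TABLE entity_demo (
    id              integer PRIMARY KEY,
    dataset_id      integer NOT NULL,
    entity_id       integer NOT NULL,
    county_state_id integer
);

-- Non-NULL duplicates.
CREATE UNIQUE INDEX ix_entity_demo_non_null ON entity_demo
    (dataset_id, entity_id, county_state_id);

-- Duplicates where county_state_id IS NULL.
CREATE UNIQUE INDEX ix_entity_demo_null ON entity_demo
    (dataset_id, entity_id)
    WHERE county_state_id IS NULL;

INSERT INTO entity_demo VALUES (36, 31, 33413, NULL);

-- Form 1: no conflict target, so a violation of any unique index is skipped.
INSERT INTO entity_demo VALUES (37, 31, 33413, NULL)
ON CONFLICT DO NOTHING;

-- Form 2: explicit target; the column list plus the predicate lets
-- Postgres infer the partial index as the arbiter.
INSERT INTO entity_demo VALUES (38, 31, 33413, NULL)
ON CONFLICT (dataset_id, entity_id) WHERE county_state_id IS NULL
DO NOTHING;

SELECT count(*) FROM entity_demo;  -- 1
```

Form 1 is the simpler choice when the same statement may insert rows with and without a county_state_id, since Form 2 only covers the NULL case.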

edited Dec 22, 2020 at 2:02

answered Dec 22, 2020 at 1:52


2 Comments

Aaron Over a year ago

What would my insert or on conflict clause look like if I did it this way?

Vladimir Baranov Over a year ago

@Aaron, I think that your INSERT statement remains as it is. You would not be able to insert a second row with NULL county_state_id and repeating dataset_id, entity_id. The partial unique index would prevent it.
