10

I'd like to generate a single sql query to mass-insert a series of rows that don't exist on a table. My current setup makes a new query for each record insertion similar to the solution detailed in WHERE NOT EXISTS in PostgreSQL gives syntax error, but I'd like to move this to a single query to optimize performance since my current setup could generate several hundred queries at a time. Right now I'm trying something like the example I've added below:

INSERT INTO users (first_name, last_name, uid) 
SELECT ( 'John', 'Doe', '3sldkjfksjd'), ( 'Jane', 'Doe', 'adslkejkdsjfds')
WHERE NOT EXISTS (
  SELECT * FROM users WHERE uid IN ('3sldkjfksjd', 'adslkejkdsjfds')
)

Postgres returns the following error:

PG::Error: ERROR:  INSERT has more target columns than expressions

The problem is that PostgresQL doesn't seem to want to take a series of values when using SELECT. Conversely, I can make the insertions using VALUES, but I can't then prevent duplicates from being generated using WHERE NOT EXISTS.

http://www.techonthenet.com/postgresql/insert.php suggests in the section EXAMPLE - USING SUB-SELECT that multiple records should be insertable from another referenced table using SELECT, so I'm wondering why I can't seem to pass in a series of values to insert. The values I'm passing are coming from an external API, so I need to generate the values to insert by hand.

2
  • SELECT ( 'John', 'Doe', '3sldkjfksjd'), ( 'Jane', 'Doe', 'adslkejkdsjfds') creates two columns which are "structures" (that have 3 fields each), not two rows with 3 columns. Commented Jul 15, 2014 at 22:24
  • what's the correct syntax here? Sorry I'm a SQL newbie Commented Jul 15, 2014 at 22:26

2 Answers 2

33

Your select is not doing what you think it does.

The most compact version in PostgreSQL would be something like this:

with data(first_name, last_name, uid)  as (
   values
      ( 'John', 'Doe', '3sldkjfksjd'),
      ( 'Jane', 'Doe', 'adslkejkdsjfds')
) 
insert into users (first_name, last_name, uid) 
select d.first_name, d.last_name, d.uid
from data d
where not exists (select 1
                  from users u2
                  where u2.uid = d.uid);

Which is pretty much equivalent to:

insert into users (first_name, last_name, uid) 
select d.first_name, d.last_name, d.uid
from (
   select 'John' as first_name, 'Doe' as last_name, '3sldkjfksjd' as uid
   union all
   select 'Jane', 'Doe', 'adslkejkdsjfds'
) as d
where not exists (select 1
                  from users u2
                  where u2.uid = d.uid);
Sign up to request clarification or add additional context in comments.

5 Comments

so i'm creating a CTE which I will then reference in the query.
@Paul: correct, you need to build a result somehow, and the only way to do that is the values clause or a select with constant values.
with data as (first_name, last_name, uid) as () is actually generating a syntax error. I think with data (first_name, last_name, uid) as ( ) is correct.
Is it possible to get the id of all the newly inserted records as well as the previously existing ones?
@a_horse_with_no_name: Thanks so much, this answer helped a lot because most examples only show inserting a single value via select.
2

a_horse_with_no_name's answer actually has a syntax error, missing a final closing right parens, but other than that is the correct way to do this.

Update: For anyone coming to this with a situation like mine, if you have columns that need to be type cast (for instance timestamps or uuids or jsonb in PG 9.5), you must declare that in the values you pass to the query:

-- insert multiple if not exists
-- where another_column_name is of type uuid, with strings cast as uuids
-- where created_at and updated_at is of type timestamp, with strings cast as timestamps
WITH data (id, some_column_name, another_column_name, created_at, updated_at) AS (
  VALUES
    (<id value>, <some_column_name_value>, 'a5fa7660-8273-4ffd-b832-d94f081a4661'::uuid, '2016-06-13T12:15:27.552-07:00'::timestamp, '2016-06-13T12:15:27.879-07:00'::timestamp),
    (<id value>, <some_column_name_value>, 'b9b17117-1e90-45c5-8f62-d03412d407dd'::uuid, '2016-06-13T12:08:17.683-07:00'::timestamp, '2016-06-13T12:08:17.801-07:00'::timestamp)
)
INSERT INTO table_name (id, some_column_name, another_column_name, created_at, updated_at)
SELECT d.id, d.survey_id, d.arrival_uuid, d.gf_created_at, d.gf_updated_at
FROM data d
WHERE NOT EXISTS (SELECT 1 FROM table_name t WHERE t.id = d.id);

a_horse_with_no_name's answer saved me today on a project, but had to make these tweaks to make it perfect.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.