Update Multiple Rows with Specific Values and Others With Zero

Question

How do I efficiently update multiple rows with particular values for a_id 84, and set all other rows with that same a_id to 0?

products

p_id    a_id    best
111     81      99
222     81      99
666     82      99
222     83      99
111     84      99
222     84      99
333     84      99
111     85      99
222     85      99
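
For anyone who wants to reproduce this locally, a minimal setup sketch follows. The integer column types and the primary key on (p_id, a_id) are assumptions; the question only shows the data.

```sql
-- Assumed schema: the column types and the composite primary key are guesses,
-- not something stated in the question.
CREATE TABLE products (
   p_id int NOT NULL,
   a_id int NOT NULL,
   best int,
   PRIMARY KEY (p_id, a_id)
);

-- Sample rows exactly as listed in the table above.
INSERT INTO products (p_id, a_id, best) VALUES
   (111, 81, 99),
   (222, 81, 99),
   (666, 82, 99),
   (222, 83, 99),
   (111, 84, 99),
   (222, 84, 99),
   (333, 84, 99),
   (111, 85, 99),
   (222, 85, 99);
```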

Right now I'm doing this:

SQL Fiddle

update products as u set
  best = u2.best
from (values
  (111, 84, 1),
  (222, 84, 2)
) as u2(p_id, a_id, best)
where u2.p_id = u.p_id AND u2.a_id = u.a_id
RETURNING u2.p_id, u2.a_id, u2.best

But, as expected, this only updates the rows listed in values. How do I also set best = 0 for the rows with a_id = 84 that are not listed in values?

Meaning the p_id of 333 should have best = 0. I could explicitly include every single p_id but the table is huge.

The values set into best will always be in order from 1 to n, defined by the order of values.
The products table has 1 million rows

Erwin Brandstetter · Accepted Answer · 2019-05-21 02:22:48Z

1

Assuming (p_id, a_id) is the PK - or at least UNIQUE and NOT NULL, this is one way:

UPDATE products AS u
SET    best = COALESCE(u2.best, 0)
FROM   products AS p
LEFT   JOIN ( VALUES
   (111, 84, 1),
   (222, 84, 2)
   ) AS u2(p_id, a_id, best) USING (p_id, a_id)
WHERE  u.a_id = 84
AND    u.a_id = p.a_id
AND    u.p_id = p.p_id 
RETURNING u2.p_id, u2.a_id, u2.best;

The difficulty is that the FROM list of an UPDATE is the equivalent of an INNER JOIN, while you need an OUTER JOIN. This workaround adds the table products to the FROM list (which is normally redundant), to act as left table for the LEFT OUTER JOIN. Then the INNER JOIN from products to products works.
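
To see what the outer join contributes, the same LEFT JOIN can be run as a plain SELECT. This is only an illustrative sketch: the CTE stands in for the real products table with a few of the sample rows, and the column aliases listed_best and new_best are made up for the demonstration.

```sql
-- Stand-in for the real table: a few sample rows, including one with another a_id.
WITH products(p_id, a_id, best) AS (
   VALUES (111, 84, 99), (222, 84, 99), (333, 84, 99), (111, 85, 99)
)
SELECT p.p_id, p.a_id
     , u2.best              AS listed_best  -- NULL when the row is not in the VALUES list
     , COALESCE(u2.best, 0) AS new_best     -- the value the UPDATE would write
FROM   products AS p
LEFT   JOIN (VALUES
   (111, 84, 1),
   (222, 84, 2)
   ) AS u2(p_id, a_id, best) USING (p_id, a_id)
WHERE  p.a_id = 84
ORDER  BY p.p_id;

--  p_id | a_id | listed_best | new_best
--   111 |   84 |           1 |        1
--   222 |   84 |           2 |        2
--   333 |   84 |      (null) |        0
```

The row for p_id 333 survives only because of the LEFT JOIN; with a plain inner join it would drop out and never be set to 0.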

To additionally restrict the update to a_id = 84, add another WHERE condition saying so. That makes a_id = 84 redundant in the VALUES expression, but keep it there anyway: joining on both columns avoids producing extra joined rows that would only be filtered out later, which is cheaper.

If you don't have a PK or any other (combination of) UNIQUE NOT NULL columns, you can fall back to the system column ctid for joining products rows. Example:

Numbering rows consecutively for a number of tables
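
A ctid-based variant of the query above might look like the following sketch. The table name products_nopk is hypothetical and stands for a copy of products without any unique key; the two instances of the table are then joined on ctid instead of on key columns.

```sql
-- Hypothetical key-less copy of the table, with a few sample rows.
CREATE TEMP TABLE products_nopk (p_id int, a_id int, best int);
INSERT INTO products_nopk (p_id, a_id, best) VALUES
   (111, 84, 99), (222, 84, 99), (333, 84, 99), (111, 85, 99);

UPDATE products_nopk AS u
SET    best = COALESCE(u2.best, 0)
FROM   products_nopk AS p
LEFT   JOIN (VALUES
   (111, 84, 1),
   (222, 84, 2)
   ) AS u2(p_id, a_id, best) USING (p_id, a_id)
WHERE  u.a_id = 84
AND    u.ctid = p.ctid   -- self-join on the physical row identifier instead of a key
RETURNING u.p_id, u.a_id, u.best;
```

Keep in mind that ctid is not a stable identifier across statements (it changes when a row is updated), so it is only suitable for a self-join inside a single statement like this one.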

edited May 21, 2019 at 2:22

answered May 20, 2019 at 18:02


8 Comments

Casey B. Over a year ago

Thanks Erwin. Would you personally use your approach or forpas'? I will benchmark but any feedback is welcome. +1

Erwin Brandstetter Over a year ago

You accepted, but I think it's not fully what you want, yet. You only want to update rows with a_id = 84 and not others? Can there be multiple values for a_id in the same update or just the one?

Casey B. Over a year ago

Sorry, I accepted too quickly ;) You're right. I only want to update rows with a_id = 84. No, there will only be 1 value (say 84) per update. Please revise your answer and I'll accept.

Erwin Brandstetter Over a year ago

Now it should do what you are after

Casey B. Over a year ago

Perfect, thank you. Interesting feedback as well +1


forpas · Answer · 2019-05-20 18:06:55Z

1

Remove the condition u2.a_id = u.a_id from the WHERE clause and move it into the assignment as a CASE expression:

update products as u set
  best = case when u2.a_id = u.a_id then u2.best else 0 end
from (values
  (111, 84, 1),
  (222, 84, 2)
) as u2(p_id, a_id, best)
where u2.p_id = u.p_id

answered May 20, 2019 at 18:06


3 Comments

Casey B. Over a year ago

Your answer is also correct, but 4 minutes behind Erwin's. Any argument for why this approach is superior? I'll certainly benchmark. Thanks

forpas Over a year ago

4 mins you say (!) is a big difference. If there is no case of caching the data, then you should definitely use that. But apart from the use of the left join, I don't see any other reason for the performance to differ so much. Even the use of coalesce() would balance the 2 performances, but mine also has the not so efficient conditional assignment.

Casey B. Over a year ago

Yes, 4 minutes is nothing but if it was the only differentiating factor then I'd go with Erwin's answer. However, I need only a_id = 84 to be updated, not the others.
