How do I efficiently update multiple rows with particular values for a_id 84, and then all other rows set to 0 for that same a_id?

products

p_id    a_id     best 
 111      81       99
 222      81       99
 666      82       99
 222      83       99  
 111      84       99       
 222      84       99
 333      84       99
 111      85       99  
 222      85       99       

Right now I'm doing this:

update products as u set
  best = u2.best
from (values
  (111, 84, 1),
  (222, 84, 2)
) as u2(p_id, a_id, best)
where u2.p_id = u.p_id AND u2.a_id = u.a_id
RETURNING u2.p_id, u2.a_id, u2.best

But this only updates the rows within values as expected. How do I also update rows not in values to be 0 with a_id = 84?

Meaning the p_id of 333 should have best = 0. I could explicitly include every single p_id but the table is huge.

  • The values set into best will always be in order from 1 to n, defined by the order of values.
  • The products table has 1 million rows

2 Answers

Assuming (p_id, a_id) is the PK - or at least UNIQUE and NOT NULL, this is one way:

UPDATE products AS u
SET    best = COALESCE(u2.best, 0)
FROM   products AS p
LEFT   JOIN ( VALUES
   (111, 84, 1),
   (222, 84, 2)
   ) AS u2(p_id, a_id, best) USING (p_id, a_id)
WHERE  u.a_id = 84
AND    u.a_id = p.a_id
AND    u.p_id = p.p_id 
RETURNING u2.p_id, u2.a_id, u2.best;

The difficulty is that the FROM list of an UPDATE is the equivalent of an INNER JOIN, while you need an OUTER JOIN. This workaround adds the table products to the FROM list (which is normally redundant), to act as left table for the LEFT OUTER JOIN. Then the INNER JOIN from products to products works.

To additionally restrict the update to a_id = 84, add another WHERE condition saying so. That makes a_id = 84 redundant in the VALUES expression, but keep it there so the join does not produce extra matches that would only be filtered out later. That is cheaper.
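To try this approach on the sample data from the question, a self-contained script along the following lines might help. It is only a sketch: the column types, the temporary table, and the primary key on (p_id, a_id) are assumptions, since the question does not show the table definition.

```sql
-- Assumed table definition; types and the primary key are not given in the question.
CREATE TEMP TABLE products (
   p_id integer NOT NULL,
   a_id integer NOT NULL,
   best integer NOT NULL,
   PRIMARY KEY (p_id, a_id)
);

-- Sample rows copied from the question.
INSERT INTO products (p_id, a_id, best) VALUES
   (111, 81, 99), (222, 81, 99), (666, 82, 99), (222, 83, 99),
   (111, 84, 99), (222, 84, 99), (333, 84, 99),
   (111, 85, 99), (222, 85, 99);

-- Self-join on the key so the VALUES list can be LEFT JOINed.
UPDATE products AS u
SET    best = COALESCE(u2.best, 0)   -- rows without a match in VALUES get 0
FROM   products AS p
LEFT   JOIN ( VALUES
   (111, 84, 1),
   (222, 84, 2)
   ) AS u2(p_id, a_id, best) USING (p_id, a_id)
WHERE  u.a_id = 84
AND    u.a_id = p.a_id
AND    u.p_id = p.p_id;

-- Inspect the whole table: only the a_id = 84 rows should have changed.
SELECT p_id, a_id, best
FROM   products
ORDER  BY a_id, p_id;
```

The final SELECT is there to confirm that rows with other a_id values keep their previous best, which is the requirement stated in the question.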

If you don't have a PK or any other (combination of) UNIQUE NOT NULL columns, you can fall back to the system column ctid for joining products rows.
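A minimal sketch of the ctid variant, assuming the same VALUES list as above: the two key conditions of the self-join are replaced by a comparison of ctid, which identifies the physical location of a row within its table. The RETURNING list here refers to the updated row (u) rather than to u2, which is a choice made for this sketch.

```sql
UPDATE products AS u
SET    best = COALESCE(u2.best, 0)   -- rows without a match in VALUES get 0
FROM   products AS p
LEFT   JOIN ( VALUES
   (111, 84, 1),
   (222, 84, 2)
   ) AS u2(p_id, a_id, best) USING (p_id, a_id)
WHERE  u.a_id = 84
AND    u.ctid = p.ctid               -- self-join on row location instead of a key
RETURNING u.p_id, u.a_id, u.best;
```

Since ctid changes when a row is updated or the table is rewritten, it is suitable for a self-join within a single statement like this one, but not as a long-term row identifier.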

8 Comments

Thanks Erwin. Would you personally use your approach or forpas'? I will benchmark but any feedback is welcome. +1
You accepted, but I think it's not fully what you want, yet. You only want to update rows with a_id = 84 and not others? Can there be multiple values for a_id in the same update or just the one?
Sorry, I accepted too quickly ;) You're right. I only want to update rows with a_id = 84. No, there will only be 1 value (say 84) per update. Please revise your answer and I'll accept.
Now it should do what you are after
Perfect, thank you. Interesting feedback as well +1

Remove the condition u2.a_id = u.a_id from the WHERE clause and put it in the assignment with a CASE expression:

update products as u set
  best = case when u2.a_id = u.a_id then u2.best else 0 end
from (values
  (111, 84, 1),
  (222, 84, 2)
) as u2(p_id, a_id, best)
where u2.p_id = u.p_id

3 Comments

Your answer is also correct, but 4 minutes behind Erwin's. Any argument for why this approach is superior? I'll certainly benchmark. Thanks
4 mins you say (!) is a big difference. If there is no case of caching the data, then you should definitely use that. But apart from the use of the left join, I don't see any other reason for the performance to differ so much. Even the use of coalesce() would balance the 2 performances, but mine also has the not so efficient conditional assignment.
Yes, 4 minutes is nothing but if it was the only differentiating factor then I'd go with Erwin's answer. However, I need only a_id = 84 to be updated, not the others.
