1

I have to do a fairly complicated data import. It involves a number of UPDATEs, each of which currently updates over 3 million rows in a single query. Each query takes about 30-45 seconds (some of them even 4-5 minutes). My question is whether I can speed this up. Where can I read about what kinds of indexes, and on which columns, would improve these updates? I don't need an exact answer, so I'm not showing the tables. I am looking for material to learn from.

1
  • Please post the execution plan for the UPDATE statement (either as formatted code here or as a link to explain.depesz.com). You might also want to read this article: wiki.postgresql.org/wiki/SlowQueryQuestions to find out which information is helpful when posting this kind of question. Commented Jul 8, 2011 at 10:09

2 Answers

8

Two things:

1) Post an EXPLAIN ANALYZE of your UPDATE query.

2) If your UPDATE does not need to be atomic, then you may want to consider breaking apart the number of rows affected by your UPDATE. To minimize the number of "lost rows" due to exceeding the Free Space Map, consider the following approach:

  1. BEGIN
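  2. UPDATE ... with some predicate that limits the number of rows (e.g. WHERE username ilike 'a%';). PostgreSQL's UPDATE has no LIMIT clause, so to cap a batch at N rows, select the keys in a subquery (WHERE id IN (SELECT id ... LIMIT N)).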
  3. COMMIT
  4. VACUUM table_being_updated
  5. Repeat steps 1-4 until all rows are updated.
  6. ANALYZE table_being_updated
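
The steps above can be sketched roughly as follows. This is only an illustration under assumptions: the table `big_table`, its key `id`, the column `col`, and the batch size of 10000 are all hypothetical, and the "not yet updated" test has to be adapted to whatever the real UPDATE does.

```sql
-- Hypothetical names: big_table, id, col. Batch size is arbitrary.
-- The batch is chosen with a subquery because UPDATE has no LIMIT clause.
BEGIN;
UPDATE big_table
SET    col = 'new value'
WHERE  id IN (
    SELECT id
    FROM   big_table
    WHERE  col IS DISTINCT FROM 'new value'  -- rows not yet updated
    ORDER  BY id
    LIMIT  10000
);
COMMIT;

-- VACUUM cannot run inside a transaction block, so it follows the COMMIT.
VACUUM big_table;

-- Repeat the statements above until the UPDATE reports "UPDATE 0", then:
ANALYZE big_table;
```

The loop itself would normally live in the client script that drives the import, since the VACUUM has to be issued outside any transaction.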

I suspect you're updating every row in your table and don't need all rows to become visible with the new value at the end of a single transaction, so breaking the UPDATE up into smaller transactions as described above should be a good approach.

And yes, an INDEX on the relevant columns specified in the UPDATE's predicate will help dramatically, provided the predicate selects only part of the table. Again, post an EXPLAIN ANALYZE if you need further assistance.
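
As a rough sketch of both suggestions, assume a hypothetical table `users` with `username` and `status` columns (none of these names come from the question). The index supports a selective predicate, and because EXPLAIN ANALYZE actually executes the statement, wrapping it in a transaction that is rolled back lets you inspect the plan without keeping the changes.

```sql
-- Hypothetical names: users, username, status, users_username_idx.
CREATE INDEX users_username_idx ON users (username);

BEGIN;
EXPLAIN ANALYZE
UPDATE users
SET    status = 'imported'
WHERE  username = 'alice';
ROLLBACK;  -- EXPLAIN ANALYZE really ran the UPDATE, so undo it
```

If the plan still shows a sequential scan over the whole table, the predicate is probably not selective enough for the index to pay off.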


0

If by "a number of UPDATEs" you mean one UPDATE command for each updated row, then the problem is that all of the target table's indexes are updated and all constraints are checked for each row, one statement at a time. If that is the case, try updating all rows with a single UPDATE instead:

update t
set a = t2.b
from t2
where t.id = t2.id;

If the imported data is in a text file, then load it into a temp table first and update from there. See my answer here
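
A minimal sketch of that temp-table approach might look like the following. The staging table name, its column types, and the file path are assumptions for illustration; `t`, `a`, `id`, and `b` follow the example above.

```sql
-- Hypothetical staging table; column types are guesses.
CREATE TEMP TABLE staging (
    id integer PRIMARY KEY,
    b  text
);

-- Server-side COPY reads a file on the database server.
-- From psql, \copy does the same with a client-side file.
COPY staging (id, b) FROM '/path/to/import.csv' WITH (FORMAT csv, HEADER true);

-- Give the planner statistics for the freshly loaded table.
ANALYZE staging;

-- One set-based UPDATE instead of one statement per row.
UPDATE t
SET    a = s.b
FROM   staging s
WHERE  t.id = s.id;
```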
