1

I have a table with 28 million rows that I want to update. It has around 60 columns and a ID column (primary key) with an index created on it. I created four new columns and I want to populate them with the data from four columns from other table which also has an ID column with an index created on it. Both tables have the same amount of rows and just the primary key and the index on the IDENTI column. The query has been running for 15 hours and since it is high priority work, we are starting to get nervous about it and we don't have so much time to experiment with queries. We have never update a table so big (7 GB), so we are not sure if this amount of time is normal.

This is the query:

UPDATE consolidated
SET IDEDUP2=uni.IDEDUP2
USE21=uni.USE21
USE22=uni.USE22
PESOXX2=uni.PESOXX2
FROM uni_group uni, consolidated con
WHERE con.IDENTI=uni.IDENTI

How can I make it faster? Is it possible? If not, is there a way to check how much longer it is going to take (without killing the process)?

Just as additional information, we have ran before much more complex queries for 3 million row tables (postgis) and It has taken it about 15 hours as well.

2
  • are you sure both table have an index on IDENTI column ? Commented Nov 29, 2018 at 7:58
  • Yes, both of them have pkey (set when I created the tables) and an index created by a regular CREATE INDEX...ON...() query. Commented Nov 29, 2018 at 8:02

1 Answer 1

4

You should not repeat the target table in the FROM clause. Your statement creates a cartesian join of the consolidated table with itself, which is not what you want.

You should use the following:

UPDATE consolidated con
   SET IDEDUP2=uni.IDEDUP2
       USE21=uni.USE21
       USE22=uni.USE22
       PESOXX2=uni.PESOXX2
FROM uni_group uni
WHERE con.IDENTI = uni.IDENTI
Sign up to request clarification or add additional context in comments.

3 Comments

Does this affect the running time by a lot? Just so I know if it is worth killing the process and restart it or just wait at this point
@A.T.: yes I assume this would affect the running time substantially. The bigger the target table is, the worse your unintentional cross join will be
Thank you very much. It worked. I write for anyone with this problem. The solution you proposed reduced the running time from 15 hours (and still not finished) to 10 minutes.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.