
I have a large table (6+ million rows) to which I'd like to add an auto-incrementing integer column, sid. For existing rows, sid should be assigned in ORDER BY inserted_at ASC order: the oldest record by inserted_at would get 1, and the newest would get the total record count. Any tips on how I might approach this?

  • Does the table have a PRIMARY KEY? Commented Feb 14, 2019 at 20:05
  • Yes, in my specific case primary key id is a UUID. Commented Feb 14, 2019 at 20:58

1 Answer


Add a sid column and UPDATE SET ... FROM ... WHERE:

UPDATE test
SET sid = t.rownum
FROM (SELECT id, row_number() OVER (ORDER BY inserted_at ASC) AS rownum
    FROM test) t
WHERE test.id = t.id;

Note that this relies on there being a primary key, id. (If your table did not already have a primary key, you would have to make one first.)
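
One caveat, offered as a suggestion rather than something stated in the question: if several rows share the same inserted_at value, row_number() is free to number them in an arbitrary order. Adding the primary key as a secondary sort key should make the numbering deterministic. A minimal sketch, assuming the same test table and column names used in this answer:

```sql
-- Same UPDATE as above, with id as a tie-breaker so that rows with
-- identical inserted_at values always receive the same relative order.
UPDATE test
SET sid = t.rownum
FROM (
    SELECT id,
           row_number() OVER (ORDER BY inserted_at ASC, id ASC) AS rownum
    FROM test
) t
WHERE test.id = t.id;
```

This works with a UUID primary key as well, since uuid values are orderable; the resulting order among tied rows is stable but otherwise carries no meaning.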


For example,

-- create test table
DROP TABLE IF EXISTS test;
CREATE TABLE test (
    id int PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY
    , foo text
    , inserted_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);
INSERT INTO test (foo, inserted_at) VALUES
('XYZ', '2019-02-14 00:00:00-00')
, ('DEF', '2010-02-14 00:00:00-00')
, ('ABC', '2000-02-14 00:00:00-00');

-- +----+-----+------------------------+
-- | id | foo |      inserted_at       |
-- +----+-----+------------------------+
-- |  1 | XYZ | 2019-02-13 19:00:00-05 |
-- |  2 | DEF | 2010-02-13 19:00:00-05 |
-- |  3 | ABC | 2000-02-13 19:00:00-05 |
-- +----+-----+------------------------+

ALTER TABLE test ADD COLUMN sid INT;

UPDATE test
SET sid = t.rownum
FROM (SELECT id, row_number() OVER (ORDER BY inserted_at ASC) AS rownum
    FROM test) t
WHERE test.id = t.id;

yields

+----+-----+------------------------+-----+
| id | foo |      inserted_at       | sid |
+----+-----+------------------------+-----+
|  3 | ABC | 2000-02-13 19:00:00-05 |   1 |
|  2 | DEF | 2010-02-13 19:00:00-05 |   2 |
|  1 | XYZ | 2019-02-13 19:00:00-05 |   3 |
+----+-----+------------------------+-----+
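
Before adding constraints to a large table, it may be worth checking that the backfill produced what you expect. The query below is only a sanity-check sketch against the example test table; the column aliases are made up for illustration:

```sql
-- Every row should have a sid, the values should be unique,
-- and they should run from 1 to the total row count.
SELECT count(*)            AS total_rows,
       count(sid)          AS rows_with_sid,
       count(DISTINCT sid) AS distinct_sids,
       min(sid)            AS lowest_sid,
       max(sid)            AS highest_sid
FROM test;
```

For the three-row example, total_rows, rows_with_sid, distinct_sids and highest_sid should all be 3, and lowest_sid should be 1.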

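Finally, make sid auto-incrementing. An IDENTITY column (PostgreSQL 10+) is preferable to SERIAL:

ALTER TABLE test ALTER COLUMN sid SET NOT NULL;
-- IDENTITY avoids certain issues that can arise with SERIAL
ALTER TABLE test ALTER COLUMN sid ADD GENERATED BY DEFAULT AS IDENTITY;
-- SERIAL cannot be applied with ALTER COLUMN; the equivalent is a
-- sequence owned by the column and used as its default:
-- CREATE SEQUENCE test_sid_seq OWNED BY test.sid;
-- ALTER TABLE test ALTER COLUMN sid SET DEFAULT nextval('test_sid_seq');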
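
Note that a freshly added identity sequence starts at 1, so newly inserted rows would collide with the sid values already assigned by the UPDATE. One way to avoid this, sketched here under the assumption that the table is test and the column is sid as in the example, is to advance the sequence to the current maximum:

```sql
-- Advance the identity sequence so the next generated value is max(sid) + 1.
-- pg_get_serial_sequence also resolves the sequence behind an identity column.
SELECT setval(pg_get_serial_sequence('test', 'sid'),
              (SELECT max(sid) FROM test));

-- In the three-row example, this row should now receive sid = 4.
INSERT INTO test (foo) VALUES ('NEW');
```

If you already know the next value, ALTER TABLE test ALTER COLUMN sid RESTART WITH 4; should have the same effect, but it takes a literal rather than a subquery. Adding a UNIQUE constraint on sid would also turn any remaining collision into an error instead of a silent duplicate.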

1 Comment

I'm using this to create a new id column. But this results in new records starting at id = 1 creating duplicates in the id-column. How to prevent that?
