
We have about 100 simple changes to make to our DB schema like this:

alter table transactions alter customer_sport_id type bigint;

Before it was int4. Most of the changed columns have one or more indexes.

Each is taking about 30-45 minutes on a powerful dedicated RDS instance (db.r6i.4xlarge) with no other load.

We have to commit after each statement to avoid running out of storage.

The problem is that it's so slow it will take days to make all the changes, and we can't be down that long.

Is there anything we can do to speed these up? E.g.:

  1. dropping indexes then creating them again after? (would this speed it up?)
  2. disabling WAL? Not sure if this is feasible or too risky (e.g. can the DB get corrupted if the migration fails halfway through?)
  3. Creating a new table, then somehow copying all the old data across to the new table (could we do this in SQL, or would it require a stored procedure?), dropping the old table, then creating the sequences and indexes on the new table?

Apparently, we run VACUUM once a week.

Here are the database performance stats for the last hour (you can see from the storage being released that two statements have completed):


2 Answers


If you're altering several columns on the same table, you can do it in one command:

ALTER TABLE foo ALTER col_a TYPE bigint, ALTER col_b TYPE bigint;

If the change requires a table rewrite, it still has to happen, and that also means rebuilding all the indexes on the table, but it will only be done once for the whole ALTER command.

So if you need to change many columns in the same table, it will be much faster.

dropping indexes then creating them again after?

If you issue many ALTER commands that each require a table rebuild, then dropping the indices will avoid rebuilding them after each command. But it is much simpler to just group all your changes in one ALTER command.
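
If you did want to go that route anyway, it would look roughly like this (the index name and the second column are made up for illustration):

DROP INDEX transactions_customer_sport_id_idx;

ALTER TABLE transactions ALTER customer_sport_id TYPE bigint;
ALTER TABLE transactions ALTER some_other_id TYPE bigint;  -- each statement still rewrites the table

CREATE INDEX transactions_customer_sport_id_idx ON transactions (customer_sport_id);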

disabling WAL? Not sure if this is feasible or too risky (e.g. can the DB get corrupted if the migration fails halfway through?)

Dangerous.

Creating a new table, then somehow copying all the old data across to the new table

Yes, you can do CREATE TABLE, then INSERT INTO ... SELECT. For best performance you can create it UNLOGGED, so it won't be crash-proof while you fill it, then switch it to LOGGED before putting the table into production.

If you just want to change column types, this has almost no advantage. However, if the new table is created UNLOGGED, it won't write any WAL while being filled, which could help with your disk-space issue. The main use for this approach is when you also want to process the data with functions or dependent subqueries: an INSERT INTO ... SELECT that applies your processing with functions and JOINs will probably be much faster than updating each row individually.
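
For the simple type-change case, a rough sketch of that approach might look like the following (the column and index names are assumptions, and any defaults, sequences, constraints, foreign keys and grants would still have to be recreated on the new table):

CREATE UNLOGGED TABLE transactions_new (LIKE transactions);        -- copies columns and NOT NULL constraints only
ALTER TABLE transactions_new ALTER customer_sport_id TYPE bigint;  -- change the type while the table is empty

INSERT INTO transactions_new SELECT * FROM transactions;           -- bulk copy; no WAL for the data while UNLOGGED

ALTER TABLE transactions_new SET LOGGED;                           -- crash-safe again (this step does write the table contents to the WAL)

CREATE INDEX transactions_new_customer_sport_id_idx ON transactions_new (customer_sport_id);

BEGIN;
DROP TABLE transactions;
ALTER TABLE transactions_new RENAME TO transactions;
COMMIT;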

  • I moved all the ALTER COLUMN statements into the corresponding ALTER TABLE statement, and the update performance improved by around 200%. Commented Jul 24, 2024 at 12:15
  • Excellent news! Commented Jul 24, 2024 at 19:52

Changing the column data type from INT to BIGINT requires a table rewrite, which is mostly disk I/O, so the only way to speed it up would be to use faster storage.

  • Would disabling WAL help? Not sure if this affects replicas. Commented Jul 17, 2024 at 21:37
