I have the following problem: a table with over 20 million rows.
When I add a new column with a default, PostgreSQL locks the table for over 40 minutes, so my application stops working for that whole time.
So instead of
ALTER TABLE "test" ADD COLUMN "field" boolean DEFAULT True NOT NULL;
I do
ALTER TABLE "test" ADD COLUMN "field" boolean NULL;
ALTER TABLE "test" ALTER COLUMN "field" SET DEFAULT true;
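Neither of these statements rewrites the table, so both finish almost instantly. I run them in a single transaction so there is no window where the column exists without its default (a sketch of the same two statements):

BEGIN;
-- Adding a nullable column without a default is a catalog-only change: no rewrite.
ALTER TABLE "test" ADD COLUMN "field" boolean NULL;
-- Setting the default afterwards affects only newly inserted rows;
-- the 20 million existing rows keep NULL in "field".
ALTER TABLE "test" ALTER COLUMN "field" SET DEFAULT true;
COMMIT;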
After that, every new row gets true by default, so I still need to update the 20 million existing rows. I update them in batches:
WITH cte AS (
    SELECT id AS pk
    FROM "test"
    WHERE "field" IS NULL
    LIMIT 10000
)
UPDATE "test" table_
SET "field" = true
FROM cte
WHERE table_.id = cte.pk;
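I repeat that statement until it updates zero rows. The driver loop looks roughly like this (a sketch: COMMIT inside a DO block needs PostgreSQL 11 or later; on older versions I issue each batch as a separate statement from the application):

DO $$
DECLARE
    rows_updated bigint;
BEGIN
    LOOP
        -- Same batched update as above.
        WITH cte AS (
            SELECT id AS pk
            FROM "test"
            WHERE "field" IS NULL
            LIMIT 10000
        )
        UPDATE "test" table_
        SET "field" = true
        FROM cte
        WHERE table_.id = cte.pk;

        GET DIAGNOSTICS rows_updated = ROW_COUNT;
        EXIT WHEN rows_updated = 0;

        -- Commit each batch separately (PostgreSQL 11+) so row locks are
        -- released and autovacuum can clean up dead row versions as we go.
        COMMIT;
    END LOOP;
END
$$;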
Once no NULL rows remain, I run
ALTER TABLE "test" ALTER COLUMN "field" SET NOT NULL;
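This last step also scans the whole table while holding an exclusive lock. If it ever blocks for too long, an alternative I have seen suggested is a NOT VALID check constraint that is validated separately under a weaker lock (a sketch; the constraint name is made up):

-- Added instantly: existing rows are not checked yet.
ALTER TABLE "test" ADD CONSTRAINT test_field_not_null
    CHECK ("field" IS NOT NULL) NOT VALID;
-- VALIDATE scans the table but does not block normal reads and writes.
ALTER TABLE "test" VALIDATE CONSTRAINT test_field_not_null;

But the NOT NULL step is not my problem right now; the update step is.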
Everything works, but the step where I update the rows is very slow. Can you give me some advice on improving the speed of the update?
Currently it updates 10,000 rows in about 2 minutes. I tried decreasing the batch size to 1,000; each batch finished faster, but overall it works out to about 3.5 minutes per 10,000 rows, which is still too slow. I also tried creating an index before the update, but it did not give better results (as I understand it, the index only starts to pay off once a large part of the table has already been updated).
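For what it's worth, the index I tried was along these lines (a sketch; the index name is made up): a partial index over only the rows that still need updating, built with CONCURRENTLY so it does not block writes while it builds:

CREATE INDEX CONCURRENTLY test_field_null_idx
    ON "test" (id)
    WHERE "field" IS NULL;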