How does data alignment work after altering columns in Postgres?

Question

I've got a table in Postgres whose column order I have optimized to have as little padding as possible, after reading Calculating and saving space in PostgreSQL.

But then I noticed that I could add some extra value by changing the data type of a few of the columns. How would that affect the data packing (and padding), and would the order of the columns change? The altered columns take up more space than before, so something has to change, but how?

So I did an experiment where I took the table below and altered the boolean columns to smallint, with the intent to show how windy or hot it is rather than just whether it is windy or hot.

create table t2(
    ts timestamptz NOT NULL,
    is_hot boolean NOT NULL,
    is_windy boolean NOT NULL,
    humidity smallint NOT NULL,
    wind_direction real NOT NULL
);
-- Fill with data
alter table t2
    alter column is_hot type smallint using CASE WHEN is_hot THEN 1 ELSE 0 END;

Here's a DBfiddle

It turned out that the column order stayed the same, but I was left with some padding, because wind_direction no longer lands on a 4-byte boundary on its own.

What does Postgres do to alter all these rows, given that every row needs to be updated now that it takes up more physical space?

From Dropping column in Postgres on a large dataset I gather that dropped columns become hidden nullable columns until the row is next updated, at which point the column is physically removed.

How is ALTER COLUMN ... TYPE different from creating a new table with the altered definition and inserting all the old (but transformed) data into it myself with SQL?

Are there optimizations that Postgres can make because I'm not explicitly creating a new table and moving the data over?

Laurenz Albe · Accepted Answer · 2024-02-28 11:38:41Z

If you alter the table like that, the whole table gets rewritten.

Before the change, a row would look like this:

                      timestamp  smallint
                           |        |
                           v        v
hhhhhhhhhhhhhhhhhhhhh_|dddddddd|d|d|dd|dddd
                                ^ ^      ^
                                 V       |
                             booleans   real

 h ... header byte (there are 23)
 _ ... padding byte
 d ... data byte
 | ... optical separator (0 bytes)

After the change:

                      timestamp  smallints
                           |       /   |
                           v      /    v
hhhhhhhhhhhhhhhhhhhhh_|dddddddd|dd|d_|dd__|dddd
                                   ^        ^
                                   |        |
                                boolean    real

That's because a smallint must start at an address that is a multiple of two, and a real at an address that is a multiple of four.

So you are ending up with three extra padding bytes.
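
The arithmetic behind the two diagrams can be reproduced with a small model. This is only a sketch in Python, not anything PostgreSQL itself runs: the names TYPE_INFO, align_up and layout are made up for illustration, the sizes and alignments are the usual ones for these four types, and it assumes fixed-width NOT NULL columns, a 23-byte header, and a data area that starts at a multiple of 8. It ignores any alignment applied to the end of the whole tuple.

```python
# Illustrative model of the row layout shown in the diagrams above.
# Assumptions: fixed-width NOT NULL columns only, 23 header bytes,
# data area starting at a multiple of 8. Not PostgreSQL code.
TYPE_INFO = {
    # type name: (size in bytes, required alignment in bytes)
    "timestamptz": (8, 8),
    "real": (4, 4),
    "smallint": (2, 2),
    "boolean": (1, 1),
}

HEADER_BYTES = 23
DATA_START_ALIGN = 8


def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def layout(columns):
    """Place columns in the given physical order.

    columns: list of (name, type_name) pairs.
    Returns (rows, padding, end) where rows holds
    (name, start_offset, padding_before) per column, padding is the
    total number of padding bytes between columns, and end is the
    offset just past the last column.
    """
    offset = align_up(HEADER_BYTES, DATA_START_ALIGN)  # 23 -> 24
    rows = []
    padding = 0
    for name, type_name in columns:
        size, alignment = TYPE_INFO[type_name]
        start = align_up(offset, alignment)
        padding += start - offset
        rows.append((name, start, start - offset))
        offset = start + size
    return rows, padding, offset


before = [
    ("ts", "timestamptz"),
    ("is_hot", "boolean"),
    ("is_windy", "boolean"),
    ("humidity", "smallint"),
    ("wind_direction", "real"),
]

# Same column order, but is_hot is now a smallint.
after = [(n, "smallint" if n == "is_hot" else t) for n, t in before]

for label, cols in (("before", before), ("after", after)):
    rows, padding, end = layout(cols)
    print(label, "padding bytes between columns:", padding, "end offset:", end)
    for name, start, pad in rows:
        print(f"  {name:15} starts at {start:2}, preceded by {pad} padding byte(s)")
```

Under these assumptions the model reports no padding between columns before the change and three bytes after it: one in front of humidity and two in front of wind_direction, matching the diagram.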

4 Comments

Hannes Kindströmmer Over a year ago

The byte alignment is explained in the question I linked: stackoverflow.com/questions/2966524/… The rewrite does answer one of my questions, though. Would you be able to expand on the other two?

Laurenz Albe Over a year ago

You should restrict yourself to one question per question. And if that was not your main question, you should have chosen a different title. A rewrite is about the same as creating a new table and copying the rows yourself, except that you don't have to take care of all the foreign keys etc., because PostgreSQL does it automatically.

Hannes Kindströmmer Over a year ago

While I understand the reasoning behind a single question per question, I found that these questions are so closely related that posting the three of them separately would get them flagged as duplicates.

Laurenz Albe Over a year ago

Then that might be an indication that you need consulting rather than a single, focused Stack Overflow answer.
