Byte size of numeric(precision, scale) in Postgres

Question

Given that the size is variable, I am trying to calculate the average cost of a numeric(p, s) attribute. In my case, numeric(10, 2). From what I understand of the Postgres documentation, this means I have a total of 10 digits where 2 of them are exclusive of the decimal part, hence, the maximum value I can store is 99999999,99.

What I do not understand is this part of the documentation:

The actual storage requirement is two bytes for each group of four decimal digits, plus three to eight bytes overhead.

My two questions here are:

If I have two decimal digits, that means the storage requirement for them would be one byte?
If I have three to eight bytes overhead, how can I know which value corresponds to my case?

I ran a query with an average of pg_column_size that gave me a total of 6.95 bytes. However, I want to know if there is any way I can calculate the approximate value before checking the actual one. (Same with other variable types in Postgres)

Erwin Brandstetter · Accepted Answer · 2024-06-01 10:58:52Z

If I have two decimal digits, that means the storage requirement for them would be one byte?

No, a numeric with two digits occupies 5 bytes on disk and 8 bytes in RAM.

If I have three to eight bytes overhead, how can I know which value corresponds to my case?

As the manual says:

two bytes for each group of four decimal digits

For each started group of four decimal digits, to be precise. So we can compute:

ceil(digits / 4.0) * 2

plus three to eight bytes overhead.

The minimum overhead of 3 bytes only applies to '0' in "packed" format on disk. Else:

2 bytes overhead for any digits at all (except special case 0)
Or + 4 bytes overhead for more than 256 digits
3 bytes overhead in "packed" format, which is in use for storage on disk
Or + 6 bytes overhead when storage exceeds 128 bytes or for storage in RAM

Sign, decimal point, and white space of the number are insignificant for storage size.
Type modifiers (precision and scale) of the numeric type are also insignificant.

Here is a demo with formulas for the complete computation:

SELECT id, digits
     , pg_column_size(n) AS disk_size
     , CASE
          WHEN n = 0         THEN 3                           -- efficient storage of 0
          WHEN digits <= 248 THEN 3 + ceil(digits / 4.0) * 2  -- "packed" format for size <= 128 bytes 
          WHEN digits <= 256 THEN 6 + ceil(digits / 4.0) * 2  -- effecitve size > 128 bytes
          ELSE                    8 + ceil(digits / 4.0) * 2  -- more than 256 digits
       END AS expected_disk_size
     , pg_column_size(n + 0) AS ram_size
        , CASE
          WHEN n = 0         THEN 6                           -- efficient storage of 0
          WHEN digits <= 256 THEN 6 + ceil(digits / 4.0) * 2  -- effecitve size > 128 bytes
          ELSE                    8 + ceil(digits / 4.0) * 2  -- more than 256 digits
       END AS expected_ram_size
FROM   tbl
     , length(translate(n::text, '-. ', '')) AS digits;       -- only digits matter

fiddle

Note that the no-op n + 0 forces the stored value to be unpacked, so we get RAM size. Related:

What is the overhead for varchar(n)?

Also note that storage of null values is different. See:

Do nullable columns occupy additional space in PostgreSQL?

Collectives™ on Stack Overflow

Byte size of numeric(precision, scale) in Postgres

1 Answer 1

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related