1

Given that the size is variable, I am trying to calculate the average cost of a numeric(p, s) attribute. In my case, numeric(10, 2). From what I understand of the Postgres documentation, this means I have a total of 10 digits where 2 of them are exclusive of the decimal part, hence, the maximum value I can store is 99999999,99.

What I do not understand is this part of the documentation:

The actual storage requirement is two bytes for each group of four decimal digits, plus three to eight bytes overhead.

My two questions here are:

  • If I have two decimal digits, that means the storage requirement for them would be one byte?
  • If I have three to eight bytes overhead, how can I know which value corresponds to my case?

I ran a query with an average of pg_column_size that gave me a total of 6.95 bytes. However, I want to know if there is any way I can calculate the approximate value before checking the actual one. (Same with other variable types in Postgres)

0

1 Answer 1

1

If I have two decimal digits, that means the storage requirement for them would be one byte?

No, a numeric with two digits occupies 5 bytes on disk and 8 bytes in RAM.

If I have three to eight bytes overhead, how can I know which value corresponds to my case?

As the manual says:

two bytes for each group of four decimal digits

For each started group of four decimal digits, to be precise. So we can compute:

ceil(digits / 4.0) * 2

plus three to eight bytes overhead.

The minimum overhead of 3 bytes only applies to '0' in "packed" format on disk. Else:

  • 2 bytes overhead for any digits at all (except special case 0)
    Or + 4 bytes overhead for more than 256 digits

  • 3 bytes overhead in "packed" format, which is in use for storage on disk
    Or + 6 bytes overhead when storage exceeds 128 bytes or for storage in RAM

Sign, decimal point, and white space of the number are insignificant for storage size.
Type modifiers (precision and scale) of the numeric type are also insignificant.

Here is a demo with formulas for the complete computation:

SELECT id, digits
     , pg_column_size(n) AS disk_size
     , CASE
          WHEN n = 0         THEN 3                           -- efficient storage of 0
          WHEN digits <= 248 THEN 3 + ceil(digits / 4.0) * 2  -- "packed" format for size <= 128 bytes 
          WHEN digits <= 256 THEN 6 + ceil(digits / 4.0) * 2  -- effecitve size > 128 bytes
          ELSE                    8 + ceil(digits / 4.0) * 2  -- more than 256 digits
       END AS expected_disk_size
     , pg_column_size(n + 0) AS ram_size
        , CASE
          WHEN n = 0         THEN 6                           -- efficient storage of 0
          WHEN digits <= 256 THEN 6 + ceil(digits / 4.0) * 2  -- effecitve size > 128 bytes
          ELSE                    8 + ceil(digits / 4.0) * 2  -- more than 256 digits
       END AS expected_ram_size
FROM   tbl
     , length(translate(n::text, '-. ', '')) AS digits;       -- only digits matter

fiddle

Note that the no-op n + 0 forces the stored value to be unpacked, so we get RAM size. Related:

Also note that storage of null values is different. See:

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.