1

I'm trying to get the column size for each row in a table. That's basically the combination of these two queries:

SELECT pg_size_pretty(sum(pg_column_size(COLUMN_NAME))) FROM TABLE_NAME;

And

SELECT column_name FROM information_schema.columns WHERE table_schema = 'public' AND table_name = 'TABLE_NAME';

My first attempt was to do these two queries:

 => SELECT column_name, (SELECT pg_size_pretty(sum(pg_column_size(column_name))) FROM TABLE_NAME)  FROM information_schema.columns WHERE table_schema = 'public' AND table_name   = 'TABLE_NAME';
ERROR:  column "columns.column_name" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT column_name, (SELECT pg_size_pretty(sum(pg_column_siz...
               ^
 => SELECT column_name, (SELECT pg_size_pretty(sum(pg_column_size(column_name))) FROM TABLE_NAME)  FROM information_schema.columns WHERE table_schema = 'public' AND table_name   = 'TABLE_NAME' GROUP BY column_name;
ERROR:  more than one row returned by a subquery used as an expression

Tried the following too:

SELECT column_name, (SELECT pg_size_pretty(sum(pg_column_size(column_name))) FROM TABLE_NAME)  FROM information_schema.columns WHERE table_schema = 'public' AND table_name   = 'TABLE_NAME' GROUP BY 1;

Which returned:

ERROR:  more than one row returned by a subquery used as an expression

When I add a LIMIT 1, the result is incorrect:

SELECT column_name, 
   (SELECT pg_size_pretty(sum(pg_column_size(column_name))) FROM main_apirequest LIMIT 1)
   FROM information_schema.columns 
   WHERE table_schema = 'public' AND table_name   = 'main_apirequest'
   GROUP BY 1;

It looks something like this:

   column_name    | pg_size_pretty
------------------+----------------
 api_key_id       | 11 bytes
 id               | 3 bytes
...

When it should be something like this (which doesn't happen because of the limit 1)

=> SELECT pg_size_pretty(sum(pg_column_size(id))) FROM main_apirequest
;
 pg_size_pretty
----------------
 19 MB
2
  • 1
    Since you don't know the column names in advance, you will have to use dynamic sql. Commented Nov 24, 2018 at 19:48
  • Ok, cool. Will look into that Commented Nov 24, 2018 at 19:51

1 Answer 1

1

Since you don't know the columns names in advance, but want to use the column name in the query, you'll have to use dynamic sql. Here's a quick example:

CREATE TABLE t1 (id INTEGER, txt TEXT);

INSERT INTO t1
SELECT g, random()::TEXT
FROM generate_series(1, 10) g;

Then the SQL to generate the query is:

DO $$
DECLARE
        query TEXT;
BEGIN
        SELECT 'SELECT ' || STRING_AGG(FORMAT('sum(pg_column_size(%1$I)) AS %1$s', column_name), ', ') || ' FROM t1'
            INTO query
        FROM information_schema.columns
        WHERE table_schema = 'public'
        AND table_name = 't1';

        RAISE NOTICE '%', query;
END $$

The query created is SELECT pg_size_pretty(sum(pg_column_size(id))) AS id, pg_size_pretty(sum(pg_column_size(txt))) AS txt FROM t1

Would work the same if you had hundreds of columns.

Now to get it to generate and run the query and return you results, it really depends on what you want. If you're happy just having it print to the screen, then maybe you can format it like this instead:

DO $$
DECLARE
        query TEXT;
        result TEXT;
BEGIN
        SELECT 'SELECT CONCAT_WS(E''\n'', ' || STRING_AGG(FORMAT('''%1$s: '' || pg_size_pretty(sum(pg_column_size(%1$I)))', column_name), ', ') || ') FROM t1'
            INTO query
        FROM information_schema.columns
        WHERE table_schema = 'public'
        AND table_name = 't1';

        EXECUTE query
        INTO result;

        RAISE NOTICE '%', result;
END $$

That prints:

id: 40 bytes
txt: 181 bytes

If instead you want a record returned with multiple columns, I'm not too sure how you'd go about it because the number of columns and their names would be unknown. Best hack I can think around it would be to return it as JSON, then you return just one thing and there will be a variable number of fields in there with whatever column names:

CREATE OR REPLACE FUNCTION test1(_schema_name TEXT, _table_name TEXT)
        RETURNS JSON AS
$$
DECLARE
        query TEXT;
        result JSON;
BEGIN
        SELECT 'SELECT ROW_TO_JSON(cols) FROM (SELECT ' || STRING_AGG(FORMAT('pg_size_pretty(sum(pg_column_size(%1$I))) AS %1$s', column_name), ', ') || ' FROM t1) AS cols'
            INTO query
        FROM information_schema.columns
        WHERE table_schema = _schema_name
        AND table_name = _table_name;

        EXECUTE query
        INTO result;

        RETURN result;
END
$$
        LANGUAGE plpgsql;

Running it: SELECT test1('public', 't1')

Returns: {"id":"40 bytes","txt":"181 bytes"}

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.