PostgreSQL aggregate result dynamically by column data

Question

I have the following table stored in PostgreSQL:

table:
=========================================================================================
ctx_id         ctx_type                  entity_id          entity_type      customer_id 
=========================================================================================
CTX_ID_1          ctx_type_A            entity_id_1           typeA             C_ID
CTX_ID_1          ctx_type_A            entity_id_2           typeB             C_ID
CTX_ID_2          ctx_type_B            entity_id_3           typeB             C_ID
CTX_ID_3          ctx_type_B            entity_id_4           typeA             C_ID
CTX_ID_4          ctx_type_A            entity_id_5           typeC             C_ID

Basically, I want to get all rows for a certain customer_id. Since a customer can have ~10,000 rows, I would like to reduce the query response size, so instead of returning the result with a basic select statement: select ctx_id, ctx_type, entity_id, entity_type from "table" where customer_id = 'C_ID'

I would like to aggregate the results in Postgres by ctx_id and ctx_type. That way the query response size would be reduced, since the same ctx_id and ctx_type are returned only once instead of once per row. The query result would look like this:

{
  "CTX_ID_1|ctx_type_A": [ <-- a concatenation of ctx_id | context_type
    {
      "entity_id": "entity_id_1",
      "entity_type": "type_A"
    },
    {
      "entity_id": "entity_id_2",
      "entity_type": "type_B"
    }
  ],
  "CTX_ID_2|ctx_type_B": [
    {
      "entity_id": "entity_id_3",
      "entity_type": "type_B"
    }
  ],
  "CTX_ID_3|ctx_type_B": [
    {
      "entity_id": "entity_id_4",
      "entity_type": "type_A"
    }
  ],
  "CTX_ID_4|ctx_type_A": [
    {
      "entity_id": "entity_id_5",
      "entity_type": "type_C"
    }
  ]
}

My question is: how can this be achieved using PostgreSQL JSON functions? I couldn't find a way to do it with jsonb_build_array() and json_object_agg(). If it isn't possible, what are the alternatives?

P.S. To elaborate on what I mean by query response size:

Take customer_id as an example. The response size without selecting customer_id is that of: select ctx_id, ctx_type, entity_id, entity_type from "table" where customer_id = 'C_ID'

The response size with customer_id selected is that of: select ctx_id, ctx_type, entity_id, entity_type, customer_id from "table" where customer_id = 'C_ID'

So what I'm trying to say is: if I know that the same customer_id occurs in many rows, then instead of returning it in every row it can be returned once, like this:

{
"C_ID": [...all rows here]
}

Converting a result to JSON will most likely not reduce the "response" size. 10000 rows as JSON won't take less space than 10000 "raw" rows.
– user330315, Commented Sep 22, 2020 at 10:11
@a_horse_with_no_name - I ran select ctx_id, ctx_type, entity_id, entity_type, customer_id from "table" where customer_id = 'C_ID' vs. the query above, which is without the customer id, and I noticed a difference in the response size. Don't you think that less data will be transferred over the network, since the byte size is smaller?
– Jay, Commented Sep 22, 2020 at 10:18
@a_horse_with_no_name I have edited my question; I hope it is clearer now. As always, thank you a lot for your help and contribution :), much appreciated.
– Jay, Commented Sep 22, 2020 at 10:31

user330315 · Accepted Answer · 2020-09-22 10:21:05Z

Converting the entire result to a JSON object won't really reduce the size of the result. It's just a different representation.

But anyway, you could do it like this:

select jsonb_build_object(
         concat(ctx_id, '|', ctx_type), 
           jsonb_agg(jsonb_build_object('entity_id', entity_id)||jsonb_build_object('entity_type', entity_type))
       ) 
from the_table
group by concat(ctx_id, '|', ctx_type);

This returns one JSON value per combination of (ctx_id, ctx_type). I would not convert the whole result into a single huge JSON value, as that might quickly hit the limits of a single JSON value.

If you really want to go down that road (which I would not recommend), you can do it with a derived table:

select jsonb_object_agg(id_type, entities)
from (
  select concat(ctx_id, '|', ctx_type) as id_type, 
         jsonb_agg(jsonb_build_object('entity_id', entity_id)||jsonb_build_object('entity_type', entity_type)) as entities
  from the_table
  group by concat(ctx_id, '|', ctx_type)
) t;
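
One way to check the size claim rather than argue about it is to measure both forms on the server. The following is only a rough sketch, assuming the sample rows from the question are loaded into a temporary the_table with text columns (the column types are an assumption, the question doesn't state them). octet_length() of a text rendering is just a proxy for what actually travels over the wire, which also depends on the protocol and the driver:

```sql
-- Hypothetical sample data mirroring the table in the question.
create temporary table the_table (
  ctx_id      text,
  ctx_type    text,
  entity_id   text,
  entity_type text,
  customer_id text
);

insert into the_table values
  ('CTX_ID_1', 'ctx_type_A', 'entity_id_1', 'typeA', 'C_ID'),
  ('CTX_ID_1', 'ctx_type_A', 'entity_id_2', 'typeB', 'C_ID'),
  ('CTX_ID_2', 'ctx_type_B', 'entity_id_3', 'typeB', 'C_ID'),
  ('CTX_ID_3', 'ctx_type_B', 'entity_id_4', 'typeA', 'C_ID'),
  ('CTX_ID_4', 'ctx_type_A', 'entity_id_5', 'typeC', 'C_ID');

-- Approximate size of the plain rows, each rendered as comma-separated text.
select sum(octet_length(concat_ws(',', ctx_id, ctx_type, entity_id, entity_type))) as raw_bytes
from the_table
where customer_id = 'C_ID';

-- Size of the single aggregated JSON value, rendered as text.
select octet_length(jsonb_object_agg(id_type, entities)::text) as json_bytes
from (
  select concat(ctx_id, '|', ctx_type) as id_type,
         jsonb_agg(jsonb_build_object('entity_id', entity_id,
                                      'entity_type', entity_type)) as entities
  from the_table
  where customer_id = 'C_ID'
  group by concat(ctx_id, '|', ctx_type)
) t;
```

With few rows per (ctx_id, ctx_type) group, the repeated "entity_id" and "entity_type" keys can easily outweigh what is saved by not repeating the group columns, so it is worth running this against realistic data before deciding.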

Mike Organek · Accepted Answer · 2020-09-22 10:22:47Z

The direct answer to your question is:

select customer_id, jsonb_object_agg(top_key, vals_array) 
  from 
  (select customer_id, concat(ctx_id, '|', ctx_type) as top_key, 
          jsonb_agg(to_jsonb(x)) as vals_array 
     from table1
          cross join lateral (select entity_id, entity_type) as x
    group by customer_id, top_key) o
 group by customer_id;

If you are concerned about the size of the return value, then use an array instead of an object for the innermost part:

select customer_id, jsonb_object_agg(top_key, vals_array) 
  from 
  (select customer_id, concat(ctx_id, '|', ctx_type) as top_key, 
          jsonb_agg(jsonb_build_array(entity_id, entity_type)) as vals_array 
     from table1
    group by customer_id, top_key) o
 group by customer_id;

Working fiddle.
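
Note that the array form drops the entity_id and entity_type key names, so whoever consumes the result has to know the positions. Below is a sketch of unpacking it again on the SQL side, assuming the sample rows from the question are supplied inline through a CTE named table1 (the CTE names grouped and doc, and the payload column, are made up for illustration):

```sql
with table1 (ctx_id, ctx_type, entity_id, entity_type, customer_id) as (
  values
    -- Casts on the first row fix the column types; sample data is hypothetical.
    ('CTX_ID_1'::text, 'ctx_type_A'::text, 'entity_id_1'::text, 'typeA'::text, 'C_ID'::text),
    ('CTX_ID_1', 'ctx_type_A', 'entity_id_2', 'typeB', 'C_ID'),
    ('CTX_ID_2', 'ctx_type_B', 'entity_id_3', 'typeB', 'C_ID'),
    ('CTX_ID_3', 'ctx_type_B', 'entity_id_4', 'typeA', 'C_ID'),
    ('CTX_ID_4', 'ctx_type_A', 'entity_id_5', 'typeC', 'C_ID')
),
grouped as (
  -- One row per (customer, ctx_id|ctx_type) with an array of [entity_id, entity_type] pairs.
  select customer_id,
         concat(ctx_id, '|', ctx_type) as top_key,
         jsonb_agg(jsonb_build_array(entity_id, entity_type)) as vals_array
  from table1
  group by customer_id, concat(ctx_id, '|', ctx_type)
),
doc as (
  -- One JSON object per customer, keyed by top_key.
  select customer_id, jsonb_object_agg(top_key, vals_array) as payload
  from grouped
  group by customer_id
)
-- Expand the object back into rows; positions 0 and 1 are entity_id and entity_type.
select d.customer_id,
       e.key        as top_key,
       a.elem ->> 0 as entity_id,
       a.elem ->> 1 as entity_type
from doc d
cross join lateral jsonb_each(d.payload) as e
cross join lateral jsonb_array_elements(e.value) as a(elem)
order by top_key, entity_id;
```

The positional layout saves the repeated key names, at the cost of an implicit contract between the query and its consumer about which index holds which column.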
