
tl;dr: is there a way to get the values of a jsonb object as a jsonb array in Postgres?


I'm trying to use a recursive CTE in Postgres to flatten an arbitrarily deep tree structure like this:

{
  "type": "foo",
  "properties": {...},
  "children": [
    {
      "type": "bar",
      "properties": {...},
      "children": [
        {
          "type": "multivariate",
          "variants": {
            "arbitrary-name": {
              "properties": {...},
              "children": [...]
            },
            "some-other-name": {
              "properties": {...},
              "children": [...]
            },
            "another": {
              "properties": {...},
              "children": [...]
            }
          }
        }
      ]
    }
  ]
}

I'm generally following this post, but I'm stuck at processing the type: "multivariate" node, where what I really want is essentially jsonb_agg(jsonb_object_values(json_object -> 'variants')). As far as I can tell, no jsonb_object_values function exists.

Update:

Sorry, I obviously should have included the query I tried:

WITH RECURSIVE tree_nodes (id, json_element) AS (
  -- non recursive term
  SELECT
    id, node
  FROM trees

  UNION

  -- recursive term
  SELECT
    id,
    CASE
    WHEN jsonb_typeof(json_element) = 'array'
      THEN jsonb_array_elements(json_element)
    WHEN jsonb_exists(json_element, 'children')
      THEN jsonb_array_elements(json_element -> 'children')
    WHEN jsonb_exists(json_element, 'variants')
      THEN (select jsonb_agg(element.value) from jsonb_each(json_element -> 'variants') as element)
    END AS json_element
  FROM
    tree_nodes
  WHERE
    jsonb_typeof(json_element) = 'array' OR jsonb_typeof(json_element) = 'object'
)

select * from tree_nodes;

The schema is just an id & a jsonb node column

This query gives an error:

ERROR:  set-returning functions are not allowed in CASE
LINE 16:       THEN jsonb_array_elements(json_element -> 'children')
                    ^
HINT:  You might be able to move the set-returning function into a LATERAL FROM item.

I just want Object.values(json_element -> 'variants') 😫

Update 2:

After reading this all again, I realized the problem is that I'm using a recent version of PostgreSQL (10.3), which no longer allows a set-returning function inside a CASE expression. That was the crux of getting this tree-flattening approach to work, as far as I can tell. There's probably some way to achieve the same thing in recent versions of PostgreSQL, but I have no idea how I'd go about doing it.

  • Post some schema and a query that you’ve tried if you can. Commented Aug 21, 2018 at 2:01
  • You want the three sub elements of 'variants' in an array? Or do you want the 'properties'/'children' pairs in an array? Commented Aug 21, 2018 at 4:50
  • Geez now that I'm looking at it again after eating dinner I realize it's not variants causing the problem, it's the array of children. Is the example in the article I listed simply not possible in Postgres 10?? Commented Aug 21, 2018 at 6:08
  • @S-Man I want the nodes flattened so the return would have columns tree_id & node columns (node being a jsonb object with type, properties & children properties) Commented Aug 21, 2018 at 6:10
  • @Steve Please provide a sample of expected output. Which part should be flattened and what happens with the variants? Commented Aug 21, 2018 at 7:01

2 Answers


Use jsonb_each() in the FROM clause together with jsonb_agg(<jsonb_each_alias>.value) in the SELECT, for example:

select
    t.id,
    jsonb_agg(child.value)
from
    (values
      (101, '{"child":{"a":1,"b":2}}'::jsonb),
      (102, '{"child":{"c":3,"d":4,"e":5}}'::jsonb)
    ) as t(id, json_object), -- example table, replace values block with actual tablespec
    jsonb_each(t.json_object->'child') as child
group by t.id;

You can always chain other jsonb functions which return setof jsonb (e.g. jsonb_array_elements) in the FROM if you need to iterate the higher level arrays before the jsonb_each; for example:

select
    t.id,
    jsonb_agg(sets.value)
from
    (values
      (101, '{"children":[{"a_key":{"a":1}},{"a_key":{"b":2}}]}'::jsonb),
      (102, '{"children":[{"a_key":{"c":3,"d":4,"e":5}},{"a_key":{"f":6}}]}'::jsonb)
    ) as t(id, json_object), -- example table, replace values block with actual tablespec
    jsonb_array_elements(t.json_object->'children') elem,
    jsonb_each(elem->'a_key') as sets
group by t.id;

Updated Answer

In answer to your comment and question edit about needing to walk the 'children' of each tree node and extract the 'variants': I would achieve this by splitting the CTE into multiple stages:

with recursive
  -- Constant table for demonstration purposes only; remove this and replace below references to "objects" with table name
  objects(id, object) as (values
    (101, '{"children":[{"children":[{"variants":{"aa":11}},{"variants":{"ab":12}}],"variants":{"a":1}},{"variants":{"b":2}}]}'::jsonb),
    (102, '{"children":[{"children":[{"variants":{"cc":33,"cd":34,"ce":35}},{"variants":{"f":36}}],"variants":{"c":3,"d":4,"e":5}},{"variants":{"f":6}}]}'::jsonb)
  ),
  tree_nodes as ( -- Flatten the tree by walking all 'children' and creating a separate record for each node
    -- non-recursive term: get root element
    select
      o.id, o.object as value
    from
      objects o
    union all
    -- recursive term - get JSON object node for each child
    select
      n.id,
      e.value
    from
      tree_nodes n,
      jsonb_array_elements(n.value->'children') e
    where
      jsonb_typeof(n.value->'children') = 'array'
  ),
  variants as (
    select
      n.id,
      v.value
    from
      tree_nodes n,
      jsonb_each(n.value->'variants') v -- expand variants
    where
      jsonb_typeof(n.value->'variants') = 'object'
  )
select
  id,
  jsonb_agg(value)
from
  variants
group by
  id
;

The ability to break a query up into a "pipeline" of operations is one of my favourite things about CTEs: it makes the query much easier to understand, maintain and debug.


1 Comment

This is really cool! And kind of blowing my mind... I didn't realize you could chain jsonb functions like that in the FROM clause. However I don't quite see how to apply this to my problem. Objects in my json have a children array or named children in a variants hash. I think I kind of want to do (pseudocode) SELECT id, (json.children || json.variants.values)
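
The pseudocode in the comment above (children plus the values of variants) can be sketched in one recursive step by putting both set-returning functions into a LATERAL subquery joined with UNION ALL. This is only a sketch under assumptions: the trees(id, node) rows are made-up sample data standing in for the real table, and the alias child is illustrative. The CASE expressions sit inside the function arguments (which PostgreSQL 10 accepts) and only guard against nodes where children is not an array or variants is not an object.

```sql
WITH RECURSIVE
  -- Hypothetical sample data; replace with the real (id, node jsonb) table.
  trees(id, node) AS (VALUES
    (1, '{"type":"foo","children":[{"type":"multivariate","variants":{"a":{"children":[{"type":"leaf"}]},"b":{"children":[]}}}]}'::jsonb)
  ),
  tree_nodes(id, node) AS (
    -- non-recursive term: the root of every tree
    SELECT id, node FROM trees

    UNION ALL

    -- recursive term: every child and every variant value of the current node
    SELECT n.id, child.node
    FROM tree_nodes n
    CROSS JOIN LATERAL (
      -- json.children: one row per array element
      SELECT c.value AS node
      FROM jsonb_array_elements(
             CASE WHEN jsonb_typeof(n.node -> 'children') = 'array'
                  THEN n.node -> 'children'
                  ELSE '[]'::jsonb END) AS c
      UNION ALL
      -- json.variants.values: one row per variant, the key is dropped
      SELECT v.value
      FROM jsonb_each(
             CASE WHEN jsonb_typeof(n.node -> 'variants') = 'object'
                  THEN n.node -> 'variants'
                  ELSE '{}'::jsonb END) AS v
    ) AS child
  )
SELECT id, node FROM tree_nodes;
```

Each output row still carries its own children or variants; strip them with node - 'children' - 'variants' in the final SELECT if only type and properties are wanted.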

db<>fiddle

I expanded the test data with more children elements and a deeper structure (more nested elements):

{
    "type": "foo", 
    "children": [
        {
            "type" : "bar1", 
            "children" : [{
                "type" : "blubb",
                "children" : [{
                    "type" : "multivariate",
                    "variants" : {
                        "blubb_variant1": {
                            "properties" : {
                                "blubb_v1_a" : 100
                            },
                            "children" : ["z", "y"]
                        },
                        "blubb_variant2": {
                            "properties" : {
                                "blubb_v2_a" : 300,
                                "blubb_v2_b" : 4200
                            },
                            "children" : []
                        }
                    }
                }]
            }]
        },
        {
            "type" : "bar2", 
            "children" : [{
                "type" : "multivariate",
                "variants" : {
                    "multivariate_variant1": {
                        "properties" : {
                            "multivariate_v1_a" : 1,
                            "multivariate_v1_b" : 2
                        },
                        "children" : [1,2,3]
                    },
                    "multivariate_variant2": {
                        "properties" : {
                            "multivariate_v2_a" : 3,
                            "multivariate_v2_b" : 42,
                            "multivariate_v2_d" : "fgh"
                        },
                        "children" : [4,5,6]
                    },
                    "multivariate_variant3": {
                        "properties" : {
                            "multivariate_v3_a" : "abc",
                            "multivariate_v3_b" : "def"
                        },
                        "children" : [7,8,9]
                    }
                }
            },
            {
                "type" : "blah",
                "variants" : {
                    "blah_variant1": {
                        "properties" : {
                            "blah_v1_a" : 1,
                            "blah_v1_b" : 2
                        },
                        "children" : [{
                            "type" : "blah_sub1",
                            "variants" : {
                                "blah_sub1_variant1" : {
                                    "properties" : {
                                        "blah_s1_v1_a" : 12345,
                                        "children" : ["a",1, "bn"]
                                    }
                                }
                            }
                        }]
                    },
                    "blah_variant2": {
                        "properties" : {
                            "blah_v2_a" : 3,
                            "blah_v2_b" : 42,
                            "blah_v2_c" : "fgh"
                        },
                        "children" : [4,5,6]
                    }
                }
            }]
        }
    ]
}

Result:

variants                 json                                                                                                                                                                                            
-----------------------  ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
"multivariate_variant1"  {"children": [1, 2, 3], "properties": {"multivariate_v1_a": 1, "multivariate_v1_b": 2}}                                                                                                         
"multivariate_variant2"  {"children": [4, 5, 6], "properties": {"multivariate_v2_a": 3, "multivariate_v2_b": 42, "multivariate_v2_d": "fgh"}}                                                                            
"multivariate_variant3"  {"children": [7, 8, 9], "properties": {"multivariate_v3_a": "abc", "multivariate_v3_b": "def"}}                                                                                                 
"blah_variant1"          {"children": [{"type": "blah_sub1", "variants": {"blah_sub1_variant1": {"properties": {"children": ["a", 1, "bn"], "blah_s1_v1_a": 12345}}}}], "properties": {"blah_v1_a": 1, "blah_v1_b": 2}}  
"blah_variant2"          {"children": [4, 5, 6], "properties": {"blah_v2_a": 3, "blah_v2_b": 42, "blah_v2_c": "fgh"}}                                                                                                    
"blubb_variant1"         {"children": ["z", "y"], "properties": {"blubb_v1_a": 100}}                                                                                                                                     
"blubb_variant2"         {"children": [], "properties": {"blubb_v2_a": 300, "blubb_v2_b": 4200}}                                                                                                                         
"blah_sub1_variant1"     {"properties": {"children": ["a", 1, "bn"], "blah_s1_v1_a": 12345}}   

The Query:

WITH RECURSIVE json_cte(variants, json) AS (
    SELECT NULL::jsonb, json FROM (
        SELECT '{/*FOR TEST DATA SEE ABOVE*/}'::jsonb as json
    )s
    
    UNION
    
    SELECT  
        row_to_json(v)::jsonb -> 'key',                                -- D        
        CASE WHEN v IS NOT NULL THEN row_to_json(v)::jsonb -> 'value' ELSE c END  -- C
    FROM json_cte
         LEFT JOIN LATERAL jsonb_array_elements(json -> 'children') as c ON TRUE  -- A
         LEFT JOIN LATERAL jsonb_each(json -> 'variants') as v ON TRUE -- B
)
SELECT * FROM json_cte WHERE variants IS NOT NULL

The WITH RECURSIVE structure walks the elements recursively. The first part of the UNION is the starting point. The second part is the recursive term, which takes the rows produced by the previous step as input for the next one.

A: if the current JSON has a children element, its elements are expanded into one row per child.

B: if the current JSON has a variants element, its entries are expanded into one record per variant. Note that in the example a JSON element contains either a variants or a children element, not both.

C: if there is a variants element, the expanded record is converted back into JSON. The resulting structure is {"key" : "name_of_variant", "value" : "json_of_variant"}. The value becomes the JSON for the next recursion step (the JSON of a variant can have its own children elements, which is why this works). Otherwise, the expanded children element becomes the next JSON.

D: if there is a variants element, its key is output in the variants column.
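
To make steps B to D more concrete, here is a small standalone sketch of what jsonb_each produces and what row_to_json does with the row alias. The input literal and the column aliases are made up for illustration.

```sql
-- One row per key of the input object; v is the whole (key, value) record.
SELECT
    v.key                 AS variant_name,  -- text: the variant's name
    v.value               AS variant_json,  -- jsonb: the variant's body
    row_to_json(v)::jsonb AS as_record      -- the record as {"key": ..., "value": ...}
FROM jsonb_each('{"a": {"x": 1}, "b": {"x": 2}}'::jsonb) AS v;
```

Since v.key and v.value are available directly, the round trip through row_to_json is probably not strictly required; to_jsonb(v.key) and v.value should yield the same jsonb values.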
