I have a 4-level hierarchy that I would like to retrieve rolled up to the top level.

Let's say you have Class -> Order -> Family -> Species.

I am looking for the following output:

{ 
  classes: [{
    id: 1,
    name: "Class A",
    orders: [{
      id: 1,
      name: "Class A - Order 1",
      families: [{
        id: 1,
        name: "Class A - Order 1 - Family I",
        species: [{
          id: 1,
          name: "Class A - Order 1 - Family I - Species 1"
        }]
      }]
    }]
  }]
}

It's easy enough to get the data with one large join, e.g.:

SELECT classes.id as class_id, orders.id as order_id, families.id as family_id, species.id as species_id
FROM species
JOIN families ON families.id = species.family_id
JOIN orders ON orders.id = families.order_id
JOIN classes ON classes.id = orders.class_id

But that gives a flat table structure, not the rolled-up one I'm looking for.

class_id order_id family_id species_id
1        1        1         1
1        1        1         2
1        1        1         3

I tried using LATERAL joins, which are subqueries that can reference preceding FROM items. Something along these lines:

SELECT classes.id, array_agg(orders_sub.id) as orders
FROM classes,
LATERAL (
  SELECT orders.id
  FROM orders
  WHERE classes.id = orders.class_id
) AS orders_sub
group by classes.id;

which produces:

id  orders
1   {1,2}

But I'm having trouble getting down multiple levels and rolling up whole records.

Bonus: if we can eliminate elements with empty relations, e.g. families without any species, that'd be great.

Background: This is a reporting API, and so far we have been serializing Rails ActiveRecord objects, which is very slow for large amounts of data (usually in the 100k-1M range, I think). So I'd love to leverage the JSON functionality that Postgres offers.

  • Please provide your expected output Commented Jun 15, 2018 at 17:40
  • @DanielMarcus it's right at the top. the nested JSON Commented Jun 15, 2018 at 17:42
  • Can you show in table form? You are looking for a sql query right? Commented Jun 15, 2018 at 17:43
  • SQL alone will give you tabular output, not the sort of structured JSON text in your illustration. Consider using a template language such as Jinja or Airspeed to process the SQL summary into the final format you're looking for. Commented Jun 15, 2018 at 17:44
  • Using the JSON and aggregation functions from Postgres, you can definitely get this structured JSON output. For a simple example see hashrocket.com/blog/posts/… Commented Jun 15, 2018 at 17:51

1 Answer

JSON is the only reasonable format for the desired structured output. You have to build a hierarchical query to get a hierarchical structure as a result.

select
    jsonb_agg(jsonb_build_object(
        'id', id, 
        'class', name, 
        'orders', orders)
        order by id
    ) as classes
from classes
join (
    select
        class_id,
        jsonb_agg(jsonb_build_object(
            'id', id, 
            'order', name, 
            'families', families)
            order by id
        ) as orders
    from orders
    join (
        select 
            order_id, 
            jsonb_agg(jsonb_build_object(
                'id', id, 
                'family', name,
                'species', species)
                order by id
            ) as families
        from families
        join (
            select 
                family_id, 
                jsonb_agg(jsonb_build_object(
                    'id', id, 
                    'species', name)
                    order by id
                ) as species
            from species
            group by family_id
            ) s on id = family_id
        group by order_id
        ) f on id = order_id
    group by class_id
    ) o on id = class_id
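
To try the query above locally, a minimal schema along these lines should be enough. This is only a sketch: the table and column names are taken from the queries in this thread, while the column types, constraints, and sample rows are assumptions made for illustration.

```sql
-- Hypothetical minimal schema; names mirror the queries in this thread,
-- types and constraints are assumptions.
create table classes  (id int primary key, name text);
create table orders   (id int primary key, name text, class_id  int references classes(id));
create table families (id int primary key, name text, order_id  int references orders(id));
create table species  (id int primary key, name text, family_id int references families(id));

-- Illustrative sample rows following the naming used in the question.
insert into classes  values (1, 'Class A');
insert into orders   values (1, 'Class A - Order 1', 1);
insert into families values
    (1, 'Class A - Order 1 - Family I', 1),
    (2, 'Class A - Order 1 - Family II', 1);  -- deliberately has no species
insert into species  values
    (1, 'Class A - Order 1 - Family I - Species 1', 1),
    (2, 'Class A - Order 1 - Family I - Species 2', 1);
```

Family II is left without species on purpose, so you can observe how the query treats an empty branch.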

See the demo on example data: DbFiddle.
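
On the "bonus" from the question: because every level of the query uses an inner join, a family with no rows in species never matches the innermost subquery and is dropped. The same then happens to orders left without families and to classes left without orders. If you ever need the opposite behaviour, one possible sketch (shown for the families level only, and assuming the same column names as above) is to switch to a left join and default the missing array to an empty one:

```sql
-- Sketch: keep families that have no species.
-- The LEFT JOIN preserves such families; coalesce() turns the resulting
-- NULL into an empty JSON array instead of a JSON null.
select
    f.order_id,
    jsonb_agg(jsonb_build_object(
        'id', f.id,
        'family', f.name,
        'species', coalesce(s.species, '[]'::jsonb))
        order by f.id
    ) as families
from families f
left join (
    select
        family_id,
        jsonb_agg(jsonb_build_object(
            'id', id,
            'species', name)
            order by id
        ) as species
    from species
    group by family_id
    ) s on f.id = s.family_id
group by f.order_id;
```

The same change would have to be repeated at each level where empty children should be kept.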

4 Comments

Awesome, I'll give that a try with my actual code. Do you know much about LATERAL joins and how their performance compares to this join/group by query?
This query works great! I have one additional requirement, @klin: I have to implement some permissions that translate into the following condition: WHERE class.id IN (?) OR class.id IN (?) AND (species.field_1 IN (?) OR species.field_2 IN (?)). When I add this condition to the SELECT from the species sub-query, I get the error: invalid reference to FROM-clause entry for table "classes" HINT: There is an entry for table "classes", but it cannot be referenced from this part of the query.
Because the query consists of four nested levels, you can add a condition for a class only on first level, for an order only on the second level, etc. See the example with conditions added.
I ended up throwing in another nested query at the "species" level. I just join all tables together, put my WHERE clause on the "species" and "classes" tables, and use "WHERE id IN (SELECT ...)". Probably not the most performant, but good enough for now.
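
The workaround described in the last comment might look roughly like the sketch below, shown for the innermost (species) level only. It is an interpretation, not the commenter's actual code: the literal values stand in for the `?` placeholders, field_1 and field_2 are assumed to be text columns, and the parentheses around the second branch are an assumption about the intended precedence (AND binds tighter than OR in SQL, so this matches how the unparenthesized condition would be evaluated).

```sql
-- Sketch: filter species through a subquery that joins all four tables,
-- so a condition mixing classes and species columns can live in one place.
select
    family_id,
    jsonb_agg(jsonb_build_object(
        'id', id,
        'species', name)
        order by id
    ) as species
from species
where id in (
    select sp.id
    from species sp
    join families f on f.id = sp.family_id
    join orders o on o.id = f.order_id
    join classes c on c.id = o.class_id
    where c.id in (1)                      -- hypothetical: fully permitted classes
       or (c.id in (2)                     -- hypothetical: partially permitted classes
           and (sp.field_1 in ('x') or sp.field_2 in ('y')))
)
group by family_id;
```

Since the outer levels still use inner joins, families, orders, and classes whose species are all filtered out disappear from the result as well.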
