We are using the ancestry gem in our Rails project. There are about 800 categories in the table:

db => SELECT id, ancestry FROM product_categories LIMIT 10;

id  |  ancestry
-----+-------------
399 | 3
298 | 8/292/294
 12 | 3/401/255
573 | 349/572
707 | 7/23/89/147
201 | 166/191
729 | 5/727
 84 | 7/23
128 | 7/41/105
405 | 339

(10 rows)

The ancestry field represents the "path" of a record: the ids of its ancestors, listed from the root down. What I need is to build a map { category_id => [... all_subtree_ids ...] }.

I solved this by using subqueries like this:

SELECT id, 
  (
    SELECT array_agg(id)
    FROM product_categories
    WHERE (ancestry LIKE CONCAT(p.id, '/%') OR
           ancestry = CONCAT(p.ancestry, '/', p.id, '') OR
           ancestry = (p.id) :: TEXT)
  ) categories
FROM product_categories p
ORDER BY id

which results in

1 | {17,470,32,29,15,836,845,837}
2 | {37,233,231,205,107,109,57,108,28,58, ...}

The problem is that this query takes about 100 ms, and I wonder whether there's a way to optimize it using WITH RECURSIVE. I'm a novice with WITH, so my attempts just hang Postgres :(

UPD: I accepted AlexM's answer as the fastest, but if anyone is interested, here's a recursive solution:

WITH RECURSIVE a AS
(SELECT id, id as parent_id FROM product_categories
 UNION all
 SELECT pc.id, a.parent_id FROM product_categories pc, a
 WHERE regexp_replace(pc.ancestry, '^(\d{1,}/)*', '')::integer = a.id)

SELECT parent_id, sort(array_agg(id)) as children FROM a WHERE id <> parent_id group by parent_id order by parent_id;
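Note that sort() on an integer array is not built in; it comes from the intarray extension. For anyone without that extension, here is a rough sketch of the same recursive idea that orders inside array_agg instead. The inline sample rows are hypothetical (only 84 with ancestry 7/23 appears in the listing above), and it assumes root categories store NULL in ancestry.

```sql
-- Hypothetical sample data; drop this first CTE to run against the real table.
WITH RECURSIVE product_categories (id, ancestry) AS (
    VALUES (7, NULL), (23, '7'), (84, '7/23'), (89, '7/23')
),
a AS (
    -- Anchor: every category starts as its own "parent".
    SELECT id, id AS parent_id
    FROM product_categories
    UNION ALL
    -- Step: attach each row whose direct parent (the last id in its
    -- ancestry path) was reached in the previous iteration.
    SELECT pc.id, a.parent_id
    FROM product_categories pc
    JOIN a ON substring(pc.ancestry from '[0-9]+$')::integer = a.id
)
SELECT parent_id, array_agg(id ORDER BY id) AS children
FROM a
WHERE id <> parent_id
GROUP BY parent_id
ORDER BY parent_id;
-- With the sample rows above:
--   7  | {23,84,89}
--   23 | {84,89}
```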

2 Answers

Try this approach; I think it should be much faster than the nested queries:

WITH product_categories_flat AS (
    SELECT id, unnest(string_to_array(ancestry, '/')) as parent
    FROM product_categories
)
SELECT parent as id, array_agg(id) as children
FROM product_categories_flat
GROUP BY parent
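To see what this produces, here is a small self-contained sketch of the same flattening idea run against inline sample rows. The rows other than 707 and 84 are hypothetical filler, and it assumes root categories store NULL in ancestry. It also casts parent to integer (string_to_array yields text) and sorts each array so the output is deterministic.

```sql
-- Hypothetical sample data; drop this first CTE to run against the real table.
WITH product_categories (id, ancestry) AS (
    VALUES (7,   NULL),
           (23,  '7'),
           (84,  '7/23'),
           (89,  '7/23'),
           (147, '7/23/89'),
           (707, '7/23/89/147')
),
product_categories_flat AS (
    -- One output row per (category, ancestor) pair; rows with a NULL
    -- ancestry produce no pairs and drop out of the lateral join.
    SELECT pc.id, a.parent::integer AS parent
    FROM product_categories pc
    CROSS JOIN LATERAL unnest(string_to_array(pc.ancestry, '/')) AS a(parent)
)
SELECT parent AS id, array_agg(id ORDER BY id) AS children
FROM product_categories_flat
GROUP BY parent
ORDER BY parent;
-- With the sample rows above:
--   7   | {23,84,89,147,707}
--   23  | {84,89,147,707}
--   89  | {147,707}
--   147 | {707}
```

Categories without any descendants do not appear in the result at all; a LEFT JOIN back to product_categories would be one way to include them with a NULL array.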

Odds are that a join is faster:

SELECT p1.id,
       array_agg(p2.id) AS categories
FROM product_categories p1
   JOIN product_categories p2
      ON p2.ancestry LIKE CONCAT(p1.id, '/%')
         OR p2.ancestry = CONCAT(p1.ancestry, '/', p1.id)
         OR p2.ancestry = p1.id::text
GROUP BY p1.id
ORDER BY p1.id;

But to say something definite, you'd have to look at EXPLAIN (ANALYZE, BUFFERS) output.
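As a sketch of what that looks like in practice, the statement can simply be prefixed with EXPLAIN and the two options. This assumes the product_categories table from the question and uses the join form of the query; the same prefix works for the subquery version, so the two plans can be compared side by side.

```sql
-- ANALYZE actually executes the statement and reports real row counts and
-- timings; BUFFERS adds buffer hit/read counts for each plan node.
EXPLAIN (ANALYZE, BUFFERS)
SELECT p1.id,
       array_agg(p2.id) AS categories
FROM product_categories p1
   JOIN product_categories p2
      ON p2.ancestry LIKE CONCAT(p1.id, '/%')
         OR p2.ancestry = CONCAT(p1.ancestry, '/', p1.id)
         OR p2.ancestry = p1.id::text
GROUP BY p1.id
ORDER BY p1.id;
```

Because ANALYZE really runs the query, the reported execution time includes the work itself, not just the planner's estimate.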
