1

I have a table containing an id and an array of related id

 id  | related_id  
-----+-------------
 712 | {1116}
 837 | {1116,1127}
1116 | {712,837}
1127 | {837}

I want to group / merge all related id having a common element

 id  | related_id  |      full_ids       
-----+-------------+---------------------
 712 | {1116}      | {712,1116,837,1127}
 837 | {1116,1127} | {837,1116,712,1127}
1116 | {712,837}   | {712,1116,837,1127}
1127 | {837}       | {1127,837,1116,712}

I try to do it with a recursive SQL but for now some row have missing element

 WITH RECURSIVE 
 un AS (SELECT id,unnest(related_id) as parent FROM my_table),
 rec (id) as
 (
   SELECT id,parent,array[id] as ids from un
     UNION ALL
   SELECT rec.id,un.parent,ids || un.id FROM rec JOIN un ON rec.parent = un.id where un.id <> all(ids)
 ),
 j AS (SELECT distinct on (id) ids,id FROM rec order by id, cardinality(ids) desc)                                                           
 SELECT ops.*,j.ids as full_ids FROM my_table ops LEFT JOIN j ON ops.id=j.id;

Result of this query

  id  | related_id  |      full_ids       
------+-------------+---------------------
  712 | {1116}      | {712,1116,837,1127}
  837 | {1116,1127} | {837,1116,712}
 1116 | {712,837}   | {1116,837,1127}
 1127 | {837}       | {1127,837,1116,712}

What I am missing ?

1 Answer 1

1

I propose you the following solution :

WITH RECURSIVE list (id, related_id, parent, full_ids) AS
(
SELECT id, related_id, related_id, id || related_id
  FROM my_table
UNION ALL
SELECT l.id, l.related_id, array_remove (l.parent, t.id) || t.related_id, l.full_ids || t.sub_id
  FROM list AS l
 INNER JOIN
     ( my_table AS m
       CROSS JOIN LATERAL unnest(m.related_id) AS sub_id
     ) AS t
    ON array[t.id] <@ l.parent
   AND NOT array[t.sub_id] <@ l.full_ids
)
SELECT DISTINCT ON (id) id, related_id, full_ids
  FROM list
 ORDER BY id, array_length(full_ids, 1) DESC

result :

id      related_id      full_ids
712     {1116}          {712,1116,837,1127}
837     {1116,1127}     {837,1116,1127,712}
1116    {712,837}       {1116,712,837,1127}
1127    {837}           {1127,837,1116,712}

See the test in dbfiddle

Sign up to request clarification or add additional context in comments.

2 Comments

Your results are exactly what I needs. Thanks The query take much more time can we improve it ?
Comparing the result of EXPLAIN ANALYSE applied to your query and my query, we can see in dbfiddle that my query should be about 5 time faster than yours (total cost = 4.6 versus 22). Now your data model with related_id of type integer[] is not good for the performances ... You can try to create a GiST or GIN index on the related_id column and see if it acccelerates the query.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.