1

my SQLite database is kinda like this:

startn rel endn
immunodeficiencies includes B-cell deficiencies
B-cell deficiencies includes Bruton agammaglobulinemia
immunodeficiencies includes T-cell deficiencies
T-cell deficiencies includes DiGeorge syndrome

There's a lot more values, but that's the gist.

I'm trying to make a recursive query that gets all the terminal child nodes of immunodeficiencies. Ie, for this example I'd like Bruton agammaglobulinemia, DiGeorge syndrome

Based on another answer, here's what I have so far, but it's not working:

with recursive tc( i )
  as ( select endn from dx_graph where startn = 'immunodeficiencies' and rel = 'includes'
        union endn from dx_graph, tc where startn = tc.i and rel = 'includes'
     )
 select * from tc;

sample data:

CREATE TABLE "dx_graph" (
    "startn"    TEXT,
    "rel"   TEXT,
    "endn"  TEXT,
)

insert into dx_graph values ('immunodeficiencies', 'includes', 'B-cell deficiencies')
insert into dx_graph values ('immunodeficiencies', 'includes', 'T-cell deficiencies')
insert into dx_graph values ('immunodeficiencies', 'includes', 'complement disorders')
insert into dx_graph values ('B-cell deficiencies', 'includes', 'Bruton agammaglobulinemia')
insert into dx_graph values ('T-cell deficiencies', 'includes', 'DiGeorge syndrome')
insert into dx_graph values ('complement disorders', 'includes', 'C1 esterase inhibitor deficiency')
insert into dx_graph values ('complement disorders', 'includes', 'C5-C9 deficiency')
4
  • I know that what I really want is a graph database, but honestly I tried Neo4J and it's inserting new values was just super clunky. Commented Jan 17, 2021 at 1:46
  • Please define what you mean by "child node". Desired results would help. Commented Jan 17, 2021 at 1:52
  • Just added it, thanks Commented Jan 17, 2021 at 1:55
  • Please provide sample data as plain text, preferably a create table and insert statements for easy importing into a test database. Commented Jan 17, 2021 at 2:05

1 Answer 1

1

Your recursive query will find all child nodes of the original specified parent; you then need to filter them based on whether they have any child nodes or not. You can do this with a NOT EXISTS clause:

WITH RECURSIVE tc AS (
  SELECT endn
  FROM dx_graph 
  WHERE startn = 'immunodeficiencies' AND rel = 'includes'
  UNION ALL
  SELECT dx.endn
  FROM dx_graph dx
  JOIN tc ON dx.startn = tc.endn AND dx.rel = 'includes'
)
SELECT endn
FROM tc
WHERE NOT EXISTS (
  SELECT *
  FROM dx_graph dx
  WHERE dx.startn = tc.endn AND dx.rel = 'includes'
)

Output (for sample data in question):

endn
Bruton agammaglobulinemia
DiGeorge syndrome
C1 esterase inhibitor deficiency
C5-C9 deficiency

Demo on db-fiddle

The NOT EXISTS clause filters out any nodes from the recursive CTE that have a child node (i.e. the endn value from the CTE matches a startn value in the original table that has a rel='includes'). That way you only get the terminal nodes, as they have no children.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.