I'm using Postgres 10.13.

This is my table jsongraphs:

id jsongraph
1 { "data": {"scopes_by_id": { "121": { "id": 121, "pk": 121, "name": "Prework" } }, "commonsites_by_id": {"123": {"id": 123, "pk": 123, "name": "Somewhere over the rainbow"}}}}
2 { "data": {"scopes_by_id": { "156": { "id": 156, "pk": 156, "name": "ABC" } }, "commonsites_by_id": {"123": {"id": 123, "pk": 123, "name": "Somewhere over the rainbow"}}}}

I want the distinct pairs of scope id and site id, which should be (121, 123) and (156, 123).

So I tried

SELECT DISTINCT
            jsongraph->'data'->'scopes_by_id'->>'pk' ,
            jsongraph->'data'->'commonsites_by_id'->>'pk' from jsongraphs;

This doesn't work because the path needs to be something like data->scopes_by_id->121->>pk, but I can't know the intermediate key (121 here) beforehand.

Is there a way to get the values of what I need by filling in some kind of wildcard in the path?

E.g. data->scopes_by_id->{*}->>pk, something like that?

And because this is legacy data, it's also hard to change the data itself.

  • Which Postgres version are you using? Commented Jul 7, 2021 at 13:02
  • Just updated the question, thanks for the reminder. To reiterate, it's 10.13. Commented Jul 7, 2021 at 13:03
  • This will be quite complicated with Postgres 10, but is quite easy with 12 (or later) due to the introduction of SQL/JSON path queries. Is the nesting level fixed or can this contain multiple levels? Commented Jul 7, 2021 at 13:06
  • Oh, I see. I guess I have to upgrade to Postgres 12. I don't quite get the question about the nesting level. Can you help me understand with an example? Commented Jul 7, 2021 at 13:10
  • Can the key like scopes_by_id contain another JSON object that also contains scope_by_id? Commented Jul 7, 2021 at 13:16

1 Answer

As the nesting level seems to be fixed, you could do something like this:

select j.id, scopes.*, commonsites.*
from jsongraphs j
  cross join lateral (
     select jsonb_agg(j.jsongraph #> array['data','scopes_by_id', t1.scope_id, 'pk']) as scope_ids
     from jsonb_each_text(j.jsongraph #> '{data,scopes_by_id}') as t1(scope_id)
  ) scopes
  cross join lateral ( 
     select jsonb_agg(j.jsongraph #> array['data','commonsites_by_id', t2.site_id, 'pk']) as common_ids
     from jsonb_each_text(j.jsongraph #> '{data,commonsites_by_id}') as t2(site_id)
  ) commonsites
order by id;

The sub-queries extract all keys below the respective part (e.g. scopes_by_id) and then use the #> operator to access the path for each id inside the original JSON value. Finally, all PK values are aggregated back into a single array.
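To see the key extraction in isolation, here is a minimal sketch that runs jsonb_each_text() against a literal value copied from the question's sample data. The alias t(scope_id) only renames the first output column (the key); the literal and the alias name are purely illustrative.

```sql
-- jsonb_each_text() returns one row per top-level key of the object.
-- The column alias t(scope_id) renames the "key" column to scope_id.
select t.scope_id
from jsonb_each_text(
       '{"121": {"id": 121, "pk": 121, "name": "Prework"}}'::jsonb
     ) as t(scope_id);
```

With this input the query should return a single row containing the key 121 as text, which is exactly the unknown path element the main query plugs into the array passed to #>.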

This returns the PK values from each part separately as an array, in order to handle the situation where you have a different number of "scope ids" and "commonsite ids".

If you just want "the first" id from each section, you can remove the aggregation and use a LIMIT clause:

select j.id, scopes.*, commonsites.*
from jsongraphs j
  cross join lateral (
     select j.jsongraph #> array['data','scopes_by_id', t1.scope_id, 'pk'] as scope_id
     from jsonb_each_text(j.jsongraph #> '{data,scopes_by_id}') as t1(scope_id)
     limit 1
  ) scopes
  cross join lateral ( 
     select j.jsongraph #> array['data','commonsites_by_id', t2.site_id, 'pk'] as common_id
     from jsonb_each_text(j.jsongraph #> '{data,commonsites_by_id}') as t2(site_id)
     limit 1
  ) commonsites
order by id;

I'm not sure at which level you want to apply the "distinct" part for this.
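If the goal is the distinct (scope id, site id) pairs across the whole table, as the question describes, one possible sketch for Postgres 10 is to expand both objects with jsonb_each() and apply DISTINCT to the resulting pairs. This assumes jsongraph is of type jsonb (cast with ::jsonb otherwise) and that every combination of scope and site within a row counts as a pair; the aliases s, c, scope_pk and site_pk are made up for this example.

```sql
-- Sketch: one row per (scope pk, site pk) combination, de-duplicated
-- across all rows of jsongraphs. Assumes jsongraph is jsonb.
select distinct
       s.value ->> 'pk' as scope_pk,
       c.value ->> 'pk' as site_pk
from jsongraphs j
  cross join lateral jsonb_each(j.jsongraph #> '{data,scopes_by_id}') as s(key, value)
  cross join lateral jsonb_each(j.jsongraph #> '{data,commonsites_by_id}') as c(key, value)
order by scope_pk, site_pk;
```

Note that the pk values come back as text here, and that a row without any scopes or without any sites produces no output at all, because the cross join has nothing to combine.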


In Postgres 12 or later, you could achieve the same with:

select j.id, 
       jsonb_path_query_array(j.jsongraph, 'strict $.data.scopes_by_id.**.pk') as scopes,
       jsonb_path_query_array(j.jsongraph, 'strict $.data.commonsites_by_id.**.pk') as common
from jsongraphs j
order by j.id;
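Since the nesting level is fixed, a variant using the single-level wildcard .* instead of the recursive .** may be enough. This is a sketch under that assumption, using the default lax mode; it also requires Postgres 12 or later and a jsonb column.

```sql
-- Sketch: .* matches every value directly below scopes_by_id /
-- commonsites_by_id, so only pk keys at that one level are collected.
select j.id,
       jsonb_path_query_array(j.jsongraph, '$.data.scopes_by_id.*.pk') as scopes,
       jsonb_path_query_array(j.jsongraph, '$.data.commonsites_by_id.*.pk') as common
from jsongraphs j
order by j.id;
```

The difference only matters if a pk key can also appear deeper inside the nested objects: .** would pick those up as well, while .* would not.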

2 Comments

Thank you. The Postgres 10 version looks really complex. I have to spend some time moving the production database to 13 first, so it will take a while before I get back to you. :)
@KimStacks if you just want to take "the first" id of each, just use a limit 1 in the sub-queries. Then you can also get rid of the aggregation done using jsonb_agg()
