0

I have a Postgres query that returns values for the following. This query returns one data for the id.

select h ->> 'dw:address_line_1' as address, h ->> 'dw:city' as city, h -> 'dw:region' -> 'dw:ID' -> 2 -> '#text' as state, h ->'dw:Country_ISO_Code' as country
from testschema.dw_data_job t, json_array_elements(t.addresses::json) h
where t.id = '12345'
and h -> 'dw:address_type' -> 'dw:ID' -> 1 ->> '#text' = 'BUSINESS'

Now I want to get BUSINESS address for entire population in the table dw_data_job. I changed query to as follows.

select h ->> 'dw:address_line_1' as address, h ->> 'dw:city' as city, h -> 'dw:region' -> 'dw:ID' -> 2 -> '#text' as state, h ->'dw:Country_ISO_Code' as country
from testschema.dw_data_job t, testschema.dw_data_demo t1, json_array_elements(t.addresses::json) h
where t.id = t1.id
and h -> 'dw:address_type' -> 'dw:ID' -> 1 ->> '#text' = 'BUSINESS' 

Now I get error as follows:

ERROR:  cannot call json_array_elements on a non-array
CONTEXT:  parallel worker
SQL state: 22023

The type of column dw_data_job.addresses is text and data looks like this - when the JSON value is an array:

[{"dw:address_line_1": "123 Inter Pkwy", "dw:city": "Richardson", "dw:region": {"@dw:Descriptor": "Texas", "dw:ID": [{"@dw:type": "WID", "#text": "fc77e3a1ab36487f9646d14f7242dd77"}, {"@dw:type": "Country_Region_ID", "#text": "USA-TX"}, {"@dw:type": "ISO_3166-2_Code", "#text": "TX"}]}, "dw:region_subdivision_1": "Dallas", "dw:postal_code": "75081", "dw:Country_ISO_Code": "USA", "dw:address_type": {"@dw:Descriptor": "Business", "dw:ID": [{"@dw:type": "WID", "#text": "4fae289a7fe541b098ca9448e462ff6b"}, {"@dw:type": "Communication_Usage_Type_ID", "#text": "BUSINESS"}]}, "dw:primary": "1"}, {"dw:address_line_1": "567 South Dr", "dw:city": "Plano", "dw:region": {"@dw:Descriptor": "Texas", "dw:ID": [{"@dw:type": "WID", "#text": "fc77e3a1ab36487f9646d14f7242dd77"}, {"@dw:type": "Country_Region_ID", "#text": "USA-TX"}, {"@dw:type": "ISO_3166-2_Code", "#text": "TX"}]}, "dw:region_subdivision_1": "Collin", "dw:postal_code": "75024", "dw:Country_ISO_Code": "USA", "dw:address_type": {"@dw:Descriptor": "Home", "dw:ID": [{"@dw:type": "WID", "#text": "836cf00ef5974ac08b786079866c946f"}, {"@dw:type": "Communication_Usage_Type_ID", "#text": "HOME"}]}, "dw:primary": "1"}, {"dw:address_line_1": "789 North Dr.", "dw:city": "Plano", "dw:region": {"@dw:Descriptor": "Texas", "dw:ID": [{"@dw:type": "WID", "#text": "fc77e3a1ab36487f9646d14f7242dd77"}, {"@dw:type": "Country_Region_ID", "#text": "USA-TX"}, {"@dw:type": "ISO_3166-2_Code", "#text": "TX"}]}, "dw:region_subdivision_1": "Collin", "dw:postal_code": "75024", "dw:Country_ISO_Code": "USA", "dw:address_type": {"@dw:Descriptor": "Home-Vac", "dw:ID": [{"@dw:type": "WID", "#text": "836cf00ef5974ac08b786079866c946f"}, {"@dw:type": "Communication_Usage_Type_ID", "#text": "HOME-VAC"}]}, "dw:communication_usage_behavior": {"@dw:Descriptor": "Mailing", "dw:ID": [{"@dw:type": "WID", "#text": "bea4505497c901ea53792e2628077617"}, {"@dw:type": "Communication_Usage_Behavior_Tenanted_ID", "#text": "MAILING"}]}, "dw:primary": "0"}]

Another sample value with a non-array (object):

{"dw:address_line_1": "123 Local Pkwy", "dw:city": "Cary", "dw:region": {"@dw:Descriptor": "North Carolina", "dw:ID": [{"@dw:type": "WID", "#text": "1486a0a4a8464c3b9ec482d4038deb99"}, {"@dw:type": "Country_Region_ID", "#text": "USA-NC"}, {"@dw:type": "ISO_3166-2_Code", "#text": "NC"}]}, "dw:region_subdivision_1": "Wake", "dw:postal_code": "27513", "dw:Country_ISO_Code": "USA", "dw:address_type": {"@dw:Descriptor": "Business", "dw:ID": [{"@dw:type": "WID", "#text": "4fae289a7fe541b098ca9448e462ff6b"}, {"@dw:type": "Communication_Usage_Type_ID", "#text": "BUSINESS"}]}, "dw:primary": "1"}

The query needs to accommodate both array and non-array json data to bring back address, city, state and country code for all rows of data for where the nested key #text has the value 'BUSINESS'.

5
  • The error message is telling you that "addresses" doesn't always look like that. You can use json_typeof(t.addresses::json) to find the ones that don't. Commented Aug 8, 2021 at 15:53
  • Thank you Adrian. The function json_typeof() returns the type of the outermost JSON value as a string. But I want to iterate through each json array element of the t.addresses::json. All addresses are staring with [ and ending with ]. Based on this we always have at least one element per row. Commented Aug 8, 2021 at 19:03
  • 1
    For the example you show us, json_typeof(t.addresses::json) is certainly not 'string'. It is 'array'. dbfiddle.uk/… Commented Aug 8, 2021 at 20:56
  • "Potgres" would probably be more relaxed, but Postgres is serious about errors. ;) Commented Aug 8, 2021 at 21:02
  • Thank you Jeff and Erwin, I am trying to combine a query that extracts BUSINESS address from both array and non array. Commented Aug 9, 2021 at 15:39

1 Answer 1

2

Like the error message tells you (and Jeff pointed out) there is one or more rows in the table dw_data_job, where the column addresses does not contain a valid JSON array (or NULL). A valid JSON literal, yes, or we'd see a different error message raised by the failed cast to ::json, but not a JSON array.

Run this query to identify violating rows:

SELECT id, addresses
     , jsonb_pretty(addresses::jsonb) AS js_pretty  -- optional
FROM   dw_data_job
WHERE  json_typeof(addresses::json) IS DISTINCT FROM 'array';

(Includes NULL values in addresses, which would not raise the reported error, but may be a problem, too.)

To just skip rows with invalid data and proceed with your query:

SELECT h ->> 'dw:address_line_1' AS address
     , h ->> 'dw:city' AS city
     , h #>> '{dw:region,dw:ID,2,#text}' AS state
     , h ->> 'dw:Country_ISO_Code' AS country
     , json_typeof(t.addresses::json)
FROM   testschema.dw_data_job t
JOIN   testschema.dw_data_demo t1 USING (id)
CROSS  JOIN json_array_elements(t.addresses::json) h
WHERE  json_typeof(t.addresses::json) = 'array'
AND    h #>> '{dw:address_type,dw:ID,1,#text}' = 'BUSINESS';

I simplified the syntax with the compact operator #> for deeply nested values, and return all text instead of some json (as educated guess).

To include plain objects as well as arrays, you might:

WITH sel AS (  -- compute json & type
   SELECT t.addresses::json AS a, json_typeof(t.addresses::json) AS js_type
   FROM   testschema.dw_data_job t
   JOIN   testschema.dw_data_demo t1 USING (id)
   )
 , obj AS (  -- unnest arrays and union with plain objects
   SELECT json_array_elements(a) AS h FROM sel WHERE js_type = 'array'
   UNION ALL
   SELECT a                           FROM sel WHERE js_type = 'object'
   -- all other types are ignored!
   )
SELECT h ->> 'dw:address_line_1' AS address
     , h ->> 'dw:city' AS city
     , h #>> '{dw:region,dw:ID,2,#text}' AS state
     , h ->> 'dw:Country_ISO_Code' AS country
FROM   obj
WHERE  h #>> '{dw:address_type,dw:ID,1,#text}' = 'BUSINESS';

db<>fiddle here

Sign up to request clarification or add additional context in comments.

4 Comments

Thank you Erwin, now I see it. Given the scenario of both types, non array and array, how do I extract BUSINESS address in a one query? Thank you
@BalajiUppalapati: Please clarify that in the question. Make the objective clear, provide your version of Postgres and sample data to go with it - incl. an row with non-array data.
I added non array json data to original post. The query needs to accommodate both array and non array json data that brings back addressine1, city, state and country code for all rows of data for #text BUSINESS
I have checked json_typeof(t.addresses::json) = 'array' like this. It worked for me

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.