I have a table that stores JSON documents as strings:

CREATE TABLE TABLE_JSON (
  json_body string
 );

Each JSON document has this structure:

{ obj1: { fields ... },  obj2: [array] }

I want to select all elements of the array in obj2, but I can't work out how.

For example, I can get all fields from first object:

SELECT f.fields...
    FROM (
        SELECT q1.obj1, q1.obj2
        FROM TABLE_JSON jt
        LATERAL VIEW JSON_TUPLE(jt.json_body, 'obj1', 'obj2') q1 AS obj1, obj2
      ) as json_table2
    LATERAL VIEW JSON_TUPLE(json_table2.obj1, 'fields...') f AS fields...;

But this method doesn't work for the array.

I've tried to use

...
    LATERAL VIEW explode(json_table2.obj2) adTable AS arr;

(see the Hive explode documentation)

But obj2 is a string that contains a JSON array, not a Hive array. How can I convert this JSON string to an array and explode it?

  • Is the size of the array fixed across rows? Commented May 29, 2018 at 10:57

3 Answers

The json_split UDF from Brickhouse ( http://github.com/klout/brickhouse ) can convert a JSON array to a Hive List, and then you can explode that.

See http://mail-archives.apache.org/mod_mbox/hive-user/201406.mbox/%3CCAO78EnLgSrrUY3Ad_ZWS9zWNKLQRwS9jXrqEE869FhUNiWgCXA@mail.gmail.com%3E and https://brickhouseconfessions.wordpress.com/2014/02/07/hive-and-json-made-simple/
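As a minimal sketch of this approach, the query below registers json_split and explodes its result. It assumes the Brickhouse jar has been built and is reachable at the hypothetical path shown, and that the UDF class is brickhouse.udf.json.JsonSplitUDF as in the Brickhouse repository; the aliases arr_item and adTable are illustrative only.

```sql
-- Assumption: hypothetical jar location; adjust to wherever Brickhouse is deployed.
ADD JAR /path/to/brickhouse.jar;

-- Register the UDF that turns a JSON array string into a Hive array<string>.
CREATE TEMPORARY FUNCTION json_split AS 'brickhouse.udf.json.JsonSplitUDF';

-- First pull obj2 out as a string, then split it and emit one row per element.
SELECT arr_item
FROM TABLE_JSON jt
LATERAL VIEW JSON_TUPLE(jt.json_body, 'obj2') q1 AS obj2
LATERAL VIEW explode(json_split(q1.obj2)) adTable AS arr_item;
```

Each arr_item is still a string, so if the array holds objects, it can be passed to JSON_TUPLE or get_json_object again to reach the nested fields.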

Consider using the Hive-JSON-Serde to read the JSON data directly into typed columns.

See: https://github.com/rcongiu/Hive-JSON-Serde
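A rough sketch of what this could look like follows. It assumes the raw data is stored as text files with one JSON document per line, that the SerDe jar sits at the hypothetical path shown, and that obj2 holds strings. The table name table_json_serde, the location, and the field field1 inside obj1 are all made up for illustration, since the question does not list the real fields.

```sql
-- Assumption: hypothetical jar location for the Hive-JSON-Serde build.
ADD JAR /path/to/json-serde-with-dependencies.jar;

-- Hypothetical table: the SerDe maps JSON keys to typed Hive columns,
-- so obj2 becomes a real array instead of a string.
CREATE EXTERNAL TABLE table_json_serde (
  obj1 struct<field1:string>,  -- field1 is a placeholder for the real fields
  obj2 array<string>           -- assumes the array elements are strings
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION '/path/to/json/files';

-- With a native array column, explode works without any string handling.
SELECT arr_item
FROM table_json_serde
LATERAL VIEW explode(obj2) adTable AS arr_item;
```

The trade-off is that the schema has to be declared up front, whereas the string-based approaches work on the existing TABLE_JSON table as it is.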

This may not be the optimal solution, but it can unblock you. For a JSON object that looks like this:

'{"obj1":"field1","obj2":["a1","a2","a3"]}'

the following query extracts each array item into its own column, provided the size of the array is constant across all rows.

    SELECT split(results,",")[0] AS arrayItem1,
       split(results,",")[1] AS arrayItem2,
       regexp_replace(split(results,",")[2], "[\\]|}]", "") AS arrayItem3
    FROM
       (SELECT split(translate(get_json_object(TABLE_JSON.json_body,'$.obj2'), '"\\[|]|\""',''), "},") AS r
       FROM TABLE_JSON) t1 LATERAL VIEW explode(r) rr AS results

It produces a result that looks like this:

arrayitem1| arrayitem2| arrayitem3
a1        | a2        | a3

You can extend it to an array of any size, on the condition that the size is constant across the table.
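If the array size varies, a similar string-based trick can return one row per element instead of one column per element. The sketch below uses only built-in functions and assumes obj2 is a non-empty array of plain strings whose values do not themselves contain the sequence "," (quote, comma, quote); the aliases arr_item and adTable are illustrative.

```sql
-- get_json_object returns obj2 as a string such as ["a1","a2","a3"].
-- regexp_replace strips the leading [" and the trailing "],
-- and split then cuts on the "," separators between elements.
SELECT arr_item
FROM TABLE_JSON jt
LATERAL VIEW explode(
  split(
    regexp_replace(
      get_json_object(jt.json_body, '$.obj2'),
      '^\\[\\s*"|"\\s*\\]$',
      ''
    ),
    '"\\s*,\\s*"'
  )
) adTable AS arr_item;
```

This is plain string manipulation rather than JSON parsing, so for arrays of objects or values with escaped quotes a JSON-aware UDF or SerDe is the safer choice.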
