
I have updated Hive from version 0.20 to 0.13.1.

I'm using the following table and queries to extract JSON from S3.

Table:

    CREATE EXTERNAL TABLE in_app_logs (
        event string,
        app_id string,
        idfa string,
        idfv string
    )
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY '\t'
    LOCATION 's3://test/in_app_logs/ds=2015-04-20/';

My query for version 0.20 looks like the one below, and it works fine with the old version.

    SELECT
       get_json_object(in_app_logs.event, '$.ev') as event_type,
       get_json_object(in_app_logs.event, '$.global.app_id') as app_id,
       get_json_object(in_app_logs.event, '$.global.ios.idfa') as idfa,
       get_json_object(in_app_logs.event, '$.global.ios.idfv') as idfv
    FROM in_app_logs;

In the new version this has changed to json_tuple. I tried the following query on the updated version, but it returns an error.

    SELECT b.event_type, c.app_id, d.idfa, d.idfv
    FROM in_app_logs a
    LATERAL VIEW json_tuple(a.event, 'ev') b as event_type,
    LATERAL VIEW json_tuple(a.event.global, 'app_id') c as app_id,
    LATERAL VIEW json_tuple(a.event.global.ios, 'idfa', 'idfv') d as idfa, idfv

S3 Logs:

   {
      "installed_at": "2015-04-17T12:10:24Z",
      "ev": "event_install",
      "global": {
        "ios": {
          "idfv": "887DF776-C1FC-4567-DESF-741AC72197D1",
          "time_zone": "EDT",
          "model": "iPhone7,2",
          "screen_size": "320x568",
          "carrier": "AT&T",
          "language": "en",
          "idfa": "CD04291C-0D80-4377-6CS9-B46089A05F15",
          "os_version": "8.2.0",
          "country": "US"
        }

Can anyone help me to extract the json data?

  • Do you have sample output from the old version? Can you share the error/log you are getting? What is the file format? Is it a .json file? Commented Apr 26, 2015 at 18:02
  • What error? Please add it to the post. Commented May 31, 2015 at 18:56

1 Answer

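The '.' operator is only supported for structs or lists of structs. You are applying it to a STRING column (a.event), which is why the query fails. The commas after each LATERAL VIEW clause are also invalid: consecutive LATERAL VIEW clauses are written one after another with no separator.

json_tuple only extracts top-level keys, so for nested fields, pull out the enclosing object as a string and pass it to another json_tuple. You probably need something like this (global_json and ios_json are just alias names):

    SELECT b.event_type, c.app_id, d.idfa, d.idfv
    FROM in_app_logs a
    -- 'global' comes back as a JSON string that the next json_tuple can parse
    LATERAL VIEW json_tuple(a.event, 'ev', 'global') b AS event_type, global_json
    LATERAL VIEW json_tuple(b.global_json, 'app_id', 'ios') c AS app_id, ios_json
    LATERAL VIEW json_tuple(c.ios_json, 'idfa', 'idfv') d AS idfa, idfv;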
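
For comparison, here is a sketch using get_json_object, which is still available in Hive 0.13.1 and accepts nested paths such as '$.global.ios.idfa'. It runs against an inline JSON literal so it can be tried without the S3 table. The aliases s and js are illustrative, and the app_id value is a placeholder, because the sample log in the question is cut off before that field. The subquery without a FROM clause assumes Hive 0.13.0 or later.

```sql
-- Sketch: extract nested fields with get_json_object from an inline literal.
-- 'example_app' is a made-up placeholder; the other values come from the sample log.
SELECT
    get_json_object(s.js, '$.ev')              AS event_type,
    get_json_object(s.js, '$.global.app_id')   AS app_id,
    get_json_object(s.js, '$.global.ios.idfa') AS idfa,
    get_json_object(s.js, '$.global.ios.idfv') AS idfv
FROM (
    SELECT '{"ev":"event_install","global":{"app_id":"example_app","ios":{"idfa":"CD04291C-0D80-4377-6CS9-B46089A05F15","idfv":"887DF776-C1FC-4567-DESF-741AC72197D1"}}}' AS js
) s;
```

If this returns the expected values, the same SELECT list should work against in_app_logs.event. json_tuple is mainly worth switching to when parsing the same string several times becomes a performance concern.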