I have a table with a JSON string column:

UserID  json_string
100     [{"id": 77379513, "value": "35.4566", "os_type": null, "amount": "200", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "same'}]
100     [{"id": 77379514, "value": "38.658", "os_type": null, "amount": "100", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "niko'}]
100     [{"id": 77379515, "value": "40.569", "os_type": null, "amount": "150", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "koko'}]
200     [{"id": 77378899, "value": "25.365", "os_type": null, "amount": "100", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "same'}]
200     [{"id": 77378900, "value": "35.898", "os_type": null, "amount": "500", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "niko'}]
200     [{"id": 77378901, "value": "41.258", "os_type": null, "amount": "400", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "koko'}]

In the end, I need to convert the string into columns:

UserID  ID         value    os_type   amount    created_at                  updated_at                  Type_name
100    77379513    35.4566  null    200    2020-08-16T14:48:27.611-04:00    2020-08-16T14:48:27.611-04:00   same
100    77379514    38.658   null    100    2020-08-16T14:48:27.611-04:00    2020-08-16T14:48:27.611-04:01   niko
100    77379515    40.569   null    150    2020-08-16T14:48:27.611-04:00    2020-08-16T14:48:27.611-04:02   koko
200    77378899   25.365    null    100    2020-09-16T14:48:27.611-04:01    2020-08-17T14:48:27.611-04:03   same
200    77378900   35.898    null    500    2020-09-16T14:48:27.611-04:02    2020-08-17T14:48:27.611-04:04   niko
200    77378901   41.258    null    400    2020-09-16T14:48:27.611-04:03    2020-08-17T14:48:27.611-04:05   koko

First, I tried to extract the JSON objects from the array:

SELECT UserID, json_extract_array(json_string) as json_array
FROM `project.dataset.table` 

That gives me a table like this:

UserID  json_array
100     {"id": 77379513, "value": "35.4566", "os_type": null, "amount": "200", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "same'}
100     {"id": 77379514, "value": "38.658", "os_type": null, "amount": "100", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "niko'}
100     {"id": 77379515, "value": "40.569", "os_type": null, "amount": "150", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "koko'}
200     {"id": 77378899, "value": "25.365", "os_type": null, "amount": "100", "created_at": "2020-09-16T14:48:27.611-04:00", "updated_at": "2020-08-17T14:48:27.836-04:00", "Type_name": "same'}
200     {"id": 77378900, "value": "35.898", "os_type": null, "amount": "500", "created_at": "2020-09-16T14:48:27.611-04:00", "updated_at": "2020-08-17T14:48:27.836-04:00", "Type_name": "niko'}
200     {"id": 77378901, "value": "41.258", "os_type": null, "amount": "400", "created_at": "2020-09-16T14:48:27.611-04:00", "updated_at": "2020-08-17T14:48:27.836-04:00", "Type_name": "koko'}

Next, I tried to apply JSON_EXTRACT_SCALAR to this result, but I get an error saying that the function does not work with an array. What is the correct way to extract the data into columns?

  • Out of curiosity, why store this data in JSON at all? It looks like every entry has the same fields. Why not just make a table with real columns named the same as those fields? Commented Mar 14, 2021 at 15:56
  • By the way, I wondered why the lines alternate in color due to the syntax highlighting, and I noticed you used ' in one place instead of ". Keep in mind that these quote characters are not interchangeable in JSON. You must use " consistently. Commented Mar 14, 2021 at 15:57

1 Answer

The query below should work for you:

select UserID, 
  json_extract_scalar(json, '$.id') as id,
  json_extract_scalar(json, '$.value') as value,
  json_extract_scalar(json, '$.os_type') as os_type,
  json_extract_scalar(json, '$.amount') as amount,
  json_extract_scalar(json, '$.created_at') as created_at,
  json_extract_scalar(json, '$.updated_at') as updated_at,
  json_extract_scalar(json, '$.Type_name') as Type_name
from `project.dataset.table`,
unnest(json_extract_array(json_string, '$')) json       
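To illustrate why this works: json_extract_array returns an array of JSON-formatted strings, and the comma followed by unnest cross-joins each source row with the elements of its own array, so json_extract_scalar is applied to one object at a time. The sketch below is a minimal self-contained example; the CTE name sample, the alias item, and the two-element array are made up for illustration and are not taken from the question's data.

```sql
-- Hypothetical sample: one row whose JSON array holds two objects.
WITH sample AS (
  SELECT 100 AS UserID,
    '[{"id": 1, "Type_name": "same"}, {"id": 2, "Type_name": "niko"}]' AS json_string
)
SELECT
  UserID,
  -- item is a single JSON object (as a STRING), so scalar extraction is valid here
  JSON_EXTRACT_SCALAR(item, '$.id') AS id,
  JSON_EXTRACT_SCALAR(item, '$.Type_name') AS Type_name
FROM sample,
-- one output row per array element
UNNEST(JSON_EXTRACT_ARRAY(sample.json_string, '$')) AS item
```

This should return two rows for the single input row, one per object in the array. Note that json_extract_scalar returns every extracted value as a STRING, including the numeric id.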

To apply it to the sample data in your question, put the following CTE in front of the query above:

with `project.dataset.table` as (
  select 100 UserID, '[{"id": 77379513, "value": "35.4566", "os_type": null, "amount": "200", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "same"}]' json_string union all
  select 100, '[{"id": 77379514, "value": "38.658", "os_type": null, "amount": "100", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "niko"}]' union all
  select 100, '[{"id": 77379515, "value": "40.569", "os_type": null, "amount": "150", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "koko"}]' union all
  select 200, '[{"id": 77378899, "value": "25.365", "os_type": null, "amount": "100", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "same"}]' union all
  select 200, '[{"id": 77378900, "value": "35.898", "os_type": null, "amount": "500", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "niko"}]' union all
  select 200, '[{"id": 77378901, "value": "41.258", "os_type": null, "amount": "400", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "koko"}]' 
)

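The output is:

[image: query result, one row per JSON object, with columns UserID, id, value, os_type, amount, created_at, updated_at, Type_name]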
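Since json_extract_scalar always returns STRING, every extracted column is a string. If typed columns are wanted, one possible approach is to wrap each extraction in SAFE_CAST, which yields NULL instead of an error when a value cannot be converted. The following is only a sketch: the target types (INT64, NUMERIC, TIMESTAMP) are assumptions inferred from the sample values, and the CTE name sample and alias item are illustrative.

```sql
-- Assumed target types; adjust them to the real data.
WITH sample AS (
  SELECT 100 AS UserID,
    '[{"id": 77379513, "value": "35.4566", "os_type": null, "amount": "200", "created_at": "2020-08-16T14:48:27.611-04:00", "updated_at": "2020-08-16T14:48:27.836-04:00", "Type_name": "same"}]' AS json_string
)
SELECT
  UserID,
  SAFE_CAST(JSON_EXTRACT_SCALAR(item, '$.id') AS INT64) AS id,
  SAFE_CAST(JSON_EXTRACT_SCALAR(item, '$.value') AS NUMERIC) AS value,
  -- a JSON null is returned as SQL NULL
  JSON_EXTRACT_SCALAR(item, '$.os_type') AS os_type,
  SAFE_CAST(JSON_EXTRACT_SCALAR(item, '$.amount') AS INT64) AS amount,
  SAFE_CAST(JSON_EXTRACT_SCALAR(item, '$.created_at') AS TIMESTAMP) AS created_at,
  SAFE_CAST(JSON_EXTRACT_SCALAR(item, '$.updated_at') AS TIMESTAMP) AS updated_at,
  JSON_EXTRACT_SCALAR(item, '$.Type_name') AS Type_name
FROM sample,
UNNEST(JSON_EXTRACT_ARRAY(sample.json_string, '$')) AS item
```

SAFE_CAST is used rather than CAST so that a single malformed value does not fail the whole query; whether silently producing NULL is acceptable depends on the use case.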

Note: you used ' in a few places instead of ", so this is "fixed" in the sample data used above.

If you do not have control over the values in this table and cannot change ' to ", you can use the query below instead:

select UserID, 
  json_extract_scalar(json, '$.id') as id,
  json_extract_scalar(json, '$.value') as value,
  json_extract_scalar(json, '$.os_type') as os_type,
  json_extract_scalar(json, '$.amount') as amount,
  json_extract_scalar(json, '$.created_at') as created_at,
  json_extract_scalar(json, '$.updated_at') as updated_at,
  json_extract_scalar(json, '$.Type_name') as Type_name
from `project.dataset.table`,
unnest(json_extract_array(replace(json_string, "'", '"'), '$')) json 

Note the change inside unnest, which takes care of the issue with '.
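One caveat with replacing every ': an apostrophe that legitimately appears inside a value would also be turned into ", which can break the JSON. If the stray ' only ever appears immediately before the closing }, as it does in the sample rows, a narrower replacement may be safer. This is a sketch under that assumption; the pattern '} is inferred from the question's data, and the CTE name sample and alias item are illustrative.

```sql
-- Assumption: the stray single quote always sits directly before the closing brace.
WITH sample AS (
  SELECT 100 AS UserID,
    '''[{"id": 77379513, "Type_name": "same'}]''' AS json_string
)
SELECT
  UserID,
  JSON_EXTRACT_SCALAR(item, '$.id') AS id,
  JSON_EXTRACT_SCALAR(item, '$.Type_name') AS Type_name
FROM sample,
-- only the quote in front of } is rewritten; other apostrophes stay untouched
UNNEST(JSON_EXTRACT_ARRAY(REPLACE(sample.json_string, "'}", '"}'), '$')) AS item
```

The triple-quoted literal is used here only so that the sample string can contain both " and ' without escaping.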
