
I've imported a CSV into a table in BigQuery, and some columns hold a string that really should be a list of floats, so I'm trying to convert these strings to arrays. Thanks to SO I've managed to convert a string to a list of floats, but now I get one row per element of the list instead of one row per original row, i.e. the array is "unnested". That's a problem, as it would generate a huge amount of duplicated data. Does anyone know how to do the conversion from STRING to ARRAY<FLOAT64>?

partial code:

with tbl as (
  select "id1" as id, 
  "10000\t10001\t10002\t10003\t10004" col1_str, 
  "10000\t10001.1\t10002\t10003\t10004" col2_str 
)
select id, cast(elem1 as float64) col1_floatarray, cast(elem2 as float64) col2_floatarray
from tbl
, unnest(split(col1_str, "\t")) elem1
, unnest(split(col2_str, "\t")) elem2

expected:
1 row, with 3 columns of types STRING id, ARRAY<FLOAT64> col1_floatarray, ARRAY<FLOAT64> col2_floatarray

Thank you!

1 Answer

Use the query below:

select id, 
  array(select cast(elem as float64) from unnest(split(col1_str, "\t")) elem) col1_floatarray, 
  array(select cast(elem as float64) from unnest(split(col2_str, "\t")) elem) col2_floatarray
from tbl               

If applied to the sample data in your question, the output is a single row for id1, with col1_floatarray and col2_floatarray each returned as an ARRAY<FLOAT64>.


1 Comment

And it works, thank you again! Obviously I'm a little confused about where to put unnest, in the select part or in the from part...
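On the positioning of unnest: as far as I understand it, an unnest in the outer from clause is joined against each row of tbl, so every element becomes its own output row, whereas an unnest inside an array(select ...) subquery is expanded only within that subquery and the result is collected back into one array per row. Here is a minimal sketch of both forms, assuming BigQuery Standard SQL and a cut-down version of the sample tbl from the question (one string column only). The `with offset pos` and `order by pos` parts are my addition to make the element order explicit; they are not in the answer above.

```sql
-- Form 1: unnest inside an array subquery.
-- The expansion stays inside the subquery, so tbl keeps one row per id.
with tbl as (
  select "id1" as id,
  "10000\t10001\t10002\t10003\t10004" col1_str
)
select id,
  array(
    select cast(elem as float64)
    from unnest(split(col1_str, "\t")) elem with offset pos
    order by pos  -- keep the original element order explicitly
  ) col1_floatarray
from tbl;

-- Form 2: unnest in the outer from clause.
-- Each element becomes its own row, so the rows have to be
-- collapsed again with group by + array_agg to get one array per id.
with tbl as (
  select "id1" as id,
  "10000\t10001\t10002\t10003\t10004" col1_str
)
select id,
  array_agg(cast(elem as float64) order by pos) col1_floatarray
from tbl, unnest(split(col1_str, "\t")) elem with offset pos
group by id;
```

Both should return one row per id. The second form gets awkward with two string columns, because unnesting both in the from clause multiplies the rows (5 x 5 for the sample data), which is why the subquery form is usually the simpler choice here.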
