0

Starting with a table like this

WITH table AS (
SELECT 1001 as ID, 1 As Color_blue, 0  AS Color_red,  0 AS Color_black UNION ALL
SELECT 1002 as ID, 0 As Color_blue, 0  AS Color_red,  1 AS Color_black UNION ALL
SELECT 1003 as ID, 0 As Color_blue, 1  AS Color_red,  0 AS Color_black UNION ALL
SELECT 1004 as ID, 0 As Color_blue, 0  AS Color_red,  1 AS Color_black )

SELECT *
FROM table

enter image description here

I want to aggregate all the features columns and build an array as following:

select ID, array<float64>[Color_blue, Color_red, Color_black] features
from table

enter image description here

However, I want to include all the columns names dynamically without hard coding them in my query. The output should stay the same. How do I create an array of floats with all the features, assuming I don't know the column names (only "ID")?

1 Answer 1

1

Below is for BigQuery Standard SQL

#standardSQL
SELECT id, 
  ARRAY(
    SELECT CAST(val AS FLOAT64) 
    FROM UNNEST(SPLIT(TRIM(FORMAT('%t', t), '()'))) val 
    WHERE CAST(id AS STRING) != val
  ) AS features
FROM `project.dataset.table` t   

when applied to sample data from your question - result is

Row id      features     
1   1001    1.0  
            0.0  
            0.0  
2   1002    0.0  
            0.0  
            1.0  
3   1003    0.0  
            1.0  
            0.0  
4   1004    0.0  
            0.0  
            1.0  
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.