1

Let's assume that I have a table containing four columns

  ID_list               array(int)
  name_list             array(string)
  profile_scores        map(int, array(map(int, float)))
  profile_scores2       map(int, array(array(int)))

For instance

ID_list      name_list       profile_scores               profile_scores2
[1, 2]       ['A', 'B']      {11: [{0:10.5}, {1:10.2}]}      {7: [[4, 5], [7, 8, 9]]}

the key of profile_scores, profile_scores2 columns are unique. There's only one key for those columns. However, the array contains as many elements as ID_list.

Basically in the above case, the below four are equal

   Length(ID_list)
   Length(name_list)
   Length(profile_scores[11])
   Length(profile_scores2[7])

I want to split it, so that the each element of ID_list is a separate row. For instance, in the above case

 ID_list      name_list       profile_scores               profile_scores2
 1             'A'            {11: {0:10.5}}               {7: [4, 5]}
 2             'B'            {11: {1:10.2}}               {7: [7, 8, 9]}

so the new data types would be

  ID_list               int
  name_list             string
  profile_scores        map(int, map(int, float))
  profile_scores2       map(int, array(int))

I feel like CROSS JOIN / UNNEST could work, but I'm not sure how to preserve the key for profile_scores, profile_scores2 columns and just split the array in the value.

Is this operation doable in SQL? If so, how should I apply CROSS JOIN / UNNEST (or some other operations) to complete this?

3
  • tried CROSS JOIN / UNNEST . feel like i'm missing something Commented May 30, 2024 at 4:24
  • What database/query engine are you using? Commented May 30, 2024 at 4:33
  • @GuruStron presto Commented May 30, 2024 at 4:34

1 Answer 1

2

Presto/Trino allows passing multiple columns to the unnest:

UNNEST can also be used with multiple arguments, in which case they are expanded into multiple columns, with as many rows as the highest cardinality argument (the other columns are padded with nulls)

select id, name, profile_scores, profile_scores2
from table
   , unnest(id_list, name_list) as t(id, name);   

which basically result in the unnested columns flattened based on the index (what seems exactly what you need).

UPD

Presto allows also flattening maps into key-value pairs (tested with Trino):

select id, name, k, v, k1, v1
from table
   , unnest(id_list, name_list, profile_scores, profile_scores2) as t(id, name, k, v, k1, v1);   
Sign up to request clarification or add additional context in comments.

5 Comments

Thanks. I think this probably works, but it keeps the data type for profile_scores, profile_scores2 as map(int, array(map(int, float))), map(int, array(array(int))) instead of map(int, map(int, float)), map(int, array(int)) is there a way to change this?
@user98235 from your question I understood that this is what is needed
I was wondering if there's a way to change data types as well. (in the question, I was asking the new data types to ID_list int name_list string profile_scores map(int, map(int, float)) profile_scores2 map(int, array(int))
@user98235 please see the update, but you need to provide actual desired output for your sample data (current one is achieved with the original query).
thanks for the answer. I actually have a follow up question - stackoverflow.com/questions/78553217/… could you also take a look? thanks.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.