I need to build an array of structs in Hive, so that all the structs belonging to one ID are collected into a single array.
I have created the table below:
CREATE TABLE if not exists tbl1
(
sess_id STRING
, source_start_ts STRING
, source_end_ts STRING
,Node_String STRUCT < Node: STRING, Time_Spent:STRING,
Txn_Type: STRING, Txn_Status: STRING, Call_Status: STRING >
)
STORED AS ORC
I then created a second table that holds an array of the above struct. (I could have put the array of structs in the first table, but I tried that too and it failed.)
CREATE TABLE if not exists tbl2
(
sess_id STRING
,Col2 Array<STRUCT < Node: STRING,
Time_Spent:STRING, Txn_Type: STRING, Txn_Status: STRING,
Call_Status: STRING >>
) STORED AS ORC
However, when I use collect_set as below to populate it, I get an error:
insert into table tbl2
select sess_id
, collect_set(Node_String) as Col2
from tbl1
where sess_id = 'abc'
group by sess_id
Here is the error:
SQL Error [40000] [42000]: Error while compiling statement: FAILED: UDFArgumentTypeException Only primitive type arguments are accepted but struct was passed as parameter 1.
I guess collect_set does not accept struct types. Is there a function that does?
Here is an example of the data:
id, source_start_dt, source_end_dt, Node_string
1,'2019-01-01','2019-01-02' , {"node1","10s","activation", "123", "failed"}
1,'2019-01-01','2019-01-02', {"node2","120s","activation", "123", "Logged"}
1,'2019-01-01','2019-01-02', {"node3","450s","activation", "123", "completed"}
As you can see above, each ID has multiple Node_String values, each with different struct field values. ID '1' has 3 rows, and to roll those three rows up into one, I used collect_set.
Thanks