I have a Hive table like the one below. The number of elements in each column's array is unpredictable. Can anyone tell me how to explode all columns of this table correctly without losing the NULL values?

+-------------------------------+-------------------+----------------------------+--+
| l1.skillcode                  | l1.duration       |     l1.numberofpeople      |
+-------------------------------+-------------------+----------------------------+--+
| ["ACFC"]                      | ["00020"]         | ["1"]                      |
| ["ACFC"]                      | ["00233"]         | ["1"]                      |
| ["AJBS"]                      | ["00605"]         | ["1"]                      |
| ["ACFC"]                      | ["00020"]         | ["1"]                      |
| ["TESTING"]                   | ["123456"]        | ["09876"]                  |
| ["ACFC"]                      | ["00233","846"]   | ["1"]                      |
| ["AJBS"]                      | ["00605"]         | ["1"]                      |
| ["ACFC"]                      | ["00020"]         | ["1"]                      |
| ["TESTING"]                   | NULL              | ["09876"]                  |
| ["ACFC"]                      | ["00233"]         | NULL                       |
| ["AJBS"]                      | ["00605"]         | ["1"]                      |
| ["ACFC"]                      | ["00020"]         | ["1"]                      |
| ["TESTING"]                   | NULL              | ["09876","09877","09878"]  |
| NULL                          | ["56743"]         | ["45678","345"]            |
| ["ACFC","BES","SAL","EPD"]    | ["00233"]         | ["1"]                      |
| ["AJBS"]                      | ["00605"]         | ["1"]                      |
| NULL                          | ["00020"]         | ["1"]                      |
| ["TESTING"]                   | NULL              | ["09876","09877","09878"]  |
| NULL                          | ["56743"]         | ["45678","345"]            |
| ["ACFC"]                      | ["00020"]         | ["1"]                      |
| ["TESTING"]                   | NULL              | ["09876","09877","09878"]  |
| ["ACFC"]                      | ["00233"]         | ["1"]                      |
| ["AJBS"]                      | ["00605"]         | ["1"]                      |
+-------------------------------+-------------------+----------------------------+--+

When I try the query below, every row where the column I am exploding is NULL is dropped, together with the non-NULL values in the other columns of that row.

select L2.*,t1.duration,t1.numberofpeople from t1 
lateral view explode(t1.skillcode) L2;

How can I explode all columns of the table without losing any NULL values, while also maintaining the relationship between the values of all three columns?

1 Answer

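Use LATERAL VIEW OUTER instead of LATERAL VIEW. With OUTER, Hive still emits a row, with NULL in the exploded column, when the array is NULL or empty, so the other columns of that row are not lost.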
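Applied to the query from the question, a minimal sketch might look like the following. The table t1 and its columns are taken from the question; the column alias skill is only a name chosen here for illustration.

```sql
-- Sketch only: same query as in the question, with OUTER added.
-- "skill" is an illustrative column alias, not a name from the original post.
SELECT
  L2.skill,
  t1.duration,
  t1.numberofpeople
FROM t1
LATERAL VIEW OUTER explode(t1.skillcode) L2 AS skill;
```

Rows whose skillcode is NULL should then come back with NULL in the skill column instead of disappearing.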

hive> select * from L2;
OK
["BES","SAL"]   ["00020","846"] ["1","09876"]
["SEAL"]    []  []
[]  ["0020","0021"] []
Time taken: 0.088 seconds, Fetched: 3 row(s)
hive> select L3.*,L4.*,L5.* from L2  lateral view outer explode(L2.skillcode) L3 lateral view outer explode(L2.duration) L4 lateral view outer explode(L2.numberofpeople) L5;
OK
BES 00020   1
BES 00020   09876
BES 846 1
BES 846 09876
SAL 00020   1
SAL 00020   09876
SAL 846 1
SAL 846 09876
SEAL    NULL    NULL
NULL    0020    NULL
NULL    0021    NULL
Time taken: 0.119 seconds, Fetched: 11 row(s)
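Note that chaining three lateral views gives every combination of elements, which is why the first input row becomes 2 x 2 x 2 = 8 output rows. If "maintaining the relationship" in the question means pairing elements by position (first with first, second with second), one possible approach is to generate the positions once and index into each array. The sketch below is untested against the asker's data and relies on a few assumptions: that size() returns -1 for a NULL array, that indexing past the end of an array or into a NULL array yields NULL, and that the Hive version in use provides posexplode and allows a non-constant array index. The aliases t, idx, i and x are names chosen here for illustration.

```sql
-- Sketch: pair array elements by position instead of cross-joining them.
-- n = length of the longest of the three arrays, but at least 1, so that
-- rows containing only NULL or empty arrays still produce one output row.
-- split(space(n - 1), ' ') yields n empty strings, so posexplode emits
-- the positions 0 .. n-1 in column i (column x is unused).
SELECT
  t.skillcode[idx.i]      AS skillcode,
  t.duration[idx.i]       AS duration,
  t.numberofpeople[idx.i] AS numberofpeople
FROM L2 t
LATERAL VIEW posexplode(
  split(
    space(greatest(size(t.skillcode), size(t.duration), size(t.numberofpeople), 1) - 1),
    ' ')
) idx AS i, x;
```

With this shape, a shorter or NULL array simply contributes NULL at the positions it does not have, so no row is dropped and no extra combinations are produced.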

Note: the test table and its array-typed data were created and inserted manually in Hive, as follows.

CREATE TABLE `L2`(
  `skillcode` array<string>, 
  `duration` array<string>, 
  `numberofpeople` array<string>)
ROW FORMAT SERDE 
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  'hdfs://localhost:9000/stackoverflow/data/hive/dwh/l2';
-- to insert the data
INSERT INTO L2  select array() as skillcode,array('0020','0021') as duration,array() as numberofpeople FROM (select '1' ) t;
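Only one of the three test rows has its INSERT shown. The other two rows in the SELECT * output above could presumably be loaded the same way; the statements below are a reconstruction from that output, not the answerer's original commands.

```sql
-- Assumed inserts for the remaining two test rows, following the same pattern.
INSERT INTO L2 SELECT array('BES','SAL') AS skillcode, array('00020','846') AS duration, array('1','09876') AS numberofpeople FROM (SELECT '1') t;
INSERT INTO L2 SELECT array('SEAL') AS skillcode, array() AS duration, array() AS numberofpeople FROM (SELECT '1') t;
```

The test data uses empty arrays rather than NULL, but LATERAL VIEW OUTER treats both cases the same way: explode produces no rows for either, and OUTER fills in a NULL.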
