Actually with the addition of stored procedures to Redshift this is possible
The procedure below accepts two parameters (source_table and target_table)
assuming both table exists it transforms the data described in the question
The way it works is
- Reads data from source table row by row
- Finds out max items in Items column
- In a loop extracts each item
- Inserts id + item combo into target table
CREATE OR REPLACE PROCEDURE Array_to_Rows(source_table VARCHAR, target_table VARCHAR)
LANGUAGE plpgsql
AS $$
DECLARE i INTEGER;
rec RECORD;
query VARCHAR;
item VARCHAR;
cnt INTEGER;
BEGIN
query := 'SELECT * FROM ' || source_table;
FOR rec IN EXECUTE query
LOOP
select INTO cnt regexp_count(rec.items,',')+1;
i := 1;
<< items_loop >>
LOOP
SELECT INTO item trim(split_part(rec.items,',',i));
EXECUTE 'INSERT INTO ' || target_table || ' values (' || rec.Empid || ',''' || item ||''')';
i := i + 1;
EXIT items_loop WHEN (i > cnt);
END LOOP;
END LOOP;
END;
$$
Usage: CALL Array_to_Rows('source table name','target table name')
With test data in the question it took less than 0.2 seconds, don't know how big OPs data set is
Output is
Empid item
1001 A
1001 B
1002 B
1003 C
1003 D
1003 E