I have a table with two array columns, for example:
id col1 col2
A [1, 3] [1, 2, 3]
B [2] [1, 2, 3]
What I want is all the elements that are in col2 but not in col1:
id output
A [2]
B [1, 3]
How can I achieve this?
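Explode the col2 array, use array_contains to check whether each element is in col1, and then collect the elements that are not in col1 back into an array. collect_set skips NULLs, so mapping the matched elements to NULL drops them from the result (it also removes duplicates):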
select t.id,
collect_set(case when array_contains(t.col1, e.elem) then NULL else e.elem end) as result
from my_table t
lateral view explode(t.col2) e as elem
group by t.id
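To make the three steps easier to follow, here is a minimal Python sketch of the same logic, not a replacement for the query. The `rows` list is a hypothetical in-memory stand-in for my_table, and `array_difference` is an illustrative helper name. The output is sorted only to make it deterministic, whereas collect_set does not guarantee any order.

```python
from collections import defaultdict

# Hypothetical stand-in for my_table: (id, col1, col2)
rows = [
    ("A", [1, 3], [1, 2, 3]),
    ("B", [2], [1, 2, 3]),
]

def array_difference(rows):
    """Return {id: sorted distinct elements of col2 that are not in col1}."""
    collected = defaultdict(set)
    for row_id, col1, col2 in rows:
        for elem in col2:                # explode(col2): one row per element
            bucket = collected[row_id]   # group by id: a group exists once explode emits a row
            if elem not in col1:         # array_contains(col1, elem) is false
                bucket.add(elem)         # collect_set: keep distinct non-NULL values
    return {row_id: sorted(values) for row_id, values in collected.items()}

result = array_difference(rows)
print(result)  # {'A': [2], 'B': [1, 3]}
```

As in the query, an id whose col2 is empty produces no exploded rows and therefore does not appear in the result at all, while an id whose col2 elements are all present in col1 yields an empty array.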
In Pandas this is a one-line job.