My objective is to fetch the values of a column from a PySpark DataFrame into a variable, as a list if possible.
Expected output = ["a", "b", "c", ... ]
I tried:
[
col.__getitem__("x")
for col in data.select("x").collect()
]
But it gives a list of Row objects.
Output : [Row(x='a'), Row(x='b'), Row(x='c'), ...]
I would prefer not to use collect, and I don't want Row objects.
I tried another method:
data.select(f.collect_list("x")).collect()
This is slightly better than the earlier version, but I get:
Output = [Row(collect_list(x)=['a', 'b', 'c', ...])]
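For reference, both results above appear to wrap the wanted values in one more layer, which positional indexing can remove. The following is only a sketch: to keep it runnable without a Spark session it uses namedtuple stand-ins (the names Row, Agg, and first_field_values are made up for illustration) that mimic the tuple-like behaviour of pyspark.sql.Row.

```python
from collections import namedtuple

# Hypothetical stand-in for pyspark.sql.Row so this runs without Spark.
# Like a real Row, a namedtuple is a tuple and supports row[0].
Row = namedtuple("Row", ["x"])


def first_field_values(rows):
    """Unwrap the first field of each row-like object into a plain list."""
    return [row[0] for row in rows]


# Same shape as the result of data.select("x").collect()
collected = [Row(x="a"), Row(x="b"), Row(x="c")]
values = first_field_values(collected)

# Same shape as the result of data.select(f.collect_list("x")).collect():
# a single row whose only field already holds the whole list.
Agg = namedtuple("Agg", ["collect_list_x"])
aggregated = [Agg(collect_list_x=["a", "b", "c"])]
values_from_agg = aggregated[0][0]

print(values)
print(values_from_agg)
```

On the real DataFrame this would correspond to [row[0] for row in data.select("x").collect()] for the first attempt and data.select(f.collect_list("x")).collect()[0][0] for the second. Either way, some call that brings the data to the driver (collect or an equivalent) seems unavoidable if the values have to end up in a Python list.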
Thanks in advance and Happy new year!