Example: I have a PySpark DataFrame like this:
df=
x_data y_data
2.5 1.5
3.5 8.5
4.5 89.5
5.5 20.5
Let's say I have some calculations to perform on each column of df, which I do inside a for loop. After that, my final output should look like this:
df_output=
cal_1 cal_2 Cal_3 Cal_4 Datatype
23 24 34 36 x_data
12 13 18 90 x_data
23 54 74 96 x_data
41 13 38 50 x_data
53 74 44 6 y_data
72 23 28 50 y_data
43 24 44 66 y_data
41 23 58 30 y_data
How do I append the results calculated for each column to the same PySpark output DataFrame inside the for loop?