I have a PySpark DataFrame and want to add a column whose values come from a list, repeating the list as many times as needed to cover every row. In plain Python I would probably use itertools.cycle, but I don't know how to do the equivalent in PySpark.
The list is:
names = ['Julia', 'Tim', 'Zoe']
My DataFrame looks like this:
+-----+------+
| id_A| idx_B|
+-----+------+
|    a|     0|
|    b|     0|
|    b|     2|
|    b|     2|
|    b|     2|
|    b|     2|
+-----+------+
I want it to look like this:
+-----+------+--------+
| id_A| idx_B|   names|
+-----+------+--------+
|    a|     0|   Julia|
|    b|     0|     Tim|
|    b|     2|     Zoe|
|    b|     2|   Julia|
|    b|     2|     Tim|
|    b|     2|     Zoe|
+-----+------+--------+
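
For reference, here is a minimal plain-Python sketch of the behaviour I am after, using itertools.cycle. The rows list is only a hand-copied stand-in for the table above (an assumption for illustration, not how the real data is stored), and result is a hypothetical name. This is not PySpark code; it just pins down the expected pairing of rows and names.

```python
from itertools import cycle

names = ['Julia', 'Tim', 'Zoe']

# Stand-in for the dataframe: (id_A, idx_B) tuples copied from the table above.
rows = [('a', 0), ('b', 0), ('b', 2), ('b', 2), ('b', 2), ('b', 2)]

# cycle(names) repeats the list endlessly; zip stops at the shorter input,
# so the pairing ends after the last row.
result = [(id_a, idx_b, name) for (id_a, idx_b), name in zip(rows, cycle(names))]

for row in result:
    print(row)
```

The pairing here depends on the order of the rows, so whatever the PySpark equivalent is, it would presumably need some well-defined row order as well.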