Pyspark explode list creating column with index in list

Question

I have a question about PySpark. I have a DataFrame that looks like this:

+---+------------+
| id|        list|
+---+------------+
|  2|[3, 5, 4, 2]|
|  3|[4, 5, 3, 2]|
+---+------------+

I would like to explode the lists into multiple rows while keeping, in a separate column, the position each element had in its list. The result should look like this:

+---+------------+------------+
| id|    listitem|        rank|
+---+------------+------------+
|  2|           3|           1|
|  2|           5|           2|
|  2|           4|           3|
|  2|           2|           4|
|  3|           4|           1|
|  3|           5|           2|
|  3|           3|           3|
|  3|           2|           4|
+---+------------+------------+

The rank column holds the 1-based position (index + 1) of each element in its list. Any suggestions on the most efficient way to achieve this?

Mohana B C · Accepted Answer · 2021-09-13 12:07:52Z

5

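You can use posexplode() or posexplode_outer() to get the desired result. Both return each element's 0-based position alongside its value, so add 1 to the position to get the rank. posexplode_outer() additionally keeps rows whose list is null or empty, emitting nulls for the position and the element.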

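from pyspark.sql.functions import col, posexplode_outer

df = spark.createDataFrame([(2, [3, 5, 4, 2]), (3, [4, 5, 3, 2])], ["id", "list"])

# posexplode_outer yields a 0-based position and the element; add 1 for a 1-based rank
df.select('id', posexplode_outer('list').alias('rank', 'listitem')) \
    .withColumn('rank', col('rank') + 1).show()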

+---+----+--------+
| id|rank|listitem|
+---+----+--------+
|  2|   1|       3|
|  2|   2|       5|
|  2|   3|       4|
|  2|   4|       2|
|  3|   1|       4|
|  3|   2|       5|
|  3|   3|       3|
|  3|   4|       2|
+---+----+--------+
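
To make the difference between the two functions concrete without a Spark session, here is a minimal plain-Python sketch that models their row semantics. It is not the Spark API: posexplode_rows and the extra row with id 4 are hypothetical names and data introduced only for illustration.

```python
# Plain-Python model of posexplode / posexplode_outer semantics (not Spark itself).
def posexplode_rows(rows, outer=False):
    """Yield (id, rank, listitem) tuples with a 1-based rank.

    outer=False mimics posexplode: null or empty lists produce no rows.
    outer=True mimics posexplode_outer: they produce one (id, None, None) row.
    """
    for row_id, items in rows:
        if not items:  # covers both None and []
            if outer:
                yield (row_id, None, None)
            continue
        for pos, item in enumerate(items):  # pos is 0-based, like Spark's
            yield (row_id, pos + 1, item)


# The first two rows come from the question; id 4 is a hypothetical null list.
data = [(2, [3, 5, 4, 2]), (3, [4, 5, 3, 2]), (4, None)]

inner = list(posexplode_rows(data))              # id 4 is dropped
outer = list(posexplode_rows(data, outer=True))  # id 4 is kept with nulls
print(inner)
print(outer[-1])
```

If every list in your data is non-null and non-empty, as in the question, the two variants give the same rows, and the choice only matters when some ids must survive with an empty result.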

