This question is identical to "Pyspark: Split multiple array columns into rows", but I want to know how to do it in Scala.

For a DataFrame like this:

+---+---------+---------+---+
|  a|        b|        c|  d|
+---+---------+---------+---+
|  1|[1, 2, 3]| [, 8, 9]|foo|
+---+---------+---------+---+

I want to get it into the following format:

+---+---+-------+------+
|  a|  b|  c    |    d |
+---+---+-------+------+
|  1|  1|  None |  foo |
|  1|  2|  8    |  foo |
|  1|  3|  9    |  foo |
+---+---+-------+------+

In Scala, I know there's an explode function, but I don't think it applies here on its own, since there are two array columns to expand in step.

I tried

import org.apache.spark.sql.functions.arrays_zip

but I get an error saying that arrays_zip is not a member of org.apache.spark.sql.functions, although it is clearly listed as a function in https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/functions.html

  • None in an integer list is inconsistent, so I treated it as 0 and updated the answer accordingly. Commented Aug 12, 2020 at 18:25
  • @smart_coder how do you replace None with an integer or some other string value in Scala? Commented Aug 12, 2020 at 20:58
  • I am just wondering how this seq [, 8, 9] got generated. Commented Aug 12, 2020 at 21:08

1 Answer

The answer below might be helpful to you. It zips the two arrays element-wise in a UDF and then explodes the zipped array:

import org.apache.spark.sql.Row
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._
import spark.implicits._ // enables the $"col" syntax (already imported in spark-shell)

// Sample data; the missing first element of c is represented as 0 here
val arrayData = Seq(
  Row(1, List(1, 2, 3), List(0, 8, 9), "foo"))
val arraySchema = new StructType()
  .add("a", IntegerType)
  .add("b", ArrayType(IntegerType))
  .add("c", ArrayType(IntegerType))
  .add("d", StringType)

val df = spark.createDataFrame(spark.sparkContext.parallelize(arrayData), arraySchema)

// explode accepts a single column, so explode($"b", $"c") does not compile.
// Zip the two arrays element-wise in a UDF and explode the zipped array instead.
val zip = udf((x: Seq[Int], y: Seq[Int]) => x.zip(y))

df.withColumn("vars", explode(zip($"b", $"c")))
  .select($"a", $"d", $"vars._1".alias("b"), $"vars._2".alias("c"))
  .show()

/*
+---+---+---+---+
|  a|  d|  b|  c|
+---+---+---+---+
|  1|foo|  1|  0|
|  1|foo|  2|  8|
|  1|foo|  3|  9|
+---+---+---+---+
*/
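
As a possible alternative to the UDF: the "not a member" error in the question is what one would expect on a Spark version older than 2.4.0, which is where arrays_zip was added to org.apache.spark.sql.functions (the linked page documents the latest release). The following is a minimal sketch, assuming Spark 2.4 or later. The local session settings and the names data, schema, exploded and filled are illustrative and not from the original post. It keeps the missing element of c as null and then shows one way to replace it with 0 using coalesce:

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.functions.{arrays_zip, coalesce, col, explode, lit}
import org.apache.spark.sql.types._

// Assumption: a local session for illustration; in spark-shell `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("arrays-zip-sketch")
  .getOrCreate()

// Same shape as the question's data, with the missing element of c kept as null
val data = Seq(Row(1, List(1, 2, 3), List(null, 8, 9), "foo"))
val schema = new StructType()
  .add("a", IntegerType)
  .add("b", ArrayType(IntegerType))
  .add("c", ArrayType(IntegerType))
  .add("d", StringType)
val df = spark.createDataFrame(spark.sparkContext.parallelize(data), schema)

// arrays_zip builds an array of structs; for plain column inputs the
// struct fields are named after the columns ("b" and "c" here)
val exploded = df
  .withColumn("vars", explode(arrays_zip(col("b"), col("c"))))
  .select(col("a"), col("vars.b").alias("b"), col("vars.c").alias("c"), col("d"))

// Optional: replace the null in c with a default value
val filled = exploded.withColumn("c", coalesce(col("c"), lit(0)))

exploded.show()
filled.show()
```

Here exploded should contain one row per array position with the first c left as null, and filled the same rows with that null replaced by 0. On Spark versions before 2.4 the UDF approach above remains the way to go.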
