0
val schema = StructType(Array(StructField("id", IntegerType, false), StructField("num", IntegerType, false)))

For every id, I want to generate one row for each consecutive number from 0 to num. I don't know how to do this. Thanks.

The data and the expected result are here.

1
  • Added code format Commented Apr 13, 2017 at 7:58

2 Answers

2

You can use a UDF together with the explode function:

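import org.apache.spark.sql.functions.{udf, explode}
// $"num" needs the session implicits in scope, e.g. import spark.implicits._ (assuming a SparkSession named spark)

// Build the array [0, 1, ..., i] for each value of num
val range = udf((i: Int) => (0 to i).toArray)
// explode emits one output row per array element
df.withColumn("num", explode(range($"num")))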
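On Spark 2.4 or later, the same expansion can probably be done without a UDF by using the built-in `sequence` function. The following is only a sketch under assumptions: the object name, the `expand` helper, and the sample rows are hypothetical, since the question's real data is not shown. Note that `sequence` counts downward when `num` is negative, whereas `0 to i` would produce an empty range.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode, lit, sequence}

object SequenceExplodeSketch {
  // sequence(0, num) builds the array [0, 1, ..., num] (both ends inclusive);
  // explode then emits one row per array element.
  def expand(df: DataFrame): DataFrame =
    df.withColumn("num", explode(sequence(lit(0), col("num"))))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("sequence-explode-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample rows standing in for the question's data.
    val df = Seq((1, 2), (2, 1)).toDF("id", "num")

    expand(df).show()
    spark.stop()
  }
}
```

Keeping the logic in built-in column functions lets Spark's optimizer see the whole expression, which a UDF hides from it.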

0

Try DataFrame.explode (note that it has been deprecated since Spark 2.0 in favor of flatMap or functions.explode):

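import org.apache.spark.sql.Row
import org.apache.spark.sql.functions.col

df.explode(col("id"), col("num")) { case row: Row =>
    val id = row.getInt(0)
    val num = row.getInt(1)
    // Emit (id, 0), (id, 1), ..., (id, num) for this input row
    (0 to num).map((id, _))
}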

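Or, in RDD land, you can use flatMap for this. df.rdd is an RDD[Row], so read the fields with getInt rather than tuple accessors:

df.rdd.flatMap(row => (0 to row.getInt(1)).map((row.getInt(0), _)))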
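To make the flatMap logic concrete, here is a minimal sketch using plain Scala collections instead of Spark. The object name, the `expand` helper, and the sample rows are hypothetical; they only illustrate what each (id, num) pair expands to.

```scala
object ExpandSketch {
  // For one (id, num) pair, emit (id, 0), (id, 1), ..., (id, num).
  // A negative num yields an empty result because `0 to num` is then empty.
  def expand(id: Int, num: Int): Seq[(Int, Int)] =
    (0 to num).map(n => (id, n))

  def main(args: Array[String]): Unit = {
    // Hypothetical sample rows; the question's real data is not shown.
    val rows = Seq((1, 2), (2, 1))

    // flatMap concatenates the per-row expansions into one flat sequence.
    val expanded = rows.flatMap { case (id, num) => expand(id, num) }

    println(expanded)
    // prints List((1,0), (1,1), (1,2), (2,0), (2,1))
  }
}
```

The same per-row function could be passed to flatMap on an RDD or a typed Dataset; only the container changes.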
