I need to implement a scenario in Spark using the Scala API. I pass a user-defined function to a DataFrame; it processes the DataFrame row by row and returns a tuple (Row, Row) for each input row. How can I change an RDD[(Row, Row)] into a DataFrame of single Rows? See the code sample below.

**Calling the map function:**

    val df_temp = df_outPut.map { x => AddUDF.add(x,date1,date2)}

**UDF definition:**

    def add(x: Row,dates: String*): (Row,Row) = {
    ......................
    ........................
    var result1,result2:Row = Row()
    ..........
    return (result1,result2)
    }

Now df_temp is an RDD[(Row, Row)]. My requirement is to turn it into a single RDD[Row] or DataFrame by breaking up each tuple so that every record is a single Row. I appreciate your help.

    How would you like the two Row elements to be combined? Should the columns from the second be appended to those of the first? Might there be common columns that exist in both rows? Question is unclear without this information. Commented Jun 9, 2016 at 18:51

1 Answer

You can use flatMap to flatten your Row tuples, so that each tuple yields two separate records. Say we start from this example RDD:

    rddExample.collect()
    // res37: Array[(org.apache.spark.sql.Row, org.apache.spark.sql.Row)] = Array(([1,2],[3,4]), ([2,1],[4,2]))

    val flatRdd = rddExample.flatMap{ case (x, y) => List(x, y) }
    // flatRdd: org.apache.spark.rdd.RDD[org.apache.spark.sql.Row] = MapPartitionsRDD[45] at flatMap at <console>:35

To convert it to a DataFrame, define a schema that matches the Rows and pass both to createDataFrame:

    import org.apache.spark.sql.types.{StructType, StructField, IntegerType}

    val schema = StructType(StructField("x", IntegerType, true)::
                            StructField("y", IntegerType, true)::Nil)
    val df = sqlContext.createDataFrame(flatRdd, schema)
    df.show
    +---+---+
    |  x|  y|
    +---+---+
    |  1|  2|
    |  3|  4|
    |  2|  1|
    |  4|  2|
    +---+---+
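
For reference, the two steps above can be put together as one self-contained program. This is only a sketch under some assumptions: it uses SparkSession (Spark 2.x or later) with a local master instead of the sqlContext shown above, and the names FlattenRowPairs, flatten and pairs are made up for illustration. The hard-coded pairs stand in for the output of the question's map step.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StructField, StructType}

object FlattenRowPairs {

  // Hypothetical helper: every (Row, Row) tuple contributes two records.
  def flatten(pairs: RDD[(Row, Row)]): RDD[Row] =
    pairs.flatMap { case (first, second) => Seq(first, second) }

  def main(args: Array[String]): Unit = {
    // Assumption: Spark 2.x+ running locally; adjust master and appName as needed.
    val spark = SparkSession.builder()
      .appName("flatten-row-pairs")
      .master("local[*]")
      .getOrCreate()

    // Stand-in for the RDD[(Row, Row)] produced by the question's map step.
    val pairs: RDD[(Row, Row)] = spark.sparkContext.parallelize(Seq(
      (Row(1, 2), Row(3, 4)),
      (Row(2, 1), Row(4, 2))
    ))

    // The schema must describe the columns of each individual Row.
    val schema = StructType(Seq(
      StructField("x", IntegerType, nullable = true),
      StructField("y", IntegerType, nullable = true)
    ))

    val df = spark.createDataFrame(flatten(pairs), schema)
    df.show()

    spark.stop()
  }
}
```

Note that in Spark 2.x and later, calling map directly on a DataFrame requires an Encoder for the result type, so the question's map step would likely need to go through df_outPut.rdd first. If the tuple RDD is not needed for anything else, the intermediate step can probably be skipped by calling flatMap on that RDD and returning Seq(result1, result2) directly.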