
Hi, I want to add a new column to each row of a DataFrame, computed from the existing columns. I am trying this in Spark with Scala as shown below. df is a DataFrame with a variable number of columns, which is known only at run time.

// Added new column "docid"
val df_new = appContext.sparkSession.sqlContext.createDataFrame(df.rdd, df.schema.add("docid", DataTypes.StringType))

df_new.map(x => {
  import appContext.sparkSession.implicits._
  val allVals = (0 to x.size).map(x.get(_)).toSeq
  val values = allVals ++ allVals.mkString("_")
  Row.fromSeq(values)
})

But this gives the following errors in Eclipse itself:

  • Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases.
  • not enough arguments for method map: (implicit evidence$7: org.apache.spark.sql.Encoder[org.apache.spark.sql.Row])org.apache.spark.sql.Dataset[org.apache.spark.sql.Row]. Unspecified value parameter evidence$7.

Please help.

  • The import should be done outside of the map. Commented Oct 9, 2017 at 9:53
  • Can you give an example of the input data and the expected output? It should be possible to solve this in a more efficient way. Commented Oct 9, 2017 at 9:59
  • Any reason why you can't use df.withColumn() as suggested in werner's answer? That would be the most straightforward solution. Commented Dec 16, 2020 at 11:23
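
The encoder errors in the question come from Dataset.map needing an implicit Encoder for its result type, and the session implicits do not supply one for Row, so moving the import alone is unlikely to be enough. Below is a minimal sketch of one way to keep the map-based approach by passing a row encoder explicitly. It assumes a Spark version in which RowEncoder(schema) is available (the 2.x line current when the question was asked; newer releases may expose this differently), and the names DocIdViaMap and withDocId are hypothetical. It also uses row.toSeq and :+, because 0 to x.size is inclusive and would read one index past the last field, and ++ with a String appends its characters one by one.

```scala
import org.apache.spark.sql.{DataFrame, Row}
import org.apache.spark.sql.catalyst.encoders.RowEncoder
import org.apache.spark.sql.types.StringType

// Hypothetical helper object, not part of the original question or answer.
object DocIdViaMap {
  // Appends a "docid" string column built by joining all existing values with "_".
  def withDocId(df: DataFrame): DataFrame = {
    // The output schema is the input schema plus the new column.
    val newSchema = df.schema.add("docid", StringType)
    df.map { row =>
      val values = row.toSeq                        // all existing values, in column order
      Row.fromSeq(values :+ values.mkString("_"))   // append the joined id as one element
    }(RowEncoder(newSchema))                        // explicit encoder (assumed API, see above)
  }
}
```

Calling DocIdViaMap.withDocId(df) returns a new DataFrame; df itself is unchanged. Null values are rendered as the text "null" by mkString in this sketch.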

1 Answer


concat_ws from the org.apache.spark.sql.functions object can help.

This code adds the docid field:

import org.apache.spark.sql.functions.concat_ws
val dfWithDocId = df.withColumn("docid", concat_ws("_", df.columns.map(df.col(_)): _*))

assuming all columns of df are strings.
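
If some columns are not strings, one option is to cast each column to a string before joining. The following is a minimal sketch under that assumption; the object name DocIdColumn, the helper withDocId, and the sample data in main are hypothetical and only for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, concat_ws}
import org.apache.spark.sql.types.StringType

// Hypothetical helper object, not part of the original answer.
object DocIdColumn {
  // Appends a "docid" column that joins every existing column with "_".
  def withDocId(df: DataFrame): DataFrame = {
    // Cast each column explicitly so the intent is clear for non-string types.
    val asStrings = df.columns.map(name => col(name).cast(StringType))
    df.withColumn("docid", concat_ws("_", asStrings: _*))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("docid-example").getOrCreate()
    import spark.implicits._
    // Made-up sample rows with mixed column types.
    val df = Seq(("a", 1, true), ("b", 2, false)).toDF("name", "count", "flag")
    withDocId(df).show()
    spark.stop()
  }
}
```

Note that concat_ws skips null values instead of returning null, so a row containing nulls gets a shorter docid than its neighbours.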
