
I am trying to sum a list of columns in my DataFrame (of type org.apache.spark.sql.DataFrame), creating a new column 'sums' and a new dataframe 'out'.

I can do this quite easily if I list the columns by hand. For example, this works:

val columnsToSum = List(col("led zeppelin"), col("lenny kravitz"), col("leona lewis"), col("lily allen"))
val out = df3.withColumn("sums", columnsToSum.reduce(_ + _))

However, if I pull the column names directly from the dataframe's schema, the items in the resulting list are Strings rather than Columns, and the same approach fails. For example:

val columnsToSum = df2.schema.fields.filter(f => f.dataType.isInstanceOf[StringType]).map(_.name).patch(0, Nil, 1).toList // drop the first element ("user") from the list
println(columnsToSum)
>> List(a perfect circle, abba, ac/dc, adam green, aerosmith, afi, ...

// Trying the same method
val out = df3.withColumn("sums", columnsToSum.reduce(_ + _))

>> error: type mismatch;
 found   : String
 required: org.apache.spark.sql.Column

How do I do this type of conversion? I've tried:

List(a perfect circle, abba, ac/dc, ...).map(_.Column)
List(a perfect circle, abba, ac/dc, ...).map(_.spark.sql.Column)
List(a perfect circle, abba, ac/dc, ...).map(_.org.apache.spark.sql.Column)

None of these have worked. Thanks in advance.

1 Answer


You can get a Column object from a String by using the function col (you are actually already using it in your first snippet).

So this should work:

columnsToSum.map(col).reduce(_ + _)

or a more verbose version:

columnsToSum.map(c => col(c)).reduce(_ + _)
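As an end-to-end sketch of the idea (the dataframe, column names, and values below are made up for illustration; only the map-then-reduce pattern is the point):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Minimal sketch: a toy dataframe standing in for df3
val spark = SparkSession.builder.master("local[*]").appName("sum-cols").getOrCreate()
import spark.implicits._

val df3 = Seq(
  ("alice", 1, 2, 3),
  ("bob",   4, 5, 6)
).toDF("user", "abba", "ac/dc", "afi")

// Column names as plain Strings, e.g. everything except "user"
val columnsToSum = df3.columns.filterNot(_ == "user").toList

// Map each String to a Column with col, then reduce with +
val out = df3.withColumn("sums", columnsToSum.map(col).reduce(_ + _))
out.show()
```

Note that `map(col)` relies on eta-expansion of the `col` function; `map(c => col(c))` is the explicit equivalent.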