I am new to both Scala and Spark. I am trying to convert input that is read from files as Double into Float (which is safe in this application) in order to reduce memory usage. I have been able to do this with a column of Double.
Current approach for a single element:
import org.apache.spark.sql.functions.{col, udf}
import spark.implicits._ // required for .toDF outside the spark-shell

// UDF that narrows a single Double value to a Float
val tcast = udf((s: Double) => s.toFloat)
val myDF = Seq(
  (1.0, Array(0.1, 2.1, 1.2)),
  (8.0, Array(1.1, 2.1, 3.2)),
  (9.0, Array(1.1, 1.1, 2.2))
).toDF("time", "crds")
myDF.withColumn("timeF", tcast(col("time"))).drop("time").withColumnRenamed("timeF", "time").show
myDF.withColumn("timeF", tcast(col("time"))).drop("time").withColumnRenamed("timeF", "time").schema
But I am currently stuck on transforming an array of doubles to floats. Any help would be appreciated.
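For concreteness, here is a sketch of what I imagine the array version to look like, mirroring the tcast udf above (the acast name and the Seq[Double] signature are just my guesses, not something I know to be right):

// Hypothetical element-wise cast over an array column
val acast = udf((xs: Seq[Double]) => xs.map(_.toFloat))

// Intended usage, following the same replace-and-rename pattern as for "time"
myDF.withColumn("crdsF", acast(col("crds"))).drop("crds").withColumnRenamed("crdsF", "crds").show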