I have a DataFrame with a column named KFA containing a string enclosed in square brackets. The string holds 4 double values separated by spaces. I would like to convert this into a DataFrame of vectors.
This is the first element of the DataFrame:
> dataFrame1.first()
res130: org.apache.spark.sql.Row = [[.00663 .00197 .29809 .0034]]
Could you help me convert it into a dense vector with 4 double values?
I have tried this command:
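dataFrame1.select("KFA")
  // Turn each Row into a string, strip the brackets, and split on spaces
  .map(x => x.mkString("").replace("]", "").replace("[", "").split(" "))
  // There are only 4 values, so the valid indices are 0 to 3
  .rdd.map(x => Vectors.dense(x(0).toDouble, x(1).toDouble, x(2).toDouble, x(3).toDouble))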
This looks very clumsy and unreadable. Could you suggest any other ways of doing this?
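For what it's worth, here is a minimal sketch of how the parsing step could be pulled out into a small helper, assuming each KFA value is a single string such as "[.00663 .00197 .29809 .0034]". The names KfaParsing and parseDoubles are hypothetical and only for illustration; the helper uses plain Scala, so it can be checked without Spark.

```scala
// Hypothetical helper (not part of Spark): parses a bracketed,
// whitespace-separated string into an array of doubles.
object KfaParsing {
  def parseDoubles(s: String): Array[Double] =
    s.replaceAll("[\\[\\]]", "") // drop every '[' and ']'
      .trim                      // remove leading/trailing whitespace
      .split("\\s+")             // split on runs of whitespace
      .map(_.toDouble)           // ".0034" parses fine as a Double
}
```

Since Vectors.dense also accepts an Array[Double], the result could be passed to it directly (for example inside a UDF or a map over the column), which avoids indexing each element by hand and does not hard-code the number of values.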
mkString if you are just going to split it?
mkString because I couldn't use .replace("]","") on a spark.sql.Row
getAs[Double] from a Row object