How do I convert CSV data to a custom object in Spark? Below is my code snippet:
import org.apache.spark.sql.{Dataset, SparkSession}

val sparkSession = SparkSession
  .builder()
  .appName("Spark SQL basic example")
  .master("local[2]")
  .getOrCreate()

// treat the first row as the header and infer the column types
val citiData = sparkSession.read
  .option("header", "true")
  .option("inferSchema", "true")
  .csv(filePath)
//citiData.describe().show()

import sparkSession.implicits._
val s: Dataset[CityData] = citiData.as[CityData]
//Date,Open,High,Low,Close,Volume
case class CityData(processingDate: java.util.Date, Open: Double, High: Double, Low: Double, Volume: Double)
Sample dataset:
Date,Open,High,Low,Close,Volume
2006-01-03,490.0,493.8,481.1,492.9,1537660
2006-01-04,488.6,491.0,483.5,483.8,1871020
2006-01-05,484.4,487.8,484.0,486.2,1143160
2006-01-06,488.8,489.0,482.0,486.2,1370250
I changed the type of the processingDate parameter in the CityData case class to String, but it still fails with the exception "cannot resolve 'processingDate' given input columns: [Volume, Close, High, Date, Low, Open];".
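From the error message I suspect that .as[CityData] matches case class fields to columns by name, so processingDate never resolves against the CSV header Date regardless of its type. Is renaming the column first the right idea? A rough, untested sketch of what I mean (continuing from citiData above; the rename is my own guess):

val renamed = citiData.withColumnRenamed("Date", "processingDate") // align the column name with the case class field
val ds: Dataset[CityData] = renamed.as[CityData] // would this still fail on java.util.Date?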
- How can I create the custom object?
- Another tricky part here is converting the Date column to a Date object (see my guess below).
How can I do this? Please share your ideas.
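For the date part, my current guess (unverified) is that Spark's encoders only support java.sql.Date / java.sql.Timestamp, not java.util.Date, so I would have to change the case class and convert the column, roughly like this; the to_date call and the extra Close field are my assumptions:

import org.apache.spark.sql.functions.{col, to_date}

// assumption: encoders accept java.sql.Date, not java.util.Date
case class CityData(processingDate: java.sql.Date, Open: Double, High: Double, Low: Double, Close: Double, Volume: Double)

val ds: Dataset[CityData] = citiData
  .withColumnRenamed("Date", "processingDate")                  // match the case class field name
  .withColumn("processingDate", to_date(col("processingDate"))) // string/timestamp -> date
  .as[CityData]

Is this the right direction, or is there a cleaner way?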