I have searched the web extensively but could not find an answer. I have a JSONArray in the format below:
[
  {
    "firstName": "John",
    "lastName": "Doe",
    "deparment": {
      "DeptCode": "10",
      "deptName": "HR"
    }
  },
  {
    "firstName": "Mel",
    "lastName": "Gibson",
    "deparment": {
      "DeptCode": "20",
      "deptName": "IT"
    }
  }
]
The JSONArray is an org.json.simple.JSONArray. I am trying to convert it into a Spark DataFrame in Java, and tried the code below:
SparkConf conf = new SparkConf().setAppName("linecount").setMaster("local[*]");
SparkSession session = SparkSession.builder().config(conf).getOrCreate();
Dataset<Row> dataset = session.read().json(array.toString());
But no luck; I get the error below. I can also see that in Scala this can be converted to a DataFrame using the toDS method. Has anyone tried this before?
Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: [{"firstName":%22John%22,%22lastName%22:%22Doe%22%7D,%7B%22firstName%22:%22Mel%22,%22lastName%22:%22Gibson%22%7D%5D
at org.apache.hadoop.fs.Path.initialize(Path.java:206)
at org.apache.hadoop.fs.Path.<init>(Path.java:172)
at org.apache.spark.sql.execution.datasources.DataSource$.org$apache$spark$sql$execution$datasources$DataSource$$checkAndGlobPathIfNecessary(DataSource.scala:615)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$14.apply(DataSource.scala:350)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$14.apply(DataSource.scala:350)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
at scala.collection.immutable.List.foreach(List.scala:318)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251)
at scala.collection.AbstractTraversable.flatMap(Traversable.scala:105)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:349)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:333)
at org.apache.spark.sql.DataFrameReader.json(DataFrameReader.scala:279)
at com.vikas.rawat.AnotherMainClass.main(AnotherMainClass.java:34)
Caused by: java.net.URISyntaxException: Relative path in absolute URI: [{"firstName":%22John%22,%22lastName%22:%22Doe%22%7D,%7B%22firstName%22:%22Mel%22,%22lastName%22:%22Gibson%22%7D%5D
at java.net.URI.checkPath(Unknown Source)
at java.net.URI.<init>(Unknown Source)
at org.apache.hadoop.fs.Path.initialize(Path.java:203)
... 14 more
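json(array.toString()) doesn't work. The json method you are calling expects its argument to be a string that is a path to a file in the file system, not the JSON content itself. That is why Spark hands the JSON text to Hadoop's Path class and fails with the URISyntaxException shown in the stack trace.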
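One way around this is to give Spark the JSON text as data rather than as a path. The sketch below is only an illustration under a few assumptions: Spark 2.2 or later (where DataFrameReader has a json overload that takes a Dataset<String>), json-simple on the classpath, and a class name and helper method (JsonArrayToDataFrame, toDataFrame) that are made up for this example. It turns every element of the JSONArray into its own JSON string and lets Spark infer the schema from those strings.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.json.simple.JSONArray;
import org.json.simple.parser.JSONParser;
import org.json.simple.parser.ParseException;

// Hypothetical class name, used only for this illustration.
public class JsonArrayToDataFrame {

    // Hypothetical helper: serializes each array element to one JSON string
    // and asks Spark to parse those strings (assumes Spark 2.2+).
    static Dataset<Row> toDataFrame(SparkSession session, JSONArray array) {
        List<String> records = new ArrayList<>();
        for (Object element : array) {
            // JSONObject.toString() returns the element's JSON text.
            records.add(String.valueOf(element));
        }
        // A Dataset<String> holding one JSON document per entry.
        Dataset<String> jsonRecords = session.createDataset(records, Encoders.STRING());
        // This overload treats its argument as JSON content, not as a file path.
        return session.read().json(jsonRecords);
    }

    public static void main(String[] args) throws ParseException {
        SparkSession session = SparkSession.builder()
                .appName("linecount")
                .master("local[*]")
                .getOrCreate();

        // Sample data from the question, parsed into an org.json.simple.JSONArray.
        String text = "[{\"firstName\":\"John\",\"lastName\":\"Doe\","
                + "\"deparment\":{\"DeptCode\":\"10\",\"deptName\":\"HR\"}},"
                + "{\"firstName\":\"Mel\",\"lastName\":\"Gibson\","
                + "\"deparment\":{\"DeptCode\":\"20\",\"deptName\":\"IT\"}}]";
        JSONArray array = (JSONArray) new JSONParser().parse(text);

        Dataset<Row> dataset = toDataFrame(session, array);
        dataset.printSchema();
        dataset.show();

        session.stop();
    }
}
```

The nested "deparment" object should come out as a struct column, so its fields can be selected with dotted names such as deparment.deptName. On Spark versions before 2.2 the Dataset<String> overload is not available; there the older json(JavaRDD<String>) overload would be the place to look instead.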