- Spark version: 1.6.2
- Java version: 7
I have a List<String> (called datas in the code below). Its contents look something like:
[[dev, engg, 10000], [karthik, engg, 20000]..]
I know the schema for this data:
name (String)
degree (String)
salary (Integer)
I tried:
JavaRDD<String> data = new JavaSparkContext(sc).parallelize(datas);
DataFrame df = sqlContext.read().json(data);
df.printSchema();
df.show(false);
Output:
root
|-- _corrupt_record: string (nullable = true)
+-----------------------------+
|_corrupt_record              |
+-----------------------------+
|[dev, engg, 10000]           |
|[karthik, engg, 20000]       |
+-----------------------------+
I assume this happens because the strings in the list are not valid JSON.
Do I need to convert each element into proper JSON first, or is there another way to build a DataFrame with the schema above?
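To make the "proper JSON" option concrete, here is a minimal sketch of what that conversion might look like in plain Java 7, with no Spark dependency. RowToJson, toJson and toJsonLines are hypothetical names made up for illustration. The sketch assumes every element has exactly three comma-separated fields in the order name, degree, salary, and that no value contains a comma, a quote, or a backslash.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Spark): turns "[dev, engg, 10000]" into
// {"name":"dev","degree":"engg","salary":10000}.
final class RowToJson {

    // Assumption: exactly three fields (name, degree, salary) and no commas,
    // quotes, or backslashes inside a value, so no JSON escaping is done here.
    static String toJson(String record) {
        String trimmed = record.trim();
        if (trimmed.length() < 2 || !trimmed.startsWith("[") || !trimmed.endsWith("]")) {
            throw new IllegalArgumentException("Expected a bracketed record: " + record);
        }
        // Strip the surrounding square brackets, then split on commas.
        String body = trimmed.substring(1, trimmed.length() - 1);
        String[] parts = body.split(",");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected 3 fields: " + record);
        }
        String name = parts[0].trim();
        String degree = parts[1].trim();
        // Parsing validates that salary is numeric before it is written unquoted.
        int salary = Integer.parseInt(parts[2].trim());
        return "{\"name\":\"" + name + "\",\"degree\":\"" + degree
                + "\",\"salary\":" + salary + "}";
    }

    // Converts every element; the result has one JSON object per element.
    static List<String> toJsonLines(List<String> records) {
        List<String> out = new ArrayList<String>(records.size());
        for (String record : records) {
            out.add(toJson(record));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> datas = Arrays.asList("[dev, engg, 10000]", "[karthik, engg, 20000]");
        for (String line : toJsonLines(datas)) {
            System.out.println(line);
        }
    }
}
```

The converted list could then be passed to parallelize and sqlContext.read().json(...) exactly as in the attempt above. As far as I know, JSON schema inference would report salary as long rather than integer, so a cast might still be needed to match the schema exactly.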