I have some JSON data like below, and I need to create new columns based on some of the JSON values:
{ "start": "1234567679", "test": ["abc"], "value": 324, "end": "1234567689" }
{ "start": "1234567679", "test": ["xyz"], "value": "Near", "end": "1234567689"}
{ "start": "1234568679", "test": ["pqr"], "value": ["Attr"," "], "end":"1234568679"}
{ "start": "1234568997", "test": ["mno"], "value": ["{\"key\": \"1\", \"value\": [\"789\"]}" ], "end": "1234568999"}
Above is an example of the JSON (one object per line). I want to create columns like below:
start       abc   xyz   pqr   mno   end
1234567679  324   null  null  null  1234567689
1234567679  null  Near  null  null  1234567689
1234568679  null  null  Attr  null  1234568679
1234568997  null  null  null  789   1234568999
import org.apache.spark.sql.functions.udf

// Wrap each check in a UDF; return the value only when the first test entry matches
val getValue1 = udf((s1: Seq[String], v: String) => if (s1(0) == "abc") v else null)
val getValue2 = udf((s1: Seq[String], v: String) => if (s1(0) == "xyz") v else null)

val df = spark.read.json("path to json")
val tdf = df
  .withColumn("abc", getValue1($"test", $"value"))
  .withColumn("xyz", getValue2($"test", $"value"))
But I don't want to use this approach, because I have many possible test values. I want some function that does something like this:
def getColumnname(s1: Seq[String]) = {
  s1(0)
}
val tdf = df.withColumn(getColumnname($"test"), $"value")
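This of course does not compile, since withColumn needs a literal String column name and getColumnname would receive a Column rather than a Seq[String]. One workaround I can think of is folding over a known list of names; a rough sketch, assuming I can enumerate every possible test value up front (the names list here is hypothetical):

import org.apache.spark.sql.functions.when

// Hypothetical list of every value that can appear in test(0)
val names = Seq("abc", "xyz", "pqr", "mno")

// Add one column per name; when() yields null for non-matching rows
val tdf2 = names.foldLeft(df) { (acc, name) =>
  acc.withColumn(name, when($"test"(0) === name, $"value"))
}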
Is it a good idea to change the values into columns? I want this shape because I need to feed the data into some machine learning code that needs plain columns.
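I have also been wondering whether groupBy/pivot is the right tool here, so the distinct test values become columns automatically. A minimal sketch, assuming test always holds exactly one element and each (start, end) pair identifies a single input row (the nested JSON in the mno row would still need extra parsing, e.g. with get_json_object):

import org.apache.spark.sql.functions.first

val pivoted = df
  .withColumn("name", $"test"(0))   // first test entry becomes the column name
  .groupBy("start", "end")
  .pivot("name")
  .agg(first("value"))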