I have a JSON file that contains one JSON object per line. The objects have the following schema:
root
|-- endtime: long (nullable = true)
|-- result: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- hop: long (nullable = true)
| | |-- result: array (nullable = true)
| | | |-- element: struct (containsNull = true)
| | | | |-- from: string (nullable = true)
| | | | |-- rtt: double (nullable = true)
| | | | |-- size: long (nullable = true)
| | | | |-- ttl: long (nullable = true)
| | | | |-- x: string (nullable = true)
The question: how can I create a new DataFrame from the DataFrame that holds the data of the input JSON file, with the nested fields ttl and x removed?
| | | | |-- ttl: long (nullable = true)
| | | | |-- x: string (nullable = true)
Given that I am new to Spark (Scala), I don't know what the possible ways are!
It was simple to delete endtime with:
val pathToTraceroutesExamples = getClass.getResource("/test/sample_1.json")
val df = spark.read.json(pathToTraceroutesExamples.getPath)
// Displays the content of the DataFrame to stdout
df.show()
df.printSchema()
val newDf = df.drop("endtime")
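For the nested fields, one possible direction is sketched below. It assumes Spark 3.1 or later, where `Column.withField` and `Column.dropFields` are available alongside the higher-order `transform` function. The object name `DropNestedFields`, the local master setting, and the variable names `hop`, `reply` and `cleaned` are illustrative only; the resource path is the one used above. This is a sketch under those assumptions, not necessarily the only or best way.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, transform}

// Hypothetical wrapper object, only here to make the sketch self-contained.
object DropNestedFields {
  def main(args: Array[String]): Unit = {
    // Local session for illustration; in spark-shell `spark` already exists.
    val spark = SparkSession.builder()
      .appName("drop-nested-fields")
      .master("local[*]")
      .getOrCreate()

    val path = getClass.getResource("/test/sample_1.json").getPath
    val df = spark.read.json(path)

    // Rebuild the outer `result` array element by element. For each hop
    // struct, replace its inner `result` array with a copy whose elements
    // no longer carry `ttl` and `x` (requires Spark 3.1+).
    val cleaned = df.withColumn(
      "result",
      transform(col("result"), hop =>
        hop.withField(
          "result",
          transform(hop.getField("result"), reply => reply.dropFields("ttl", "x"))
        )
      )
    )

    // The schema should now list only from, rtt and size in the inner struct.
    cleaned.printSchema()

    spark.stop()
  }
}
```

Chaining `.drop("endtime")` onto `cleaned` would remove the top-level column as well. On Spark versions older than 3.1 these Column methods do not exist, so a different approach would be needed, for example rebuilding each struct explicitly from the fields to keep.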