How can you extract the elements of a nested array of a JSON using Java Spark

Question

Here is my JSON file content.

{
  "Id": 11,
  "data": [
    {
      "package": "com.browser1",
      "activetime": 60000,
      "steps": [
        {"x":  1, "y":  2},
        {"x": 11, "y": 12}
      ]
    },
    {
      "package": "com.browser6",
      "activetime": 1205000,
      "steps": [
        {"x":  3, "y":  4}
      ]
    },
    {
      "package": "com.browser7",
      "activetime": 1205000,
      "steps": [
        {"x":  5, "y":  6}
      ]
    }
  ]
}

I am reading this JSON file in Spark using Java. How can I get the values of the following?

json.data[0].steps[0].x
and 
json.data[0].steps[1].x

I am using Dataset.select to do this, but

df.select(json.data[0].steps[1].x)

did not work.

P.S. I need a Java solution, not a Scala one.

See sparkbyexamples.com/spark/spark-dataframe-nested-array

– Maksim Bezmen, Commented Aug 7, 2022 at 12:00

vilalabinot · Accepted Answer · 2022-08-07 12:16:46Z

1

If you read the JSON file like this:

sparkSession.read().option("multiline", true).json("./data.json")

Then you can access your desired values as below:

.withColumn("test", col("data").getItem(0).getField("steps").getItem(1).getField("x"))

or

.withColumn("test2", expr("data[0].steps[1].x"))

Both do the same thing, so use whichever you prefer. For the sample file above, the returned value is 11.
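Since the question asks for Java and uses select, here is a minimal end-to-end sketch of the two expressions above. The class name NestedArrayExample and the helper extractXs are hypothetical names chosen for illustration, and the sketch assumes a local Spark session plus a file shaped like the one in the question (Spark infers JSON integers as long, hence getLong).

```java
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.expr;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class NestedArrayExample {

    // Hypothetical helper: returns { data[0].steps[0].x, data[0].steps[1].x }.
    public static long[] extractXs(SparkSession spark, String path) {
        // The sample file is one pretty-printed JSON object, so multiline is needed.
        Dataset<Row> df = spark.read()
                .option("multiline", true)
                .json(path);

        Dataset<Row> result = df.select(
                // Column API: getItem indexes an array, getField reads a struct field.
                col("data").getItem(0).getField("steps").getItem(0).getField("x").alias("x0"),
                // SQL expression form of the same kind of access.
                expr("data[0].steps[1].x").alias("x1"));

        Row row = result.first();
        return new long[] { row.getLong(0), row.getLong(1) };
    }

    public static void main(String[] args) {
        // Assumption: running locally; adjust master and path for a real cluster.
        SparkSession spark = SparkSession.builder()
                .appName("nested-array-example")
                .master("local[*]")
                .getOrCreate();

        long[] xs = extractXs(spark, "./data.json");
        System.out.println("steps[0].x = " + xs[0] + ", steps[1].x = " + xs[1]);

        spark.stop();
    }
}
```

The original attempt, df.select(json.data[0].steps[1].x), does not compile in Java because the path has to be passed either as a string expression (expr or selectExpr) or built from Column calls.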

Good luck!
