The schema looks like this:

root
|-- orderitemlist: array (nullable = true)
| |-- element: struct (containsNull = true)
| | |-- internal-material-code: string (nullable = true)
| | |-- lot-number: string (nullable = true)
| | |-- packaging-item-code: string (nullable = true)
| | |-- packaging-item-code-type: string (nullable = true)

How do I access the values of internal-material-code and lot-number?

After creating the DataFrame, I tried this:

df.withColumn("internalmaterialcode", col("orderitemlist")(0).getItem("internal-material-code"))

I also tried:

df.withColumn("internalmaterialcode", col("orderitemlist")(0)("internal-material-code"))

and, using explode:

df.withColumn("orderitemlistarray", explode(col("orderitemlist"))) 
.withColumn("internalmaterialcode", col("orderitemlistarray").getItem("internal-material-code")) 

and with a dotted column path:

df.withColumn("orderitemlistarray", explode(col("orderitemlist"))) 
.withColumn("internalmaterialcode", col("orderitemlistarray.internal-material-code")) 

but the result is null.

I have seen similar-looking schemas in other Stack Overflow questions, but none of the answers worked for me. Could someone answer this or point me to the right place?

  • You're looking for explode. Give that a google, lots of info out there. Commented Oct 2, 2019 at 2:08
  • I tried it as follows:
    .withColumn("orderitemlistarray", explode(col("orderitemlist"))) .withColumn("internalmaterialcode", col("orderitemlistarray").getItem("internal-material-code"))
    and also as follows:
    .withColumn("orderitemlistarray", explode(col("orderitemlist"))) .withColumn("internalmaterialcode", col("orderitemlistarray.internal-material-code"))
    Commented Oct 2, 2019 at 2:20

2 Answers

After the explode, select the newly created column with .* to get every field of the struct as its own column.

Example:

val va="""{
    "orderitemlist": [{
        "internal-material-code": "123",
        "lot-number": "vv",
        "packaging-item-code": "pp",
        "packaging-item-code-type": "ll"
    },{
        "internal-material-code": "234",
        "lot-number": "vv",
        "packaging-item-code": "pp",
        "packaging-item-code-type": "ll"
    }]
}"""

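// In spark-shell, `spark` and `spark.implicits._` (needed for toDS) are already in scope.
import org.apache.spark.sql.functions.{col, explode}

val df = spark.read.json(Seq(va).toDS)

// One row per array element, then expand the struct into top-level columns.
df.withColumn("arr", explode(col("orderitemlist"))).select("arr.*").show()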

Result:

+----------------------+----------+-------------------+------------------------+
|internal-material-code|lot-number|packaging-item-code|packaging-item-code-type|
+----------------------+----------+-------------------+------------------------+
|                   123|        vv|                 pp|                      ll|
|                   234|        vv|                 pp|                      ll|
+----------------------+----------+-------------------+------------------------+

This gives you every field of the structs inside the array as a separate column.
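
If only internal-material-code and lot-number are needed, the same explode approach can be narrowed to those two fields. The following is a minimal sketch rather than the code from the answer: it assumes a local SparkSession, uses a trimmed-down version of the sample record, and the names item, internalmaterialcode and lotnumber are chosen here purely for illustration. It uses getField so that the hyphenated field names are passed as plain strings.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, explode}

// Assumption: a local session; in spark-shell, getOrCreate() returns the existing one.
val spark = SparkSession.builder().master("local[*]").appName("explode-sketch").getOrCreate()
import spark.implicits._

// Trimmed-down sample record with the same array-of-struct shape as the question.
val json = """{"orderitemlist":[{"internal-material-code":"123","lot-number":"vv"},{"internal-material-code":"234","lot-number":"vv"}]}"""
val df = spark.read.json(Seq(json).toDS)

val items = df
  // One output row per element of the array; each row holds one struct.
  .select(explode(col("orderitemlist")).as("item"))
  // Read the two struct fields and give them hyphen-free column names.
  .select(
    col("item").getField("internal-material-code").as("internalmaterialcode"),
    col("item").getField("lot-number").as("lotnumber"))

items.show()
```

Renaming the columns with as(...) is optional, but hyphen-free names are easier to reference later in SQL expressions, where a hyphen would otherwise need backtick quoting.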

2 Comments

I'm getting this error: org.apache.spark.sql.AnalysisException: cannot resolve 'explode(value)' due to data type mismatch: input to function explode should be array or map type, not struct. I tried df.withColumn("arr",explode(col("value"))).select("arr.*").show(). Can you please help me?
I fixed it by wrapping the struct in an array first: df.withColumn("arr",explode(array(col("value")))).select("arr.*").show(false)
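
The error quoted in the comments arises because explode accepts only array or map columns. When the column is already a single struct, one option is to skip explode and expand the struct directly with .*. This is a sketch under stated assumptions: the column is named value as in the comment, the sample record is made up for illustration, and a local SparkSession is used.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session; in spark-shell, getOrCreate() returns the existing one.
val spark = SparkSession.builder().master("local[*]").appName("struct-sketch").getOrCreate()
import spark.implicits._

// Hypothetical record: `value` is a plain struct here, not an array of structs.
val json = """{"value":{"internal-material-code":"123","lot-number":"vv"}}"""
val structDf = spark.read.json(Seq(json).toDS)

// No explode needed: "value.*" promotes each struct field to a top-level column.
val flat = structDf.select("value.*")

flat.show()
```

Wrapping the struct in array(...) before explode, as in the comment, should produce the same single row; selecting value.* simply avoids the extra step.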

I ran the code you shared and it works fine for me. As an extension of the earlier solution, you can also read a field directly from the array column, which returns an array with one value per element:

>>>df.withColumn("ves", $"orderitemlist.lot-number").show
+--------------------+--------+
|       orderitemlist|     ves|
+--------------------+--------+
|[[123, vv, pp, ll...|[vv, vv]|
+--------------------+--------+

>>>df.withColumn("vew", $"orderitemlist".getItem("lot-number")).show
+--------------------+--------+
|       orderitemlist|     vew|
+--------------------+--------+
|[[123, vv, pp, ll...|[vv, vv]|
+--------------------+--------+
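
To pick the value from one specific element rather than from the whole array, the array can be indexed before the field is read. This is a minimal sketch, assuming a local SparkSession and a trimmed-down version of the sample record; the output column names lots and firstcode are illustrative only.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Assumption: a local session; in spark-shell, getOrCreate() returns the existing one.
val spark = SparkSession.builder().master("local[*]").appName("array-field-sketch").getOrCreate()
import spark.implicits._

// Trimmed-down sample record with the same array-of-struct shape as the question.
val json = """{"orderitemlist":[{"internal-material-code":"123","lot-number":"vv"},{"internal-material-code":"234","lot-number":"vv"}]}"""
val df = spark.read.json(Seq(json).toDS)

val picked = df.select(
  // Field access on the array itself: one value per element, returned as an array.
  col("orderitemlist").getField("lot-number").as("lots"),
  // Index first, then read the field: a single value from the first element.
  col("orderitemlist").getItem(0).getField("internal-material-code").as("firstcode"))

picked.show()
```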
