How do I explode a JSON string in a Spark DataFrame?

Question

I have a JSON string which is actually an array

|{"[0].id":"cccccccc","[0].label":"xxxxxx","[0].deviceTypeId":"xxxxxxxxxxxx"}|

I need to explode this so that I can have all keys as columns, something like this

dataFrame
  .withColumn("single", explode_outer(col("nested")))

However, Spark keeps complaining that the input to explode should be a map or an array.

How do I do this?

What is the type of the nested column, and what is the expected output?
– blackbishop, Commented Jan 27, 2022 at 17:40

blackbishop · Accepted Answer · 2022-01-28 00:20:13Z

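explode only accepts array or map columns, and nested is a plain string. You can parse the JSON string into a MapType column using from_json, then explode the map and pivot the keys into columns: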

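import org.apache.spark.sql.functions.{col, explode, first, from_json, lit}
import spark.implicits._ // needed for toDF; assumes a SparkSession named spark, as in spark-shell

val df = Seq(
  (1, """{"[0].id":"cccccccc","[0].label":"xxxxxx","[0].deviceTypeId":"xxxxxxxxxxxx"}""")
).toDF("id", "nested")

val df1 = (df
  .select(
    col("id"),
    // Parse the string as map<string,string>; exploding a map yields "key" and "value" columns
    explode(from_json(col("nested"), lit("map<string,string>")))
  )
  .groupBy("id")
  .pivot("key")               // one output column per distinct key
  .agg(first(col("value"))))  // each (id, key) pair has a single value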

df1.show
//+---+----------------+--------+---------+
//| id|[0].deviceTypeId|  [0].id|[0].label|
//+---+----------------+--------+---------+
//|  1|    xxxxxxxxxxxx|cccccccc|   xxxxxx|
//+---+----------------+--------+---------+
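
If the keys are known in advance, the pivot may not be needed. The following is a minimal sketch of that alternative, not part of the original answer: it parses the string into a map once and selects each key as its own column. The `keys` list is a hypothetical, hard-coded assumption, and the local SparkSession setup is only there to make the snippet self-contained (in spark-shell, `spark` already exists).

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types.{MapType, StringType}

// Assumption: a local session for illustration only
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("json-string-to-columns")
  .getOrCreate()
import spark.implicits._

val df = Seq(
  (1, """{"[0].id":"cccccccc","[0].label":"xxxxxx","[0].deviceTypeId":"xxxxxxxxxxxx"}""")
).toDF("id", "nested")

// Assumption: the keys are known up front (hypothetical list for this example)
val keys = Seq("[0].id", "[0].label", "[0].deviceTypeId")

// Parse the JSON string into a map<string,string> column named "m"
val parsed = df.withColumn("m", from_json(col("nested"), MapType(StringType, StringType)))

// Look up each key in the map and alias the result with the key name
val df2 = parsed.select(col("id") +: keys.map(k => col("m").getItem(k).as(k)): _*)

df2.show()
```

This avoids the shuffle that groupBy and pivot introduce, at the cost of having to list the keys yourself. A key that is missing from a given row simply yields null in that column.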
