Parse JSON column then add new column in the same dataframe with parsed value

Question

Here is an example of what I'm trying to accomplish:

Case:

column1	column2	json_column
One	Two	{'A': '1', 'B': '2', 'C': '3'}

Desired output:

column1	column2	json_column	B
One	Two	{'A': '1', 'B': '2', 'C': '3'}	2

As shown, json_column has been parsed and a new column 'B' has been added, holding the value of key 'B' from json_column.

ZygD · Accepted Answer · 2022-06-28 14:02:18Z

If json_column is of string type, you can parse it with from_json, passing the schema as a DDL string, and then pick key 'B' from the resulting struct:

F.from_json('json_column', 'struct<A:string,B:string,C:string>')['B']

Full example:

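from pyspark.sql import SparkSession, functions as F

# In the pyspark shell or a notebook, `spark` already exists and getOrCreate() returns it.
spark = SparkSession.builder.getOrCreate()

df = spark.createDataFrame([('One', 'Two', "{'A': '1', 'B': '2', 'C': '3'}")], ['column1', 'column2', 'json_column'])

# Parse the string into a struct, then take field 'B' as the new column.
df = df.withColumn('B', F.from_json('json_column', 'struct<A:string,B:string,C:string>')['B'])

df.show(truncate=False)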
# +-------+-------+------------------------------+---+
# |column1|column2|json_column                   |B  |
# +-------+-------+------------------------------+---+
# |One    |Two    |{'A': '1', 'B': '2', 'C': '3'}|2  |
# +-------+-------+------------------------------+---+

