
Is there a way to add a JSON object to an already existing array of JSON objects?

I have a dataframe:

+-------------------------+---------------------------------------------------------+------------+
|   name                  |       hit_songs                                         |  column1   |
+-------------------------+---------------------------------------------------------+------------+
|{"HomePhone":"34567002"} | [{"Phonetypecode":"PTC001"},{"Phonetypecode":"PTC002"}] | value1     |
|{"HomePhone":"34567011"} | [{"Phonetypecode":"PTC021"},{"Phonetypecode":"PTC022"}] | value2     |
+-------------------------+---------------------------------------------------------+------------+

I want the resulting dataframe to look like this:

+----------------------------------------------------------------------------------+---------+
| name                                                                             | column1 |
+----------------------------------------------------------------------------------+---------+
| [{"HomePhone":"34567002"},{"Phonetypecode":"PTC001"},{"Phonetypecode":"PTC002"}] | value1  |
| [{"HomePhone":"34567011"},{"Phonetypecode":"PTC021"},{"Phonetypecode":"PTC022"}] | value2  |
+----------------------------------------------------------------------------------+---------+
What are you trying to achieve? Can you show what you need as dataframes? I assume you ultimately want the data in table format and not JSON. Commented May 16, 2020 at 14:52

1 Answer

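Use the array_union function (available since Spark 2.4).

The name column is of type string, so wrap it with array() to turn it into a single-element array before combining it with the hit_songs array.

See the code below.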

scala> df.show(false)
+------------------------+-------------------------------------------------------+
|name                    |hit_songs                                              |
+------------------------+-------------------------------------------------------+
|{"HomePhone":"34567002"}|[{"Phonetypecode":"PTC001"},{"Phonetypecode":"PTC002"}]|
|{"HomePhone":"34567011"}|[{"Phonetypecode":"PTC021"},{"Phonetypecode":"PTC022"}]|
+------------------------+-------------------------------------------------------+


scala> df.withColumn("name",array_union(array($"name"),$"hit_songs")).show(false) // array($"name") wraps the string column in an array so that array_union can merge it with hit_songs
+---------------------------------------------------------------------------------+-------------------------------------------------------+
|name                                                                             |hit_songs                                              |
+---------------------------------------------------------------------------------+-------------------------------------------------------+
|[{"HomePhone":"34567002"}, {"Phonetypecode":"PTC001"},{"Phonetypecode":"PTC002"}]|[{"Phonetypecode":"PTC001"},{"Phonetypecode":"PTC002"}]|
|[{"HomePhone":"34567011"}, {"Phonetypecode":"PTC021"},{"Phonetypecode":"PTC022"}]|[{"Phonetypecode":"PTC021"},{"Phonetypecode":"PTC022"}]|
+---------------------------------------------------------------------------------+-------------------------------------------------------+

The same approach works when more string columns have to be merged in, for example an extra string column named dammy:

scala> df.show(false)
+------------------------+-------------+-------------------------------------------------------+
|name                    |dammy        |hit_songs                                              |
+------------------------+-------------+-------------------------------------------------------+
|{"HomePhone":"34567002"}|{"aaa":"aaa"}|[{"Phonetypecode":"PTC001"},{"Phonetypecode":"PTC002"}]|
|{"HomePhone":"34567011"}|{"bbb":"bbb"}|[{"Phonetypecode":"PTC021"},{"Phonetypecode":"PTC022"}]|
+------------------------+-------------+-------------------------------------------------------+


scala> df.printSchema
root
 |-- name: string (nullable = true)
 |-- dammy: string (nullable = true)
 |-- hit_songs: array (nullable = true)
 |    |-- element: string (containsNull = true)


scala> df.withColumn("name",array_union(array_union(array($"name"),$"hit_songs"),array($"dammy"))).show(false)

+---------------------------------------------------------------------------------+-------------+-------------------------------------------------------+
|name                                                                             |dammy        |hit_songs                                              |
+---------------------------------------------------------------------------------+-------------+-------------------------------------------------------+
|[{"HomePhone":"34567002"}, {"Phonetypecode":"PTC001"},{"Phonetypecode":"PTC002"}]|{"aaa":"aaa"}|[{"Phonetypecode":"PTC001"},{"Phonetypecode":"PTC002"}]|
|[{"HomePhone":"34567011"}, {"Phonetypecode":"PTC021"},{"Phonetypecode":"PTC022"}]|{"bbb":"bbb"}|[{"Phonetypecode":"PTC021"},{"Phonetypecode":"PTC022"}]|
+---------------------------------------------------------------------------------+-------------+-------------------------------------------------------+
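
The comments ask how to keep the other columns intact and drop the ones that are no longer needed. A minimal self-contained sketch of that, assuming Spark 2.4 or later running in local mode. The object name MergeJsonColumns and the helper mergeNameIntoHitSongs are hypothetical names chosen for illustration; the sample rows are the ones from the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{array, array_union, col}

// Hypothetical object name, used only for this illustration.
object MergeJsonColumns {

  // Hypothetical helper: prepends the string column "name" to the
  // array<string> column "hit_songs", stores the result in "name",
  // and drops "hit_songs". All other columns are left untouched.
  def mergeNameIntoHitSongs(df: DataFrame): DataFrame =
    df.withColumn("name", array_union(array(col("name")), col("hit_songs")))
      .drop("hit_songs")

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is enough for this example.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("merge-json-columns")
      .getOrCreate()
    import spark.implicits._

    // Sample rows taken from the question.
    val df = Seq(
      ("""{"HomePhone":"34567002"}""",
        Seq("""{"Phonetypecode":"PTC001"}""", """{"Phonetypecode":"PTC002"}"""),
        "value1"),
      ("""{"HomePhone":"34567011"}""",
        Seq("""{"Phonetypecode":"PTC021"}""", """{"Phonetypecode":"PTC022"}"""),
        "value2")
    ).toDF("name", "hit_songs", "column1")

    mergeNameIntoHitSongs(df).show(false)

    spark.stop()
  }
}
```

Note that array_union removes duplicate elements from the result. If duplicates have to be preserved, concat(array(col("name")), col("hit_songs")) is an alternative that also works on array columns in Spark 2.4 and later.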

21 Comments

What if we want to add it as a third column?
Sir, I actually have more columns than just name and hit_songs. I want them to remain intact, with the joined array column alongside them. I have updated the dataframe in the question, please check.
OK, I have updated the answer. Use withColumn, and use the drop() function to remove the columns you don't need.
Let me check it.
Throwing error: org.apache.spark.sql.AnalysisException cannot resolve 'array_union(array(entitymappingJoinA.phonestruct11), entitymappingJoinA.phonestruct11)' due to data type mismatch: input to function array_union should have been two arrays with same element type, but it's [array<struct<id:string,homephone:string,homephoneextension:string,workphone:string,workphoneextension:string,cellphone:string,cellphoneextension:string>>, struct<id:string,homephone:string,homephoneextension:string,workphone:string,workphoneextension:string,cellphone:string,cellphoneextension:string>];;