
I have a DataFrame with the following schema:

|-- A: string (nullable = true)
|-- B: array (nullable = true)
|    |-- element: struct (containsNull = true)
|    |    |-- key: string (nullable = true)
|    |    |-- x: double (nullable = true)
|    |    |-- y: double (nullable = true)
|    |    |-- z: double (nullable = true)
|-- C: array (nullable = true)
|    |-- element: struct (containsNull = true)
|    |    |-- key: string (nullable = true)
|    |    |-- x: double (nullable = true)
|    |    |-- y: double (nullable = true)

I want to merge columns B and C with array_union, but it fails because the two columns have different data types. The structs in B and C have the same fields except for z, which exists only in B. I don't care whether z is present in the merged output.

What would be a good way to achieve this?

2 Answers


Sure: drop z from the structs in B, then apply array_union()

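from pyspark.sql.functions import array_union, col, expr

new = (df1.withColumn('B', expr("transform(B, s -> struct(s.key as key, s.x as x, s.y as y))"))  # drop z from each struct in B
       .withColumn('D', array_union(col('B'), col('C')))  # union the two arrays, which now share one type
       .drop('B', 'C')  # drop B and C if not needed
      )
new.printSchema()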

root
 |-- A: string (nullable = false)
 |-- D: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- key: string (nullable = true)
 |    |    |-- x: double (nullable = true)
 |    |    |-- y: double (nullable = true)
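
When the structs carry many fields, writing the transform expression by hand gets tedious. A minimal sketch of building it from a list of field names follows. The name project_struct_array_expr is a hypothetical helper, not part of PySpark, and it only formats the SQL string that would then be passed to expr.

```python
def project_struct_array_expr(column, fields):
    """Build a Spark SQL expression that keeps only `fields` in each
    struct of an array column.

    Hypothetical helper (not a PySpark API): it only formats a string.
    """
    projected = ", ".join(f"s.{name} AS {name}" for name in fields)
    return f"transform({column}, s -> struct({projected}))"


# Keep key, x and y; z is dropped because it is not listed.
b_expr = project_struct_array_expr("B", ["key", "x", "y"])
print(b_expr)
```

The resulting string would be used as df1.withColumn('B', expr(b_expr)) before calling array_union, assuming the field names match the schema in the question.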

1 Comment

Thanks for the solution! However, I keep getting "not found: value array_union" even after import org.apache.spark.sql.functions._. Is there anything I'm missing?

Transform column C like this so that its structs have the same fields as those in B, then use array_union:

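import pyspark.sql.functions as f

df = (df
      # helper array: one placeholder z value (1.0) per element of C
      .withColumn('z', f.expr("transform(C, element -> cast(1 AS double))"))
      # rebuild each struct of C with the same fields, in the same order, as B: key, x, y, z
      .withColumn('C', f.expr("transform(C, (element, idx) -> struct(element.key AS key, element.x AS x, element.y AS y, element_at(z, idx + 1) AS z))"))
      .drop('z')  # the helper column is no longer needed
     )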
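
Since z does not matter in the merged output, the placeholder could also be a typed null rather than 1.0. The sketch below builds such an expression in a single transform. The name pad_struct_array_expr is a hypothetical helper, not a PySpark function, and the field order and the double type for z are assumptions taken from the schema of B in the question.

```python
def pad_struct_array_expr(column, present, missing):
    """Build a Spark SQL expression that copies the `present` fields and
    adds typed NULLs for the `missing` ones, given as (name, sql_type) pairs.

    Hypothetical helper (not a PySpark API): it only formats a string.
    """
    parts = [f"s.{name} AS {name}" for name in present]
    parts += [f"CAST(NULL AS {sql_type}) AS {name}" for name, sql_type in missing]
    return f"transform({column}, s -> struct({', '.join(parts)}))"


# Give C the same fields, in the same order, as B: key, x, y, z.
c_expr = pad_struct_array_expr("C", ["key", "x", "y"], [("z", "double")])
print(c_expr)
```

The string would be applied as df.withColumn('C', f.expr(c_expr)). Note that array_union removes duplicates by comparing whole structs, so an element of C with a null z would not be merged with an otherwise identical element of B that has a real z value.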
