2

I have 2 dataframes that I would like to join.

DF1:

root
 |-- myStruct: struct (nullable = true)
 |    |-- id: string (nullable = true)
 |    |-- region: long (nullable = true)
 |-- first_name: string (nullable = true)

DF2:

root
 |-- id: string (nullable = true)
 |-- region: long (nullable = true)
 |-- second_name: string (nullable = true)

My join statement is

df1.join(df2, Seq("id", "region"), "leftouter")

but that fails with

USING column `id` cannot be resolved on the left side of the join. The left-side columns: myStruct, first_name

I am running Spark 2.2 on Scala

2 Answers 2

5

You can use . notation to select an element from struct column. so to select id from df1 you will have to do myStruct.id and to select region you have to use myStruct.region.

And since the column names to be used are not same you can use === notation for comparison as

df1.join(df2, df1("myStruct.id") === df2("id") && df1("myStruct.region") === df2("region"), "leftouter")

You should have the joined dataframe with following schema

root
 |-- myStruct: struct (nullable = true)
 |    |-- id: string (nullable = true)
 |    |-- region: long (nullable = false)
 |-- first_name: string (nullable = true)
 |-- id: string (nullable = true)
 |-- region: integer (nullable = true)
 |-- second_name: string (nullable = true)

You can drop the unnecessary columns after join or select only needed columns after join

I hope the answer is helpful

Sign up to request clarification or add additional context in comments.

Comments

2

This is because in DF1, id is an element of column myStruct which is of type struct. In order to join you can do something like,

val df = df1
.withColumn("id", col("myStruct.id"))
.withColumn("region", col("myStruct.region"))

df.join(df2, Seq("id", "region"), "leftouter")

This essentially extracts id and region from the struct column.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.