I am creating a function that takes join keys and a condition as parameters and dynamically joins two dataframes.

I understand that a Spark Scala DataFrame join can be done in the following ways:

1) join(right: Dataset[_]): DataFrame
2) join(right: Dataset[_], usingColumn: String): DataFrame
3) join(right: Dataset[_], usingColumns: Seq[String]): DataFrame
4) join(right: Dataset[_], usingColumns: Seq[String], joinType: String): DataFrame
5) join(right: Dataset[_], joinExprs: Column): DataFrame
6) join(right: Dataset[_], joinExprs: Column, joinType: String): DataFrame

The join keys (usingColumns) parameter will be a list of column names. I am not sure how to pass the condition (joinExprs), but it could be a string such as "df2(colname) == 'xyz'".

Based on this post, I came up with the code below. It handles the list of join keys, but how can I add the condition as well? (Note: I used identical dataframes here for simplicity.)

  %scala
  import spark.sqlContext.implicits._

  val emp = Seq(
    (1, "Smith", -1, "2018", "10", "M", 3000),
    (2, "Rose", 1, "2010", "20", "M", 4000)
  )
  val empColumns = Seq("emp_id", "name", "superior_emp_id", "year_joined", "dept_id", "gender", "salary")
  val empDF = emp.toDF(empColumns: _*)
  val empDF2 = emp.toDF(empColumns: _*)

  val join_keys = Seq("emp_id", "name") // this will be a parameter
  // Build one equality per key and AND them together
  val joinExprs = join_keys.map(c => empDF(c) === empDF2(c)).reduce(_ && _)

  // How do I add another join expression like "empDF2(dept_id) == 10" to joinExprs here?

  empDF.join(empDF2, joinExprs, "inner").show(false)

1 Answer

You can just append to joinExprs with &&:

empDF.join(empDF2, joinExprs && (empDF2("dept_id") === 10), "inner").show(false)
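
Since the question asks for a function, here is a minimal sketch of how the key list and the extra condition could be combined behind one signature. The names `DynamicJoin`, `joinOn`, and `extraCondition` are hypothetical and not part of Spark. The sample data is reduced to three columns, and `dept_id` is compared against the string "10" because that column holds strings in the question's data.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}

object DynamicJoin {
  // Hypothetical helper: joins on equality of every key in joinKeys,
  // optionally AND-ed with one extra condition of type Column.
  def joinOn(
      left: DataFrame,
      right: DataFrame,
      joinKeys: Seq[String],
      extraCondition: Option[Column] = None,
      joinType: String = "inner"
  ): DataFrame = {
    require(joinKeys.nonEmpty, "joinKeys must not be empty")
    // One equality per key, combined with &&
    val keyExpr: Column = joinKeys.map(k => left(k) === right(k)).reduce(_ && _)
    // Append the extra condition only when one was supplied
    val fullExpr: Column = extraCondition.fold(keyExpr)(keyExpr && _)
    left.join(right, fullExpr, joinType)
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; in a notebook `spark` already exists
    val spark = SparkSession.builder().master("local[1]").appName("dynamic-join").getOrCreate()
    import spark.implicits._

    val columns = Seq("emp_id", "name", "dept_id")
    val data = Seq((1, "Smith", "10"), (2, "Rose", "20"))
    val empDF = data.toDF(columns: _*)
    val empDF2 = data.toDF(columns: _*)

    joinOn(empDF, empDF2, Seq("emp_id", "name"), Some(empDF2("dept_id") === "10")).show(false)
    spark.stop()
  }
}
```

Using `Option[Column]` keeps the plain key-only join available without a second overload; callers that have no extra condition simply omit the argument.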

3 Comments

Thanks Raphael. Will try that. What if I want to pass joinExprs && empDF2("dept_id") === 10 as a parameter? What type would that be?
It should be of type Column.
So it would be something like val joinExprCol = empDF2("dept_id") === 10? That looks a little strange.
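
If the condition has to arrive as a string, as the question suggests, one possible approach is to parse it with `org.apache.spark.sql.functions.expr`, which turns a SQL expression string into a `Column`. The sketch below assumes the two dataframes are aliased as "l" and "r" so the string can refer to either side; the aliases, the object name `StringConditionJoin`, and the helper `joinWithSqlCondition` are hypothetical choices. Note that the string uses SQL syntax (`=`, `AND`), not the Scala `===` operator.

```scala
import org.apache.spark.sql.{Column, DataFrame}
import org.apache.spark.sql.functions.expr

object StringConditionJoin {
  // Hypothetical helper: `condition` is a SQL expression string that may
  // reference the left side as "l" and the right side as "r",
  // for example "r.dept_id = '10'".
  def joinWithSqlCondition(
      left: DataFrame,
      right: DataFrame,
      joinKeys: Seq[String],
      condition: String,
      joinType: String = "inner"
  ): DataFrame = {
    require(joinKeys.nonEmpty, "joinKeys must not be empty")
    val l = left.as("l")
    val r = right.as("r")
    // Equality on every join key, combined with &&
    val keyExpr: Column = joinKeys.map(k => l(k) === r(k)).reduce(_ && _)
    // expr parses the SQL string into a Column, so it composes with && as usual
    l.join(r, keyExpr && expr(condition), joinType)
  }
}
```

A call would then look like `StringConditionJoin.joinWithSqlCondition(empDF, empDF2, Seq("emp_id", "name"), "r.dept_id = '10'")`.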
