I have two DataFrames, df1 and df2.
df1 has a column Name with values like a, b, c, etc.
df2 has a column Id with values like a, b.
If a value in the Name column of df1 has a match in the Id column of df2, the match status should be 0. If there is no match, the match status should be 1.
I know that I can put the df2 Id column into a collection using collect and then check whether the Name column in df1 has a matching entry.
val df1 = Seq("Rey", "John").toDF("Name")
val df2 = Seq("Rey").toDF("Id")
val collect = df2.select("Id").map(r => r.getString(0)).collect.toList
Something like:
val df3 = df1.withColumn("match_sts", when(df1("Name").isin(collect: _*), 0).otherwise(1))
Expected output:
+----+---------+
|Name|match_sts|
+----+---------+
| Rey|        0|
|John|        1|
+----+---------+
But I don't want to use collect here. Is there an alternative approach available?
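One collect-free direction that might work is a left outer join of df1 against the distinct Id values, deriving the status from whether the joined Id is null. The sketch below is only an illustration under assumptions: the object name MatchStatusSketch, the helper withMatchStatus, and the local SparkSession settings are hypothetical and not part of my actual job.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, when}

// Hypothetical wrapper object, used only to keep the sketch self-contained.
object MatchStatusSketch {

  // Hypothetical helper: flags each Name with 0 if it appears in ids("Id"), else 1.
  def withMatchStatus(names: DataFrame, ids: DataFrame): DataFrame = {
    // distinct() so that duplicate Ids do not duplicate rows from `names`
    val distinctIds = ids.select("Id").distinct()

    names
      .join(distinctIds, names("Name") === distinctIds("Id"), "left_outer")
      // After a left outer join, Id is null exactly where no match was found
      .withColumn("match_sts", when(col("Id").isNotNull, 0).otherwise(1))
      .select("Name", "match_sts")
  }

  def main(args: Array[String]): Unit = {
    // Assumed local session, for trying the sketch outside a cluster
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("match-status-sketch")
      .getOrCreate()
    import spark.implicits._

    val df1 = Seq("Rey", "John").toDF("Name")
    val df2 = Seq("Rey").toDF("Id")

    withMatchStatus(df1, df2).show()
    spark.stop()
  }
}
```

Everything stays on the executors, so nothing is pulled to the driver. If df2 is small, wrapping distinctIds in org.apache.spark.sql.functions.broadcast may avoid a shuffle, though I have not measured that here. Row order in the result is not guaranteed.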