I have two dataframes, in which I want to know whether they are different based on column as a key otherwise will update the one that is different so they become equal
val TMP_SITE = spark.load("jdbc", Map("url" -> "jdbc:oracle:thin:System/maher@//localhost:1521/XE", "dbtable" -> "IPTECH.TMP_SITE"))
.withColumn("SITE",'SITE.cast(LongType))
val local_pos = spark.load("jdbc", Map("url" -> url, "dbtable" -> "pos")).select("id","name")
TMP_SITE.printSchema()
local_pos.printSchema()
val join = TMP_SITE.join(local_pos, 'SITE === 'id, "inner")
root
|-- SITE: long (nullable = true)
|-- LIBELLE: string (nullable = false)
root
|-- id: long (nullable = false)
|-- name: string (nullable = true)
the result of joining is
|id |name |SITE|LIBELLE |
+---+----------------------+----+----------------------+
|51 |Ezzahra |51 |Ezzahra |
|7 |BENIKHALLED |7 |BENIKHALLED |
|15 |Kram |15 |Kram |
|54 |El Mourouj |54 |El Mourouj |
|11 |LE BARDO |11 |LE BARDO |
|29 |Mini M Ksar said |29 |Mini M Ksar said |
|69 |ZAGHOUAN |69 |ZAGHOUAN |
|42 |BEB EL KHADHRA |42 |BEB EL KHADHRA |
|73 |Zaouit Kontech |73 |Zaouit Kontech |
|87 |Aouina |87 |Aouina |
|64 |Sousse I I |64 |Sousse I I |
|3 |SAHRA CONFORT : KORBA |3 |SAHRA CONFORT : KORBA |
|34 |SOUKRA SQUARE |34 |SOUKRA SQUARE |
|59 |SAHRA CONFORT : ZARZIS|59 |SAHRA CONFORT : ZARZIS|
|8 |Jerba |8 |Jerba |
|22 |Moknine |22 |Moknine |
|28 |RDAYEF |28 |RDAYEF |
|85 |MONASTIR ABSORBA |85 |MONASTIR ABSORBA |
|16 |BARDO HANAYA |16 |BARDO HANAYA |
|35 |Mini M Agba |35 |Mini M Agba |
+---+----------------------+----+----------------------+
I did this
val temp = join.withColumn("changes", when($"LIBELLE" === $"name", lit("nothing")).otherwise("need an update"))
I got this
|id |name |SITE|LIBELLE |changes |
+---+----------------------+----+----------------------+--------------+
|51 |Ezzahra |51 |Ezzahra |nothing |
|7 |BENIKHALLED |7 |BENIKHALLED |nothing |
|15 |Kram |15 |Kram |nothing |
|54 |El Mourouj |54 |El Mourouj |nothing |
|11 |LE BARDO |11 |LE BARDO |nothing |
|29 |Mini M Ksar said |29 |Mini M Ksar said |nothing |
|69 |ZAGHOUAN |69 |ZAGHOUAN |nothing |
|42 |BEB EL KHADHRA |42 |BEB EL KHADHRA |nothing |
|73 |Zaouit Kontech |73 |Zaouit Kontech |need an update|
|87 |Aouina |87 |Aouina |nothing |
|64 |Sousse I I |64 |Sousse I I |nothing |
|3 |SAHRA CONFORT : KORBA |3 |SAHRA CONFORT : KORBA |nothing |
|34 |SOUKRA SQUARE |34 |SOUKRA SQUARE |nothing |
|59 |SAHRA CONFORT : ZARZIS|59 |SAHRA CONFORT : ZARZIS|nothing |
|8 |Jerba |8 |Jerba |nothing |
|22 |Moknine |22 |Moknine |need an update|
|28 |RDAYEF |28 |RDAYEF |nothing |
|85 |MONASTIR ABSORBA |85 |MONASTIR ABSORBA |nothing |
|16 |BARDO HANAYA |16 |BARDO HANAYA |nothing |
|35 |Mini M Agba |35 |Mini M Agba |nothing |
+---+----------------------+----+----------------------+--------------+
I do not why it said that they need to be updated because they are the same. Although it should say nothing for all of them, because they are equal