
I have a dataframe with a column named source_system that has the values contained in the keys of this Map:

val convertSourceSystem = Map (
        "HON_MUREX3FXFI"  -> "MX3_FXFI",
        "MAD_MUREX3FXFI"  -> "MX3_FXFI",
        "MEX_MUREX3FXFI"  -> "MX3_LT",
        "MX3BRASIL"       -> "MX3_BR",
        "MX3EUROPAEQ_MAD" -> "MX3_EQ",
        "MX3EUROPAEQ_POL" -> "MX3_EQ",
        "MXEUROPA_MAD"    -> "MX2_EU",
        "MXEUROPA_PT"     -> "MX2_EU",
        "MXEUROPA_UK"     -> "MX2_EU",
        "MXLATAM_CHI"     -> "MX2_LT",
        "MXLATAM_NEW"     -> "MX2_LT",
        "MXLATAM_SOV"     -> "MX2_LT",
        "POR_MUREX3FXFI"  -> "MX3_FXFI",
        "SHN_MUREX3FXFI"  -> "MX3_FXFI",
        "UK_MUREX3FXFI"   -> "MX3_FXFI",
        "SOV_MX3LATAM"    -> "MX3_LT"
    )

I need to replace them with the short codes, but using a foldLeft with withColumn gives me only null values, because it is replacing all the values and the last source_system is not in the map:

val ssReplacedDf = irisToCreamSourceSystem.foldLeft(tempDf) { (acc, filter) =>
      acc.withColumn("source_system", when( col("source_system").equalTo(lit(filter._1)),
          lit(filter._2)))
    }

1 Answer


I would suggest another solution, joining with the translation table:

// assumes a SparkSession named `spark` is in scope (needed for toDF and $)
import spark.implicits._
import org.apache.spark.sql.functions.broadcast

// convert Map to a DataFrame
val convertSourceSystemDF = convertSourceSystem.toSeq.toDF("source_system", "source_system_short")

tempDf.join(broadcast(convertSourceSystemDF), Seq("source_system"), "left")
  // override column with short name, alternatively use withColumnRenamed
  .withColumn("source_system", $"source_system_short")
  .drop("source_system_short")
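If you would rather keep the foldLeft, one possible sketch is to fold the Map into a single Column expression instead of folding over the DataFrame. In the original attempt every step overwrites source_system, and a `when` without `otherwise` yields null for rows that do not match, so nothing survives the later steps. The version below always compares against the original column and falls back to it when no key matches. The object name `SourceSystemMapping`, the helper `shortCode`, the shortened Map, and the sample rows are made up for illustration, and the local SparkSession is only an assumption to make the example self-contained.

```scala
import org.apache.spark.sql.{Column, SparkSession}
import org.apache.spark.sql.functions.{col, lit, when}

object SourceSystemMapping {
  // Shortened copy of the Map from the question, for illustration only.
  val convertSourceSystem: Map[String, String] = Map(
    "HON_MUREX3FXFI" -> "MX3_FXFI",
    "MX3BRASIL"      -> "MX3_BR",
    "MXEUROPA_MAD"   -> "MX2_EU"
  )

  // Hypothetical helper: builds one nested when(...).otherwise(...) expression.
  // The fold starts from the original column, so values without a mapping
  // pass through unchanged instead of becoming null.
  def shortCode(column: String, mapping: Map[String, String]): Column =
    mapping.foldLeft(col(column)) { case (acc, (longName, shortName)) =>
      when(col(column) === lit(longName), lit(shortName)).otherwise(acc)
    }

  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session, just to run the example.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("source-system-mapping")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data; the last value has no entry in the Map.
    val tempDf = Seq("HON_MUREX3FXFI", "MX3BRASIL", "NOT_IN_MAP").toDF("source_system")

    // A single withColumn, evaluated against the original values.
    val ssReplacedDf =
      tempDf.withColumn("source_system", shortCode("source_system", convertSourceSystem))

    ssReplacedDf.show(false)
    spark.stop()
  }
}
```

Because every branch reads the original value of the column, the order in which the Map entries are folded does not matter, and the whole replacement is a single projection rather than one withColumn per key. For a large translation table the broadcast join above is probably still the better fit.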