You can use a simple select in combination with a Scala Map. It is easier to handle the column renames via a dictionary (Map) in which each key is the old name and each value is the new name.
Let's first create the two datasets as you described them:
import spark.implicits._ // enables toDF on a Seq; already in scope in spark-shell

val df1 = Seq(
  ("toto", 23, "g", "2010-06-09"),
  ("bla", 35, "s", "1990-10-01"),
  ("pino", 12, "a", "1995-10-05")
).toDF("fname", "age", "class", "dob")

val df2 = Seq(
  ("toto", 23, "g", "2010-06-09"),
  ("bla", 35, "s", "1990-10-01"),
  ("pino", 12, "a", "1995-10-05")
).toDF("f_name", "user_age", "class", "DataofBith")
Next, we define the mapping and a Scala function named transform, which accepts two arguments: the target DataFrame and the mapping that holds the rename details:
import org.apache.spark.sql.DataFrame

// old column name -> new column name
val mapping = Map(
  "fname" -> "first_name",
  "f_name" -> "first_name",
  "user_age" -> "age",
  "DataofBith" -> "dob"
)

def transform(df: DataFrame, mapping: Map[String, String]): DataFrame = {
  val cols = df.columns.map { c =>
    if (mapping.contains(c))
      df(c).as(mapping(c)) // alias the column to its new name
    else
      df(c) // not in the mapping: keep as-is
  }
  df.select(cols: _*)
}
The function iterates over the DataFrame's columns and checks whether each one appears in the mapping. If it does, the column is renamed to the corresponding value from the dictionary; otherwise it is left untouched. Note that this only renames the columns (via alias) in a single select, so we don't expect it to affect performance.
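The lookup rule described above can also be sketched without Spark, working on column names only. This is a minimal illustration, not part of the function above: `RenamePlan`, `renamed`, and `collisions` are hypothetical names, not Spark API. The `collisions` helper covers one case worth checking: if a single DataFrame contained both `fname` and `f_name`, both would map to `first_name` and the select would produce duplicate column names.

```scala
// Spark-free sketch of the renaming rule; all names here are hypothetical.
object RenamePlan {
  // Look each column up in the mapping; fall back to the original name.
  def renamed(columns: Seq[String], mapping: Map[String, String]): Seq[String] =
    columns.map(c => mapping.getOrElse(c, c))

  // Target names that would appear more than once after renaming.
  def collisions(columns: Seq[String], mapping: Map[String, String]): Set[String] =
    renamed(columns, mapping)
      .groupBy(identity)
      .collect { case (name, hits) if hits.size > 1 => name }
      .toSet
}
```

With the mapping above, `RenamePlan.renamed(Seq("f_name", "user_age", "class", "DataofBith"), mapping)` gives `Seq("first_name", "age", "class", "dob")`. Running `collisions` on `df.columns` before calling transform is one way to catch an ambiguous mapping early.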
Finally, some examples:
val newDF1 = transform(df1, mapping)
newDF1.show
// +----------+---+-----+----------+
// |first_name|age|class|       dob|
// +----------+---+-----+----------+
// |      toto| 23|    g|2010-06-09|
// |       bla| 35|    s|1990-10-01|
// |      pino| 12|    a|1995-10-05|
// +----------+---+-----+----------+
val newDF2 = transform(df2, mapping)
newDF2.show
// +----------+---+-----+----------+
// |first_name|age|class|       dob|
// +----------+---+-----+----------+
// |      toto| 23|    g|2010-06-09|
// |       bla| 35|    s|1990-10-01|
// |      pino| 12|    a|1995-10-05|
// +----------+---+-----+----------+