
I would like to merge two Spark DataFrames (Scala). The first DataFrame contains only one row; the second has multiple rows. I would like to merge them, copying the address/phone column values from the first DataFrame to all rows of the second. Is there a way to do this using Spark operations?

DF1

name age address phone
ABC  25  XYZ     00000

DF2

    name   age

    Bill   30
    Steve  40
    Jackie 50

Final DF

name  age address phone
ABC    25  XYZ     00000
Bill   30  XYZ     00000
Steve  40  XYZ     00000
Jackie 50  XYZ     00000

2 Answers


There is a simple way to do it:

import org.apache.spark.sql.functions.lit

// Pull the single row's address and phone values to the driver
val row = df1.select("address", "phone").collect()(0)

// Add them as constant columns to df2, then append df1's own row
val finalDF = df2.withColumn("address", lit(row(0)))
  .withColumn("phone", lit(row(1)))
  .union(df1)
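For intuition, the same "broadcast one row to many" logic can be sketched with plain Scala collections, no Spark required. This is only an illustration; `Person` and `enrich` are hypothetical names, not part of the Spark API or the answer above.

```scala
// Plain-Scala sketch of the operation: copy the single row's address/phone
// onto every (name, age) pair, then prepend the single row itself,
// mirroring the withColumn + union above.
case class Person(name: String, age: Int, address: String, phone: String)

def enrich(single: Person, others: Seq[(String, Int)]): Seq[Person] =
  single +: others.map { case (n, a) => Person(n, a, single.address, single.phone) }

val abc = Person("ABC", 25, "XYZ", "00000")
val merged = enrich(abc, Seq("Bill" -> 30, "Steve" -> 40, "Jackie" -> 50))
merged.foreach(println)
```

Running it prints four `Person` rows, each carrying the address `XYZ` and phone `00000` from the single source row.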


Since df1 has only one row, you can also crossJoin df2 with the address and phone columns of df1, then union df1 back in:

val results = df2
  .crossJoin(df1.select("address", "phone"))
  .union(df1)

