How to add/append a new row to a DataFrame in Scala without using a SQL insert?

Question

I have a DataFrame created in the following way.

val someDF = Seq((8, "bat"),(64, "mouse"),(-27, "horse")).toDF("number", "word")
someDF.printSchema
root
 |-- number: integer (nullable = false)
 |-- word: string (nullable = true)

Using the SQL API, one can insert a row by creating a temp table and running an insert query. Is there a way to append/add a new row using only methods of the DataFrame API?

Note: DataFrames/Datasets are part of what is called "Spark SQL". You should say "in Spark SQL, how to [...] without using the SQL API/SQL statements".
– ebonnal, Commented Jan 23, 2020 at 13:59

sachav · Accepted Answer · 2020-01-23 12:53:21Z


You can use union:

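// Outside spark-shell, toDF needs `import spark.implicits._` in scope.
val someDF = Seq((8, "bat"), (64, "mouse"), (-27, "horse")).toDF("number", "word")
// union matches columns by position, so the new row uses the same column order.
someDF.union(Seq((10, "dog")).toDF("number", "word")).show()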
/*
+------+-----+
|number| word|
+------+-----+
|     8|  bat|
|    64|mouse|
|   -27|horse|
|    10|  dog|
+------+-----+
*/
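
If the new row should carry exactly the same column names and types as the existing DataFrame, one option is to build the one-row DataFrame from `someDF.schema`. The following is only a sketch, assuming a local `SparkSession`; the names `spark`, `newRow` and `appended` are illustrative and not part of the original answer.

```scala
import org.apache.spark.sql.{Row, SparkSession}

// Assumption: a local session for illustration; in spark-shell `spark` already exists.
val spark = SparkSession.builder().master("local[*]").appName("append-row").getOrCreate()
import spark.implicits._

val someDF = Seq((8, "bat"), (64, "mouse"), (-27, "horse")).toDF("number", "word")

// Build a one-row DataFrame that reuses someDF's schema,
// so column names, types and nullability match the original.
val newRow = spark.createDataFrame(
  spark.sparkContext.parallelize(Seq(Row(10, "dog"))),
  someDF.schema
)

// DataFrames are immutable: union returns a new DataFrame and leaves someDF unchanged.
val appended = someDF.union(newRow)
appended.show()
```

Because `union` returns a new DataFrame, the result has to be assigned to a value (here `appended`) if it is needed later.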


1 Comment

user238607 Over a year ago

One caveat worth adding: the columns must be in the same order in both DataFrames. Spark doesn't warn you if the column names are in a different order; it only complains if there is a column type mismatch. Something to keep in mind.
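
For the column-order caveat above, `unionByName` (available since Spark 2.3) resolves columns by name instead of by position. This is a minimal sketch, assuming a local `SparkSession`; the names `spark`, `swapped` and `byName` are hypothetical and chosen only for this example.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session for illustration; in spark-shell `spark` already exists.
val spark = SparkSession.builder().master("local[*]").appName("union-by-name").getOrCreate()
import spark.implicits._

val someDF = Seq((8, "bat"), (64, "mouse"), (-27, "horse")).toDF("number", "word")

// Same columns as someDF, but declared in the opposite order.
val swapped = Seq(("dog", 10)).toDF("word", "number")

// unionByName lines columns up by name, so "number" pairs with "number"
// and "word" with "word" regardless of their position.
val byName = someDF.unionByName(swapped)
byName.show()
```

The result keeps the column order of the left-hand DataFrame (`number`, `word`).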
