
I have a DataFrame created in the following way.

val someDF = Seq((8, "bat"),(64, "mouse"),(-27, "horse")).toDF("number", "word")
someDF.printSchema
root
 |-- number: integer (nullable = false)
 |-- word: string (nullable = true)

Using the SQL API, one can insert a row into it by creating a temp table and running an insert query. Is there a way to append a new row using methods of the DataFrame API?

  • Do you want to add a row or a column? Commented Jan 23, 2020 at 13:01
  • Sorry. It should be row. I made changes to the question. Commented Jan 23, 2020 at 13:03
  • Note: DataFrames/Datasets are part of what is called "Spark SQL". You should say "in Spark SQL, how to [...] without using SQL API/SQL statement" Commented Jan 23, 2020 at 13:59
  • @EnzoBnl, Sure. I will use the terms properly from here on. Commented Jan 23, 2020 at 15:52

1 Answer


You can use union:

val someDF = Seq((8, "bat"),(64, "mouse"),(-27, "horse")).toDF("number", "word")
someDF.union(Seq((10, "dog")).toDF).show
/*
+------+-----+
|number| word|
+------+-----+
|     8|  bat|
|    64|mouse|
|   -27|horse|
|    10|  dog|
+------+-----+
*/
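
For use outside the Spark shell, the same idea can be written as a small helper. This is a minimal sketch, assuming a local SparkSession; the object name `AppendRow`, the method `appendRow`, the app name, and the `local[*]` master are illustrative and not part of the original answer. It builds the one-row DataFrame from the original's schema so that column names and nullability line up, then unions it. DataFrames are immutable, so `union` returns a new DataFrame and leaves `someDF` unchanged.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}

object AppendRow {
  // Hypothetical helper: returns a new DataFrame with `row` appended.
  // The values in `row` must match df's schema in order and type.
  def appendRow(spark: SparkSession, df: DataFrame, row: Row): DataFrame = {
    val oneRow = spark.createDataFrame(java.util.Arrays.asList(row), df.schema)
    df.union(oneRow)
  }

  def main(args: Array[String]): Unit = {
    // Assumed local setup, for illustration only
    val spark = SparkSession.builder()
      .appName("append-row")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val someDF = Seq((8, "bat"), (64, "mouse"), (-27, "horse")).toDF("number", "word")
    appendRow(spark, someDF, Row(10, "dog")).show()

    spark.stop()
  }
}
```

Reassign the result (for example `val extended = AppendRow.appendRow(spark, someDF, Row(10, "dog"))`) if you need to keep working with the extended data.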

1 Comment

It is worth adding a caveat: union matches columns by position, not by name, so the column order must be the same in both DataFrames. Spark does not warn you if the column names are in a different order; it complains only when the column types are incompatible. Something to keep in mind.
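
One way to sidestep the ordering problem is `unionByName`, available on Datasets since Spark 2.3, which aligns columns by name instead of by position. The sketch below assumes a local SparkSession; the object name `UnionOrder`, the method `appendByName`, and the `swapped` DataFrame are made up for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object UnionOrder {
  // Hypothetical helper: appends `other` to `df`, matching columns by name.
  def appendByName(df: DataFrame, other: DataFrame): DataFrame =
    df.unionByName(other)

  def main(args: Array[String]): Unit = {
    // Assumed local setup, for illustration only
    val spark = SparkSession.builder()
      .appName("union-order")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val someDF = Seq((8, "bat"), (64, "mouse"), (-27, "horse")).toDF("number", "word")

    // Same columns as someDF, but declared in the opposite order.
    // A plain union would pair "number" with "word" purely by position.
    val swapped = Seq(("dog", 10)).toDF("word", "number")

    // The result keeps someDF's column order: number, word
    appendByName(someDF, swapped).show()

    spark.stop()
  }
}
```

`unionByName` throws an exception if a column name present in one DataFrame is missing from the other, so a mismatch fails loudly instead of silently mixing up values.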
