2

How can I do a simple INSERT in Spark SQL (Spark 2.1)?

I can run simple SQL queries inside Spark with spark.sql, but I cannot get a plain INSERT to work.

  from pyspark.sql import SparkSession
  spark = SparkSession.builder.appName('Basics').getOrCreate()
  df = spark.read.json('/path/./people.json')

  df.show()

  +-----+---------+   
  |age  | name    |
  +-----+---------+
  |null | Michael |
  | 30  | Andy    |
  +-----+---------+    

  df.createOrReplaceTempView('people')  # register the DataFrame as a temporary view

  spark.sql("SELECT * FROM people WHERE age == 30").show()

  +-----+---------+   
  |age  | name    |
  +-----+---------+
  | 30  | Andy    |
  +-----+---------+ 

So I understand the SQL side, but I don't know how to do an INSERT.

I have tried every way I could think of.

2
  • 1
    As far as I know, it depends on the database you are writing to, as each has its own connector (either an existing one, or one that needs to be written). Also, the answer may differ between Spark, Spark Direct Streaming and Spark Structured Streaming. P.S. To reply, use "@DannyVarod" at the beginning of your comment. Commented Oct 24, 2018 at 16:02
  • 1
    @DannyVarod Thanks for the answer. I'm not using any database. It's a DataFrame that I register as a table, and spark.sql then lets me run SQL against it. That works for SELECT, but I'm trying to do an INSERT. Commented Oct 24, 2018 at 16:09

1 Answer

3

You don't insert into DataFrames; they are immutable and lazily evaluated.

Instead, create a new DataFrame that is the union of the original DataFrame and the new data you want to add.
