
I am new to Databricks and am trying to write query results to an Excel/CSV file using the command below, but I get errors when executing it.

I am using a notebook to run my SQL queries and now want to store the results in a CSV or Excel file.

 select * from tablename
 df.write.format("csv").save("/tmp/spark_output/datacsv")

1 Answer


After creating the SQL table, you can use spark.table("mytable") or spark.sql("select * from mytable") to load it as a DataFrame.

This is my sample SQL table; both spark.table("mytable") and spark.sql("select * from mytable") return it as the same DataFrame.
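A minimal sketch of both calls, assuming a table named mytable already exists in the metastore (the variable name df1 is chosen to match the write step below):

# Load an existing SQL table as a DataFrame by name
df1 = spark.table("mytable")
df1.show()

# Equivalent: run a SQL query and get the result back as a DataFrame
df1 = spark.sql("select * from mytable")
df1.show()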

Then save the dataframe as csv using your code.

df1.write.format("csv").mode("overwrite").save("/tmp/spark_output/datacsv")

But with this approach, Spark writes the output as a directory containing multiple CSV part files rather than a single file.

To get a single CSV file you can use coalesce(1), as sketched below; alternatively, if your data is small, you can convert it to pandas.

import pandas
# Convert the Spark DataFrame to a pandas DataFrame (only for data that fits in driver memory)
pandas_converted = df1.toPandas()
# Write a single CSV through the local file API (note the /dbfs prefix)
pandas_converted.to_csv("/dbfs/tmp/spark_output/mycsv.csv")


Make sure you add /dbfs at the start of the path; otherwise pandas, which goes through the local file API, won't be able to find it.

A single CSV file now appears in the path.
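As a quick check, assuming the output paths used above, you can list the directory both ways; dbutils takes the DBFS path, while pandas and other local-file tools need the /dbfs prefix:

# DBFS path, as seen by Spark and dbutils (no /dbfs prefix)
display(dbutils.fs.ls("/tmp/spark_output/"))

# The same directory through the local file API used by pandas (note the /dbfs prefix)
import os
print(os.listdir("/dbfs/tmp/spark_output/"))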

SOURCES and REFERENCES:

"Writing to a CSV file" by Jeremy Peach (analyticjeremy).


9 Comments

Error in SQL statement: ParseException: mismatched input 'spark' expecting == SQL == spark.table("SELECT * FROM customer").show()
In spark.table("table_name") you have to give your SQL table name, i.e. customer in your case. The query select * from customer should go inside spark.sql(). Please check my code above for both methods.
I have tried df = spark.sql('select * from customer') and df.show(), but it is still failing. I am using a Databricks notebook and the language is SQL.
spark.sql() and spark.table() are Spark (Python) methods. Change the language of that particular cell to Python by clicking "SQL" at the top right of the cell, as shown in this link, and try the code. For creating the SQL table you can switch the other cell to SQL.
By default, every cell uses the language you chose when creating the notebook, SQL in your case. You can set whatever language you want for each cell, as shown in that link. To run the above PySpark code, the cell must be Python.
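For example, in a notebook whose default language is SQL, a cell like this runs the PySpark code (a sketch assuming the customer table from the comments above; the %python magic on the first line switches just that cell to Python):

%python
df = spark.sql("select * from customer")
df.show()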
