
I am new to Databricks and am trying to write query results to an Excel/CSV file using the command below, but I get errors when executing it.

I am using a notebook to run my SQL queries and now want to store the results in a CSV or Excel file.

 select * from tablename
 df.write.format("csv").save("/tmp/spark_output/datacsv")

1 Answer


After creating the SQL table, you can use spark.table("mytable") or spark.sql("select * from mytable") to load it as a DataFrame.

This is my sample SQL table; both spark.table("mytable") and spark.sql("select * from mytable") return it as the same DataFrame.
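A minimal sketch of both calls, assuming a table named mytable already exists in the metastore (the variable name df1 is chosen to match the write step below):

# Load an existing SQL table as a DataFrame by name
df1 = spark.table("mytable")
df1.show()

# Equivalent: run a SQL query and get the result back as a DataFrame
df1 = spark.sql("select * from mytable")
df1.show()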

Then save the dataframe as csv using your code.

df1.write.format("csv").mode("overwrite").save("/tmp/spark_output/datacsv")

But with this approach, Spark writes the output as a directory containing multiple CSV part files rather than a single file.

To get a single CSV file you can use coalesce(1), as sketched below; alternatively, if your data is small, you can convert it to pandas.

import pandas
# Convert the Spark DataFrame to a pandas DataFrame (only for data that fits in driver memory)
pandas_converted = df1.toPandas()
# Write a single CSV through the local file API (note the /dbfs prefix)
pandas_converted.to_csv("/dbfs/tmp/spark_output/mycsv.csv")


Make sure you add /dbfs at the start of the path; otherwise pandas, which goes through the local file API, won't be able to find it.

A single CSV file now appears in the path.
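As a quick check, assuming the output paths used above, you can list the directory both ways; dbutils takes the DBFS path, while pandas and other local-file tools need the /dbfs prefix:

# DBFS path, as seen by Spark and dbutils (no /dbfs prefix)
display(dbutils.fs.ls("/tmp/spark_output/"))

# The same directory through the local file API used by pandas (note the /dbfs prefix)
import os
print(os.listdir("/dbfs/tmp/spark_output/"))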

SOURCES and REFERENCES:

"Writing to a CSV file" by Jeremy Peach (analyticjeremy).


9 Comments

Error in SQL statement: ParseException: mismatched input 'spark' expecting == SQL == spark.table("SELECT * FROM customer").show()
In spark.table("table_name") you have to give your SQL table name, i.e. customer in your case. The query select * from customer should go inside spark.sql(). Please check my code above for both methods.
I have tried df = spark.sql('select * from customer') and df.show(), but it is still failing. I am using a Databricks notebook and the language is SQL.
spark.sql() and spark.table() are Spark (Python) methods. Change the language of that particular cell to Python by clicking "SQL" at the top right of the cell, as shown in this link, and try the code. For creating the SQL table you can switch the other cell to SQL.
By default, every cell uses the language you chose when creating the notebook, SQL in your case. You can set whatever language you want for each cell, as shown in that link. To run the above PySpark code, the cell must be Python.
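For example, in a notebook whose default language is SQL, a cell like this runs the PySpark code (a sketch assuming the customer table from the comments above; the %python magic on the first line switches just that cell to Python):

%python
df = spark.sql("select * from customer")
df.show()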
