
I am working in a PySpark environment in Databricks and have a PySpark DataFrame, which I will call df.

I need to write this Spark DataFrame to a CSV file, but I am unable to do so. No error pops up, yet the DataFrame doesn't get copied into the CSV. Below is the generic code:

path = " "  # CSV file location
header = "This is the header of the file"
with open(path, "a") as f:
    f.write(header + "\n")
    df.write.csv(path=path, format="csv", mode="append")

Only the header gets written to the file, not the DataFrame.

1 Answer


You can write your DataFrame as CSV using this:

df.coalesce(1).write.format("com.databricks.spark.csv").option("header", "true").save("dbfs:/FileStore/df.csv")

coalesce(1) merges the DataFrame into a single partition, so Spark writes one CSV file instead of many part files. You can pass your own path as the parameter to save(). Note that mixing Python's open() with df.write, as in your snippet, doesn't work: Spark writes to the path itself (as a directory of part files on DBFS), independently of the locally opened file handle, so the two writes never end up in the same file.
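If you also need the custom text line from your question prepended above the data (not just the column headers that option("header", "true") produces), one approach is to collect the rows to the driver and write them with Python's standard csv module. This is a sketch, not Spark code: the rows list below is a hypothetical stand-in for what df.collect() would return, and it only works when the DataFrame is small enough to fit in driver memory.

```python
import csv
import io

# Hypothetical stand-in for df.collect(): column names, then data rows.
header = "This is the header of the file"
rows = [("a", "b"), (1, 3), (2, 4)]

buf = io.StringIO()                  # in place of open("/dbfs/...", "w")
buf.write(header + "\n")             # custom header line goes first
csv.writer(buf, lineterminator="\n").writerows(rows)
content = buf.getvalue()
```

For anything larger than driver memory, stick with the coalesce(1) approach above and prepend the header line in a separate post-processing step.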


