2

I have a dataframe with data as follows.

+---------------+-------+
|category       |marks  |
+---------------+-------+
|cricket        |1.0    |
|tennis         |1.0    |
|football       |2.0    |
+---------------+-------+

I want to write this DataFrame to a CSV file whose name is built from the current timestamp.

generatedDataFrame.write.mode ("append")
    .format("com.databricks.spark.csv").option("delimiter", ";").save("./src/main/resources-"+LocalDateTime.now()+".csv")

But this code does not work. It fails with the following error:

java.io.IOException: Mkdirs failed to create file

Is there a better way to achieve this with Scala and Spark? Also, even though I am trying to create a file named with the timestamp, the code creates a directory with that name, and inside it a CSV file with a random name. How can I give the CSV file itself the timestamp name instead of creating a directory?

2 Answers

2

DF.write.csv always creates a directory with the name you specify and places the output CSV part files inside it.

If you want a single CSV file as output, named with a timestamp, you can use the code below:

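import java.text.SimpleDateFormat
import java.util.Date
import org.apache.spark.sql._
import org.apache.hadoop.fs.{FileSystem, Path}

val spark = SparkSession.builder().master("local[*]").getOrCreate()
spark.sparkContext.setLogLevel("ERROR")

// Hadoop FileSystem handle, used below to locate and rename the part file
val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)

// coalesce(1) collapses the data into one partition, so a single part file is written
generatedDataFrame.coalesce(1).write.mode("append").csv("./src/main/resources/outputcsv/")

// Name of the first part file found in the output directory
val outFileName = fs.globStatus(new Path("./src/main/resources/outputcsv/part*"))(0).getPath.getName

// Minute-resolution timestamp without characters such as ':'
val timestamp = new SimpleDateFormat("yyyyMMddHHmm").format(new Date())

// Rename the part file to <timestamp>.csv inside the same directory
fs.rename(new Path(s"./src/main/resources/outputcsv/$outFileName"), new Path(s"./src/main/resources/outputcsv/${timestamp}.csv"))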
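
If the output only ever lands on the local filesystem, the same "find the part file, then rename it" step could also be done without the Hadoop API. The following is a minimal sketch under that assumption; the object RenamePartFile and the helper renamePart are hypothetical names, it does not work for HDFS paths, and it leaves any side files Spark may write (such as _SUCCESS or checksum files) untouched.

```scala
import java.nio.file.{Files, Path}
import java.text.SimpleDateFormat
import java.util.Date

object RenamePartFile {
  // Hypothetical helper: renames the single part* file in outputDir to <timestamp>.csv.
  // Fails fast if the directory holds zero or several part files.
  def renamePart(outputDir: Path, timestamp: String): Path = {
    val parts = Option(outputDir.toFile.listFiles())
      .getOrElse(Array.empty[java.io.File])
      .filter(_.getName.startsWith("part"))
    require(parts.length == 1, s"expected exactly one part file, found ${parts.length}")
    Files.move(parts.head.toPath, outputDir.resolve(s"$timestamp.csv"))
  }

  def main(args: Array[String]): Unit = {
    // Simulate a Spark output directory with one part file (no Spark needed for the demo)
    val dir = Files.createTempDirectory("outputcsv")
    Files.write(dir.resolve("part-00000-demo.csv"), "cricket;1.0\n".getBytes("UTF-8"))

    val stamp = new SimpleDateFormat("yyyyMMddHHmmss").format(new Date())
    println(renamePart(dir, stamp))
  }
}
```

The sketch uses second resolution in the timestamp; Files.move throws if the target already exists, so two runs within the same second would fail loudly rather than overwrite.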

-1

You should use src/main/resources rather than ./src/main/resources. You can check the permissions for directory creation from the command line. Also, using LocalDateTime.now() directly in the path produces something like "2021-03-01T13:39:09.646". I am not sure whether that is what you want, or whether it is even valid for HDFS paths (because of characters such as ":"), so I would suggest formatting the date as well.
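
As a rough illustration of the date-formatting suggestion, the sketch below builds a path from a LocalDateTime using a pattern that contains no ":" or "." characters. The object TimestampedName, the helper timestampedPath, and the pattern "yyyyMMdd-HHmmss" are assumptions chosen for the example, not something taken from the question.

```scala
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

object TimestampedName {
  // Assumed pattern: digits and a hyphen only, so no ':' or '.' ends up in the path
  val safeFormat: DateTimeFormatter = DateTimeFormatter.ofPattern("yyyyMMdd-HHmmss")

  // Hypothetical helper: joins a base directory and a formatted timestamp into a .csv path
  def timestampedPath(baseDir: String, now: LocalDateTime): String =
    s"$baseDir/${now.format(safeFormat)}.csv"

  def main(args: Array[String]): Unit = {
    val fixed = LocalDateTime.of(2021, 3, 1, 13, 39, 9)
    println(fixed.toString)                               // 2021-03-01T13:39:09
    println(timestampedPath("src/main/resources", fixed)) // src/main/resources/20210301-133909.csv
  }
}
```

The resulting string can be passed to save(...) in place of the raw LocalDateTime.now() concatenation; Spark will still treat it as a directory name, as described in the other answer.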
