I'm using Spark 2.4.1 with Scala and trying to write a DataFrame to a CSV file. It seems that null values are written to the CSV as "". Is it possible to remove those empty quotes?

    import org.apache.spark.sql.Row
    import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

    val data = Seq(
      Row(1, "a"),
      Row(5, "z"),
      Row(5, null)
    )

    val schema = StructType(
      List(
        StructField("num", IntegerType, true),
        StructField("letter", StringType, true)
      )
    )

    val df = spark.createDataFrame(
      spark.sparkContext.parallelize(data),
      schema
    )

    df.write.csv("location/")

The output looks like:

1,a
5,z
5,""

I want it to be:

1,a
5,z
5,

What should I do?

Thanks!

2 Answers

9

You can set the nullValue option on the writer (see the CSV-specific options in the DataFrameWriter documentation). The SaveMode is not related to the answer:

    import org.apache.spark.sql.SaveMode

    df.write
      .option("nullValue", null)   // emit nothing for null fields instead of ""
      .mode(SaveMode.Overwrite)
      .csv("location/")
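
To check the result without trusting a CSV reader to normalise it, one option is to write the question's DataFrame and read the part files back as plain text. The following is a minimal sketch, not a definitive test: the local SparkSession, the app name and the temporary output directory are assumptions made for illustration, and the exact output may differ between Spark versions.

```scala
import java.nio.file.Files

import org.apache.spark.sql.{Row, SaveMode, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

// Assumption: a local session is good enough for this check.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("csv-null-check")
  .getOrCreate()

// Same data and schema as in the question.
val schema = StructType(List(
  StructField("num", IntegerType, nullable = true),
  StructField("letter", StringType, nullable = true)
))
val df = spark.createDataFrame(
  spark.sparkContext.parallelize(Seq(Row(1, "a"), Row(5, "z"), Row(5, null))),
  schema
)

// Hypothetical scratch location; any writable path works.
val outDir = Files.createTempDirectory("csv-null-check").resolve("out").toString

df.write
  .option("nullValue", null)   // the option from the answer above
  .mode(SaveMode.Overwrite)
  .csv(outDir)

// Read the part files back as raw text, bypassing the CSV parser,
// so that any quoting in the files stays visible.
val lines = spark.read.textFile(outDir).collect().toSeq
lines.foreach(println)
```

If a line still shows "" for a column, the value may be an empty string rather than null. Spark 2.4 also added an emptyValue writer option that controls how empty strings are written, which may be worth trying in that case.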

-1

Try this:

df.write.option("nullValue", null).csv("location/")

1 Comment

Please include an explanation for your code as it makes your answer more helpful to future readers in general. In particular, it'd be great if you extend this answer with an explanation why this solution is a better approach than the one given in the accepted answer. See also the contribution guide for more details.
