0

I have a dataset with a bunch of BigDecimal values. I would like to output these records to a JSON file, but when I do the BigDecimal values will often be written with trailing zeros (123.4000000000000), but the spec we are must conform to does not allow this (for reasons I don't understand).

I am trying to see if there is a way to override how the data is printed to JSON. Currently, my best idea is to convert each record to a string using JACKSON and then writing the data using df.write().text(..) rather than JSON.

1 Answer 1

1

I suggest to convert Decimal type to String before writing to JSON.

Below code is in Scala, but you can use it in Java easily

import org.apache.spark.sql.types.StringType

# COLUMN_NAME is your DataFrame column name.

val new_df = df.withColumn('COLUMN_NAME_TMP', df.COLUMN_NAME.cast(StringType)).drop('COLUMN_NAME').withColumnRenamed('COLUMN_NAME_TMP', 'COLUMN_NAME')
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.