I'm attempting to write a Spark Scala DataFrame column as an array of bytes. My DataFrame has two columns: the first is a String, and the second is a Map[String, Long].
For example,
user_id | map
"ac2" | Map("c2" -> 1, "b3" -> 5)
I want to serialize the map column as an array of bytes. So far I've attempted to use Jackson with the following UDF:
val writeJackson = udf { x: Map[String, Long] =>
  jacksonWriter.writeValueAsBytes(x)
}
val df2 = df.withColumn("jacksonMap", writeJackson($"map"))
but this fails at runtime with:
java.io.NotSerializableException: com.fasterxml.jackson.module.paranamer.shaded.CachingParanamer
Is there a way to get this to work with Jackson, and if not, is there a different library that will let me serialize this Spark column as a byte array?
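For context, one workaround I've sketched (untested on my cluster, and assuming jacksonWriter comes from a jackson-module-scala ObjectMapper) is to build the mapper as a @transient lazy val inside a singleton object, so each executor constructs its own instance instead of Spark trying to serialize the driver's copy into the UDF closure:

```scala
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.module.scala.DefaultScalaModule
import org.apache.spark.sql.functions.udf

// Hypothetical holder object: @transient keeps the non-serializable
// mapper out of the serialized closure; lazy re-creates it on each JVM.
object JacksonHolder extends Serializable {
  @transient lazy val mapper: ObjectMapper =
    new ObjectMapper().registerModule(DefaultScalaModule)
}

// The UDF closure now only captures the (serializable) holder object.
val writeJackson = udf { x: Map[String, Long] =>
  JacksonHolder.mapper.writeValueAsBytes(x)
}
```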