-2

I have a column ORDER_DATE with epoch timestamp in string. How can I convert this column with string like str = "1536309236032" which is time in epoch to a string with format: 2018-09-07T14:03:56.032Z in Scala?

Currently I am using:

from_unixtime(input.col(ORDER_DATE), "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'")

but this converts incorrectly to 50668-08-21 01:10:00.000. Here, it increases the year and increments 000 for milliseconds.

I don't want to divide by 1000 as we would like to have the result in milli seconds.

4
  • Possible duplicate https://stackoverflow.com/questions/33475229/convert-epoch-to-datetime-in-scala-spark Commented Sep 18, 2018 at 13:25
  • I have made my question more clear and it is a different requirement from what link you mentioned. Can you help me in this? Commented Sep 18, 2018 at 14:36
  • From the length of your input time string, it's probably in milliseconds. Try: from_unixtime($"epochstr".cast("long")/1000, "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'") Commented Sep 18, 2018 at 16:01
  • But I want the output in milliseconds as well. Is there a way to get that? Commented Sep 18, 2018 at 17:18

2 Answers 2

1

In the documentation, the definition of from_unixtime is as follows:

Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone in the given format.

It uses seconds and are thus not compatible with milliseconds which is why the result is wrong. To convert the epoch timestamp and keeping the millisecond information, you can use concat:

val spark = SparkSession.builder.getOrCreate()
import spark.implicits._

df.withColumn("time", concat(
    from_unixtime($"ORDER_DATE"/1000, "yyyy-MM-dd'T'HH:mm:ss."), 
    $"ORDER_DATE".substr(length($"ORDER_DATE")-2, length($"ORDER_DATE")), 
    lit("Z")))

This will work since the last 3 digits in the epoch timestamp is the same as those in the wanted result.

Sign up to request clarification or add additional context in comments.

2 Comments

Can you also tell a way so that I perform this action only when the field in the column is not "0" otherwise it leaves the field empty?
@AbhinavKaushalKeshari: You can use when and otherwise, see for example here: stackoverflow.com/questions/37064315/…
0

I got the idea from @Shaido and I did something similar. Finally, this solved the issue for me:

input.withColumn("time",
concat(from_unixtime(input.col("ORDER_DATE")/1000, "yyyy-MM-dd'T'HH:mm:ss"), 
typedLit("."), substring(input.col("ORDER_DATE"), 11, 3), typedLit("Z")))

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.