
I have an issue with Spark SQL: when I cast a column from string to timestamp, the value becomes NULL. Below are the details:

val df2 = sql("""select FROM_UNIXTIME(UNIX_TIMESTAMP(to_date(LAST_DAY(ADD_MONTHS(CONCAT_WS('-','2018','10','01'),0))),'yyyy-MM-dd'),'yyyyMMdd HH:mm:ss')""")
df2: org.apache.spark.sql.DataFrame = [from_unixtime(unix_timestamp(to_date(last_day(add_months(CAST(concat_ws(-, 2018, 10, 01) AS DATE), 0))), yyyy-MM-dd), yyyyMMdd HH:mm:ss): string]


scala> df2.show
+----------------------------------------------------------------------------------------------------------------------------------------+
|from_unixtime(unix_timestamp(to_date(last_day(add_months(CAST(concat_ws(-, 2018, 10, 01) AS DATE), 0))), yyyy-MM-dd), yyyyMMdd HH:mm:ss)|
+----------------------------------------------------------------------------------------------------------------------------------------+
|                                                                                                                       20181001 00:00:00|
+----------------------------------------------------------------------------------------------------------------------------------------+

When I cast to timestamp explicitly, it doesn't give me the desired result:

val df2 = sql("""select cast(FROM_UNIXTIME(UNIX_TIMESTAMP(to_date(LAST_DAY(ADD_MONTHS(CONCAT_WS('-','2018','10','01'),0))),'yyyy-MM-dd'),'yyyyMMdd HH:mm:ss') as timestamp)""")
df2: org.apache.spark.sql.DataFrame = [CAST(from_unixtime(unix_timestamp(to_date(last_day(add_months(CAST(concat_ws(-, 2018, 10, 01) AS DATE), 0))), yyyy-MM-dd), yyyyMMdd HH:mm:ss) AS TIMESTAMP): timestamp]


scala> df2.show
+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
|CAST(from_unixtime(unix_timestamp(to_date(last_day(add_months(CAST(concat_ws(-, 2018, 10, 01) AS DATE), 0))), yyyy-MM-dd), yyyyMMdd HH:mm:ss) AS TIMESTAMP)|
+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
|                                                                                                                                                       null|
+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
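My guess is that the cast returns NULL because the intermediate string is `20181001 00:00:00`, while a string-to-timestamp cast expects a `yyyy-MM-dd HH:mm:ss`-style value. The same pattern mismatch can be reproduced outside Spark with plain java.time (illustration only):

```scala
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter
import scala.util.Try

val value = "20181001 00:00:00"

// Parsing with the default-style pattern fails, mirroring the NULL from the cast
val isoParse = Try(LocalDateTime.parse(value, DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")))
println(isoParse.isFailure)  // true

// Parsing with the matching pattern succeeds
val customParse = Try(LocalDateTime.parse(value, DateTimeFormatter.ofPattern("yyyyMMdd HH:mm:ss")))
println(customParse.isSuccess)  // true
```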

Any ideas on how to resolve this?

2 Answers


Try the following:

val df2 = spark.sql(
  """select CAST(unix_timestamp(FROM_UNIXTIME(UNIX_TIMESTAMP(to_date(LAST_DAY(ADD_MONTHS(CONCAT_WS('-','2018','10','01'),0))),'yyyy-MM-dd'),'yyyyMMdd HH:mm:ss'),'yyyyMMdd HH:mm:ss') as timestamp) as destination""")

df2.show(false)
df2.printSchema()

+-------------------+
|destination        |
+-------------------+
|2018-10-31 00:00:00|
+-------------------+

root
 |-- destination: timestamp (nullable = true)
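This works because `unix_timestamp(str, 'yyyyMMdd HH:mm:ss')` parses the non-ISO string into epoch seconds (a bigint), and casting a number to timestamp always succeeds. A minimal sketch of the same round trip with java.time (illustration only, using the JVM default zone as `unix_timestamp` does):

```scala
import java.time.{LocalDateTime, ZoneId}
import java.time.format.DateTimeFormatter

val fmt = DateTimeFormatter.ofPattern("yyyyMMdd HH:mm:ss")
val parsed = LocalDateTime.parse("20181031 00:00:00", fmt)

// unix_timestamp step: wall-clock time -> epoch seconds in the session zone
val zone = ZoneId.systemDefault()
val epoch = parsed.atZone(zone).toEpochSecond

// cast-to-timestamp step: epoch seconds -> timestamp, always well-defined
val ts = LocalDateTime.ofEpochSecond(epoch, 0, zone.getRules.getOffset(parsed))
println(ts)  // 2018-10-31T00:00
```

On Spark 2.2 and later, `to_timestamp(<string column>, 'yyyyMMdd HH:mm:ss')` does the parse-and-convert in a single call.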

I tried it like this, without using any explicit format strings:

val df2 = sql("""select cast(FROM_UNIXTIME(UNIX_TIMESTAMP(cast(LAST_DAY(ADD_MONTHS(CONCAT_WS('-','2018','12','31'),0)) as timestamp))) as timestamp)""")

scala> df2.show
+--------------------+
|2018-12-31 00:00:...|
+--------------------+
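For reference, `LAST_DAY(ADD_MONTHS('2018-12-31', 0))` resolves to the last day of December 2018; the expected value can be cross-checked outside Spark with java.time (illustration only):

```scala
import java.time.YearMonth

// Last day of December 2018, matching LAST_DAY('2018-12-31')
val lastDay = YearMonth.of(2018, 12).atEndOfMonth()
println(lastDay)  // 2018-12-31
```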

