I know this has been asked several times, and I searched the suggested questions and answers. I also read the Databricks documentation and made several attempts, but I just don't get the desired result.

Given:

+----------------------------+
|        data_type           |
+----------------------------+
|        timestamp           |
+----------------------------+

Ex:

+------------------------------+
|        date_value            |
+------------------------------+
| 2017-11-22T00:00:00.000+0000 |
+------------------------------+

Desired outcome:

+----------------------------+
|        date_value          |
+----------------------------+
|        22.11.2017          |
+----------------------------+

What I tried and failed so far:

  date_format(date_value, 'dd.mm.yyyy') AS MFGDate,

  to_date(date_value) AS MFGDate,

  date(date_value) AS MFGDate

Result:

+------------+------------+------------+
|   MFGDate  |   MFGDate  |   MFGDate  |
+------------+------------+------------+
| 22.00.2017 | 2017-11-22 | 2017-11-22 |
+------------+------------+------------+

Here's the full query:

SELECT
  '01 FUV' AS Stage,
  d1.ps_name AS FUV,
  d1.ps_name AS LOT,
  d2.date_value AS MFGDate
FROM
  table d1
  INNER JOIN table d2 ON d1.ag_id = d2.ag_id
  AND d1.ag_path = d2.ag_path
  AND d1.ps_name = d2.ps_name
WHERE
  d1.AG_PATH LIKE 'sourcepath'

Result:

+--------+--------+--------+------------------------------+
| Stage  | FUV    | LOT    | MFGDate                      |
+--------+--------+--------+------------------------------+
| 01 FUV | A1U079 | A1U079 | 2019-03-27T00:00:00.000+0000 |
| 01 FUV | A1U255 | A1U255 | 2019-06-22T00:00:00.000+0000 |
| 01 FUV | A1U255 | A1U255 | 2019-11-10T00:00:00.000+0000 |
+--------+--------+--------+------------------------------+

How do I get the value of column MFGDate in a format like 22.11.2017?

3 Answers

You can use the built-in function date_format; what you were missing was the correct pattern symbol. The datetime pattern documentation explains the symbol usage.

Typical Usage

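from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

input_list = [
    (1, "2019-11-07 05:30:00"),
    (2, "2019-07-09 15:30:00"),
    (3, "2019-12-09 10:30:00"),
    (4, "2019-02-11 14:30:00"),
]

sparkDF = spark.createDataFrame(input_list, ['id', 'date'])

# Parse the string column into a proper timestamp
sparkDF = sparkDF.withColumn('date', F.to_timestamp(F.col('date'), 'yyyy-MM-dd HH:mm:ss'))

# 'MM' is month of year; lowercase 'mm' would be minute of hour
sparkDF = sparkDF.withColumn('date_formated', F.date_format(F.col('date'), 'dd.MM.yyyy'))

sparkDF.show()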

+---+-------------------+-------------+
| id|               date|date_formated|
+---+-------------------+-------------+
|  1|2019-11-07 05:30:00|   07.11.2019|
|  2|2019-07-09 15:30:00|   09.07.2019|
|  3|2019-12-09 10:30:00|   09.12.2019|
|  4|2019-02-11 14:30:00|   11.02.2019|
+---+-------------------+-------------+

3 Comments

Doesn't work. If I change to "F.date_format(d2.date_value, 'dd/mm/yyyy')" I get the error: " Error in SQL statement: AnalysisException: Undefined function: 'date_format'. This function is neither a registered temporary function nor a permanent function registered in the database 'F'.;" If I query: "Show functions", it's included in the list but when using it, it's undefined. I'm looking for a solution in plain sql. Thanks for your answer though. Or is this runtime related? I'm using 5.5 LTS (includes Apache Spark 2.4.3, Scala 2.11).
By the looks of it, it's runtime related. It would be better if you updated the question with the import statements and the code snippets you are using to run this. I'll update the answer with a SQL version as well.
Added the full query with some data; that's all I have. I don't import anything, it's just SQL.

You were very close. You can use the built-in function date_format; the reason you got "00" for the month is that your format pattern was incorrect. You specified "mm", which returns the minute of the hour; you should have specified "MM", which returns the month of the year. So the correct code is:

date_format(date_value, 'dd.MM.yyyy') AS MFGDate

Documentation here: https://spark.apache.org/docs/latest/sql-ref-datetime-pattern.html
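
To make the "mm" versus "MM" distinction concrete, here is a minimal plain-Python sketch. It does not call Spark at all: `FIELDS` and `render_pattern` are hypothetical names made up for this illustration, and only the six pattern letters that appear in this thread are handled.

```python
from datetime import datetime

# Hypothetical lookup table: a few Spark-style pattern letters and the
# datetime field each one stands for. This is NOT Spark's implementation.
FIELDS = {
    "yyyy": lambda ts: f"{ts.year:04d}",
    "MM": lambda ts: f"{ts.month:02d}",   # month of year
    "dd": lambda ts: f"{ts.day:02d}",     # day of month
    "HH": lambda ts: f"{ts.hour:02d}",    # hour of day
    "mm": lambda ts: f"{ts.minute:02d}",  # minute of hour
    "ss": lambda ts: f"{ts.second:02d}",  # second of minute
}


def render_pattern(ts: datetime, pattern: str) -> str:
    """Replace each known pattern letter group with the matching field value."""
    out = pattern
    for symbol, field in FIELDS.items():
        out = out.replace(symbol, field(ts))
    return out


# The sample value from the question: midnight on 22 November 2017.
sample = datetime(2017, 11, 22, 0, 0, 0)

wrong = render_pattern(sample, "dd.mm.yyyy")  # minutes land in the middle slot
right = render_pattern(sample, "dd.MM.yyyy")  # month lands in the middle slot
print(wrong)  # 22.00.2017
print(right)  # 22.11.2017
```

Every timestamp in the question sits at midnight, so the minute field is always 00, which is why the month position showed 00 for every row.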

Try this:

From an integer (presumably epoch milliseconds, hence the division by 1000): SELECT FROM_UNIXTIME(timestamp_column / 1000) AS converted_date FROM your_table;

From a string: SELECT string_timestamp, TO_TIMESTAMP(string_timestamp, 'yyyy-MM-dd HH:mm:ss') AS timestamp_with_time FROM your_table;
