extract(field FROM source) - Extracts a part of the date/timestamp or interval source
This function is available from Spark 3.0.0 onward. It is not available in earlier versions of Spark, so calling extract there fails with the exception below.
scala> spark.sql("select extract(year from datecol) as dt from tmp").show(false)
org.apache.spark.sql.catalyst.parser.ParseException:
mismatched input 'from' expecting {')', ','}(line 1, pos 20)
== SQL ==
select extract(year from datecol) as dt from tmp
--------------------^^^
at org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:217)
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:114)
at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:48)
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(ParseDriver.scala:68)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:623)
... 50 elided
On versions before 3.0.0, use the year function instead.
spark.sql("select year(datecol) as dt from tmp").show(false)
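If the same query has to run on both older and newer clusters, one option is to build the date-part expression from the running Spark version. The following is only a rough sketch: DatePartCompat and datePartExpr are hypothetical names, not part of Spark, and the field-to-function mapping covers only a handful of fields that have long-standing built-in equivalents (year, month, dayofmonth, hour, minute, second).

```scala
// Sketch only: DatePartCompat / datePartExpr are hypothetical helpers, not a Spark API.
object DatePartCompat {
  // extract fields mapped to built-in SQL functions that exist in Spark 2.x.
  private val legacyFunctions = Map(
    "year"   -> "year",
    "month"  -> "month",
    "day"    -> "dayofmonth",
    "hour"   -> "hour",
    "minute" -> "minute",
    "second" -> "second"
  )

  // Returns a SQL expression string for the given field and column.
  // sparkVersion is expected to look like "2.4.8" or "3.0.0" (e.g. spark.version).
  def datePartExpr(field: String, column: String, sparkVersion: String): String = {
    val major = sparkVersion.takeWhile(_.isDigit).toInt
    val key = field.toLowerCase
    if (major >= 3) {
      // Spark 3.0.0+ parses extract(field FROM source).
      s"extract($key FROM $column)"
    } else {
      // Older versions: fall back to the dedicated function.
      legacyFunctions.get(key) match {
        case Some(fn) => s"$fn($column)"
        case None =>
          throw new IllegalArgumentException(s"Field not mapped in this sketch: $field")
      }
    }
  }
}

// Usage in spark-shell (assumes the tmp table from above is registered):
// val expr = DatePartCompat.datePartExpr("year", "datecol", spark.version)
// spark.sql(s"select $expr as dt from tmp").show(false)
```

The helper only builds a string, so it can be checked without a cluster; the column name is interpolated as-is, so it should come from trusted code rather than user input.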
select year(datecol) as dt