Given the dataframe as
+----+------------------+-----+
|ID |timestamp |value|
+----+------------------+-----+
|ID-1|8/23/2017 14:48:13|4.56 |
|ID-2|8/23/2017 6:5:21 |5.92 |
|ID-3|8/23/2017 5:49:13 |6.0 |
+----+------------------+-----+
and current time as
2017-08-23 14:53:33
you can do the following:
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types.TimestampType

df.withColumn("timestamp", unix_timestamp($"timestamp", "MM/dd/yyyy HH:mm:ss"))
  .filter((unix_timestamp(current_timestamp()) - $"timestamp") / 60 < 10)
  .select($"ID", $"timestamp".cast(TimestampType), $"value")
where unix_timestamp converts the timestamp string to epoch seconds (a long) using the given pattern, and current_timestamp() returns the current date and time. Subtracting the two and dividing by 60 gives the elapsed minutes, so the filter keeps only rows from the last 10 minutes.
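The filter itself is just epoch-second arithmetic, independent of Spark. Here is a minimal pure-Python sketch of the same 10-minute check against the example data (the names `rows`, `now`, and `within_minutes` are illustrative, and the current time is fixed to the value from the example):

```python
from datetime import datetime

# the example dataframe as plain tuples
rows = [
    ("ID-1", "8/23/2017 14:48:13", 4.56),
    ("ID-2", "8/23/2017 6:5:21", 5.92),
    ("ID-3", "8/23/2017 5:49:13", 6.0),
]

# fix "now" to the current time from the example instead of the real clock
now = datetime(2017, 8, 23, 14, 53, 33).timestamp()

def within_minutes(ts: str, minutes: int = 10) -> bool:
    # parse the string to epoch seconds, mirroring unix_timestamp(...)
    epoch = datetime.strptime(ts, "%m/%d/%Y %H:%M:%S").timestamp()
    # difference in seconds divided by 60 gives elapsed minutes
    return (now - epoch) / 60 < minutes

kept = [r for r in rows if within_minutes(r[1])]
print(kept)  # only ID-1 (14:48:13) is within 10 minutes of 14:53:33
```

The same condition, row by row, is what Spark evaluates in the `.filter(...)` call above.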
The final dataframe should be
+----+---------------------+-----+
|ID |timestamp |value|
+----+---------------------+-----+
|ID-1|2017-08-23 14:48:13.0|4.56 |
+----+---------------------+-----+
I hope the answer is helpful