I'm new to Scala and am having a hard time working with a simple dataset in Spark. I want to review the following dataset ordered by eventType and crow, but I can't get it to sort in descending order. I also want to read out just one eventType at a time.
When I try
dataset.orderBy("eventType")
it works, but if I add '.desc' in either of the following ways, it fails:
scala> setB.orderBy("eventType").desc
<console>:32: error: value desc is not a member of
org.apache.spark.sql.Dataset[org.apache.spark.sql.Row]
setB.orderBy("eventType").desc
or
scala> dataset.orderBy("eventType".desc)
<console>:32: error: value desc is not a member of String
dataset.orderBy("eventType".desc)
I am also trying to use filter, but nothing I try works either. For example: dataset.filter("eventType"="agg%")
Sample dataset:
+----------------+------------------------------------------------------------------------------------+-----------------------------------+-------------+----------------+----+
|deadletterbucket|split |eventType |clientVersion|dDeviceSurrogate|crow|
+----------------+------------------------------------------------------------------------------------+-----------------------------------+-------------+----------------+----+
|event_failure |instance type (null) does not match any allowed primitive type (allowed: ["object"])|aggregate_event.app_launches |4.3.0.108 |1 |3 |
|event_failure |instance type (null) does not match any allowed primitive type (allowed: ["object"])|aggregate_event.app_launches |5.3.0.10 |1 |11 |
|event_failure |instance type (null) does not match any allowed primitive type (allowed: ["object"])|aggregate_event.app_launches |5.9.1.10 |3 |11 |
|event_failure |instance type (null) does not match any allowed primitive type (allowed: ["object"])|aggregate_event.app_launches |5.7.0.1 |3 |15 |
|event_failure |instance type (null) does not match any allowed primitive type (allowed: ["object"])|aggregate_event.app_launches |5.5.0.5 |6 |16 |
|event_failure |instance type (null) does not match any allowed primitive type (allowed: ["object"])|aggregate_event.app_launches |4.0.0.62 |7 |26 |
|event_failure |instance type (null) does not match any allowed primitive type (allowed: ["object"])|aggregate_event.app_launches |4.6.4.6 |9 |31 |
|event_failure |instance type (null) does not match any allowed primitive type (allowed: ["object"])|aggregate_event.app_network_traffic|7.12.0.113 |1 |1 |
|event_failure |instance type (null) does not match any allowed primitive type (allowed: ["object"])|aggregate_event.app_network_traffic|6.3.2.15 |1 |2 |
|event_failure |instance type (null) does not match any allowed primitive type (allowed: ["object"])|aggregate_event.app_network_traffic|5.1.2.10 |1 |3 |
Ideally, I am trying to get something like the following to work, returning the three app_launches rows with the highest crow:
dataset.orderBy("crow").desc.filter("eventType"="%app_launches").show(3,false)
|event_failure |instance type (null) does not match any allowed primitive type (allowed: ["object"])|aggregate_event.app_launches |4.6.4.6 |9 |31 |
|event_failure |instance type (null) does not match any allowed primitive type (allowed: ["object"])|aggregate_event.app_launches |4.0.0.62 |7 |26 |
|event_failure |instance type (null) does not match any allowed primitive type (allowed: ["object"])|aggregate_event.app_launches |5.5.0.5 |6 |16 |
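For reference, both errors above come from calling desc on something that is not a Column: orderBy returns a Dataset, and "eventType" is a plain String. A minimal sketch of the intended query, assuming Spark 2.x or later and a cut-down stand-in for the dataset (the rows are copied from the sample, but the reduced column set and the name top are only for illustration), might look like this:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// In spark-shell a session already exists; getOrCreate() reuses it.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("orderby-filter-sketch")
  .getOrCreate()
import spark.implicits._

// Illustrative stand-in for the real dataset (subset of rows and columns).
val dataset = Seq(
  ("aggregate_event.app_launches", "4.3.0.108", 1, 3),
  ("aggregate_event.app_launches", "5.3.0.10", 1, 11),
  ("aggregate_event.app_launches", "5.5.0.5", 6, 16),
  ("aggregate_event.app_launches", "4.0.0.62", 7, 26),
  ("aggregate_event.app_launches", "4.6.4.6", 9, 31),
  ("aggregate_event.app_network_traffic", "7.12.0.113", 1, 1)
).toDF("eventType", "clientVersion", "dDeviceSurrogate", "crow")

// desc and like are methods on Column, so wrap the column name with col(...).
val top = dataset
  .filter(col("eventType").like("%app_launches")) // SQL-style wildcard match
  .orderBy(col("crow").desc)                      // descending sort on crow
  .limit(3)

top.show(false)
```

Sorting on both columns would follow the same pattern, e.g. orderBy(col("eventType").desc, col("crow").desc), and filter also accepts a SQL expression string such as "eventType like 'agg%'".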