I am new to Spark DataFrames. I am trying to use the pivot method (Spark version 2.x) and I am running into the following error:
Py4JError: An error occurred while calling o387.pivot. Trace: py4j.Py4JException: Method pivot([class java.lang.String, class java.lang.String]) does not exist
Even though I use first as the aggregation function here, I do not actually need to apply any aggregation.
My dataframe looks like this:
+-----+-----+----------+-----+
| name|value|      date| time|
+-----+-----+----------+-----+
|name1|100.0|2017-12-01|00:00|
|name1|255.5|2017-12-01|00:15|
|name1|333.3|2017-12-01|00:30|
+-----+-----+----------+-----+
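In case it helps to reproduce, the sample data can be created roughly like this (a minimal sketch; I am assuming date and time are stored as plain strings):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Sample rows matching the dataframe shown above
# (storing date and time as strings is an assumption)
df = spark.createDataFrame(
    [("name1", 100.0, "2017-12-01", "00:00"),
     ("name1", 255.5, "2017-12-01", "00:15"),
     ("name1", 333.3, "2017-12-01", "00:30")],
    ["name", "value", "date", "time"])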
Expected:
+-----+----------+-----+-----+-----+
| name|      date|00:00|00:15|00:30|
+-----+----------+-----+-----+-----+
|name1|2017-12-01|100.0|255.5|333.3|
+-----+----------+-----+-----+-----+
This is what I am trying:
df = df.groupBy(["name","date"]).pivot(pivot_col="time",values="value").agg(first("value")).show
What is my mistake here?