I want to make a summary of a larger table using SQL query with sqldf package in R.
The larger table iterationresults has following columns: Truck_ID, Latitude, Longitude, Speed, Idle_Events, Date_Time, state, od, trip_id.
Sample table
Truck_ID Latitude Longitude Speed Idle_Events Date_Time state od trip_id
TTI 039 31.70117 -106.3685 0 NA 2017-03-29 14:37:30 stop 0 217
TTI 039 31.70119 -106.3685 0 0 2017-03-29 14:37:31 stop 0 217
TTI 039 31.70120 -106.3685 0 0 2017-03-29 14:37:32 stop 0 217
TTI 039 31.70120 -106.3685 0 0 2017-03-29 14:37:33 stop 0 217
TTI 039 31.70119 -106.3685 0 1 2017-03-29 14:37:34 stop 0 217
TTI 039 31.70120 -106.3685 0 1 2017-03-29 14:37:35 stop 0 217
TTI 039 31.70120 -106.3685 0 1 2017-03-29 14:37:36 stop 0 217
TTI 039 31.70121 -106.3685 0 1 2017-03-29 14:37:37 stop 0 217
TTI 039 31.70121 -106.3685 0 1 2017-03-29 14:37:38 stop 0 217
TTI 039 31.70122 -106.3685 0 1 2017-03-29 14:37:39 stop 0 217
The row count is 49258. I need to make a summary table based on trip_id. I am trying to run the following SQL query with sqldf package in R to make a new summary table trips.
SQL <- "SELECT Avg(speed) as [Average Speed]
FROM iterationresults
GROUP BY trip_id
ORDER BY trip_id"
trips <-sqldf(SQL)
I am getting a error saying:
Error in rsqlite_bind_rows(rs@ptr, value) : Parameter 6 does not have length 49258.
I am not sure whats wrong here. I am new to using this package.
trips?dput(iterationresults)and share output.dput(iteration results).Names = c("Truck_ID", "Latitude", "Longitude", "Speed", "Idle_Events", "Date_Time", "state", "od", "trip_id"), row.names = c(NA, -49258L), class = c("data.table", "data.frame"), .internal.selfref = <pointer: 0x0000000002590788>)mysqlandsql-serverbut you taged both