I want to summarize a larger table using an SQL query with the sqldf package in R.

The larger table, iterationresults, has the following columns: Truck_ID, Latitude, Longitude, Speed, Idle_Events, Date_Time, state, od, trip_id.

Sample table

Truck_ID Latitude Longitude Speed Idle_Events Date_Time           state od trip_id
TTI 039  31.70117 -106.3685 0     NA          2017-03-29 14:37:30 stop  0  217
TTI 039  31.70119 -106.3685 0     0           2017-03-29 14:37:31 stop  0  217
TTI 039  31.70120 -106.3685 0     0           2017-03-29 14:37:32 stop  0  217
TTI 039  31.70120 -106.3685 0     0           2017-03-29 14:37:33 stop  0  217
TTI 039  31.70119 -106.3685 0     1           2017-03-29 14:37:34 stop  0  217
TTI 039  31.70120 -106.3685 0     1           2017-03-29 14:37:35 stop  0  217
TTI 039  31.70120 -106.3685 0     1           2017-03-29 14:37:36 stop  0  217
TTI 039  31.70121 -106.3685 0     1           2017-03-29 14:37:37 stop  0  217
TTI 039  31.70121 -106.3685 0     1           2017-03-29 14:37:38 stop  0  217
TTI 039  31.70122 -106.3685 0     1           2017-03-29 14:37:39 stop  0  217

The table has 49258 rows. I need to build a summary table grouped by trip_id. I am trying to run the following SQL query with the sqldf package in R to create a new summary table, trips.

SQL <- "SELECT Avg(speed) as [Average Speed]
        FROM iterationresults
        GROUP BY trip_id
        ORDER BY trip_id"
trips <- sqldf(SQL)

I am getting an error saying:

Error in rsqlite_bind_rows(rs@ptr, value) : Parameter 6 does not have length 49258.

I am not sure what's wrong here. I am new to this package.

  • Show us the data? do a print of trips? Commented Jan 30, 2018 at 15:51
  • nothing wrong with your query. dput(iterationresults) and share output. Commented Jan 30, 2018 at 16:06
  • dput(iteration results) .Names = c("Truck_ID", "Latitude", "Longitude", "Speed", "Idle_Events", "Date_Time", "state", "od", "trip_id"), row.names = c(NA, -49258L), class = c("data.table", "data.frame"), .internal.selfref = <pointer: 0x0000000002590788>) Commented Jan 30, 2018 at 16:16
  • There is a big difference between mysql and sql-server, but you tagged both. Commented Jan 30, 2018 at 16:33
  • Hard to tell based on the info given, but it looks like the number of rows in your summary table is 49258 based on the error but your SQL Query result has fewer rows because of the aggregation function, which will throw an error when using the '<-' assignment operator to create a new column on a data frame Commented Jan 30, 2018 at 16:55

1 Answer

This happens because the data.frame contains a POSIXlt column (Date_Time). I started seeing the same error after adding a POSIXlt column to my own data.frame.

I am not sure whether it's a bug or a "feature", but I found this bug report, which explains it: https://github.com/r-dbi/RSQLite/issues/246

I posted there with a follow-up question about the problem.
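
A possible workaround, assuming the POSIXlt column really is the cause: convert Date_Time to POSIXct before calling sqldf. A POSIXlt value is a list of date-time components rather than one value per row, whereas POSIXct is a single numeric vector. The sketch below uses a small made-up stand-in for iterationresults; the three rows and the avg_speed alias are my own and not from the question.

```r
library(sqldf)

# Made-up stand-in for iterationresults (assumption: the real table has the
# same column types; only three of its columns are reproduced here).
iterationresults <- data.frame(
  trip_id = c(217, 217, 218),
  Speed   = c(0, 10, 20)
)
# Assigning with $<- keeps the POSIXlt class
# (data.frame() itself would convert it to POSIXct).
iterationresults$Date_Time <- as.POSIXlt(
  c("2017-03-29 14:37:30", "2017-03-29 14:37:31", "2017-03-29 14:37:32"),
  tz = "UTC"
)

# Check the leading class of every column; Date_Time reports POSIXlt here.
print(sapply(iterationresults, function(col) class(col)[1]))

# Convert the column to POSIXct before handing the table to sqldf.
iterationresults$Date_Time <- as.POSIXct(iterationresults$Date_Time)

trips <- sqldf("SELECT trip_id, AVG(Speed) AS avg_speed
                FROM iterationresults
                GROUP BY trip_id
                ORDER BY trip_id")
print(trips)
```

Selecting trip_id alongside the average makes each summary row identifiable. If the date column is not needed in the query at all, copying only the required columns into a separate data.frame and querying that copy should avoid the problem as well.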
