1

I'm pretty sure a similar query has been asked already. However my question is specific to when I connect an R session to my SQL Server database.

accept<-x%>% 

  group_by(SiteID, MachineID, LocationID) %>%

  filter(DateTime>="2019-01-1" & DateTime<"2019-12-31") %>%

  summarise(n=(sum(TenSecCount))) %>%

  collect()

When I try to collect the data into a data frame, I get the following error --

Arithmetic overflow error converting expression to data type int.
'SELECT "SiteID", "MachineID", "LocationID", CAST("n" AS VARCHAR(MAX)) AS "n"
FROM (SELECT TOP 100 PERCENT "SiteID", "MachineID", "LocationID", (SUM("TenSecCount")) AS "n"
FROM (SELECT TOP 100 PERCENT *
FROM (SELECT TOP 100 PERCENT "MachineID", "OutletID", "TenSecCount"

Any workarounds for this?

2
  • Do you DateTime as POSIXct class Try filter(DateTime>=as.Date("2019-01-01") & DateTime< as.Date("2019-12-31")) Commented Jun 8, 2020 at 21:41
  • What is the format of TenSecCount? If this column is not of type int then attempting to sum it is the likely cause of the error. Commented Jun 9, 2020 at 1:10

1 Answer 1

1

This error arises when you collect the data into R, because it is only at collection that your remote SQL table is evaluated.

dbplyr defines your remote table by the SQL query that would return your results. Until you ask for results to be returned, your remote table definition is not much different from an SQL script waiting to be run.

When you request results from that table, in your case using collect, the sql code is executed on the server and the results returned to R. This means you can have an invalid remote table definition, and not know it until you execute it. E.g.:

remote_table <- server_df %>% 
  group_by(SiteID, MachineID, LocationID) %>%
  filter(DateTime>="2019-01-1" & DateTime<"2019-12-31") %>%
  summarise(n=(sum(TenSecCount)))
# no error because all we have done is define an sql query and store it in remote_table

# review underlying sql query
show_query(remote_table)
# if you copy & paste this query and try to run it directly on the server it will error

# attempt to collect data
local_table <- remote_table %>% collect()
# error occurs on evaluation

You can tell the error arose when the sql is evaluated because R returned to you the sql code that caused the error and the sql error message. "Arithmetic overflow error converting expression to data type int." is an SQL error, not an R error. See this question for ways to solve it.

Hint, probably something like:

remote_table <- server_df %>% 
  mutate(TenSecCount = CAST(TenSecCount AS BIGINT)) %>% # additional step changing data type
  group_by(SiteID, MachineID, LocationID) %>%
  filter(DateTime>="2019-01-1" & DateTime<"2019-12-31") %>%
  summarise(n=(sum(TenSecCount)))
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.