I approach this using a combination of the dbplyr and DBI packages.
The idea is to render the SQL query that defines the dbplyr table and embed it in an INSERT INTO statement, which can then be executed on the database:
append_database_table <- function(db_connection, db, schema, tbl_name, table_to_append) {
  # render the query that defines the input table and wrap it in an INSERT statement
  query <- glue::glue("INSERT INTO {db}.{schema}.{tbl_name}\n{dbplyr::sql_render(table_to_append)}")
  # execute the insert on the database
  DBI::dbExecute(db_connection, as.character(query))
}
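For context, here is a rough sketch of how the connection and the lazy input table might be set up; the odbc driver string, server, schema, and table names below are placeholders of my own, not part of the original workflow:

library(DBI)
library(dplyr)

db_connection <- DBI::dbConnect(
  odbc::odbc(),
  Driver = "ODBC Driver 17 for SQL Server",  # placeholder driver
  Server = "my_server",                      # placeholder server
  Database = "db",
  Trusted_Connection = "Yes"
)

# a lazy dbplyr table pointing at an existing source table (placeholder names)
remote_table <- dplyr::tbl(
  db_connection,
  dbplyr::in_schema("my_schema", "source_table")
)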
Notes:
- This all assumes that the table you wish to append to already exists in the database. Different syntax is needed if you are creating a new table.
- dbplyr::sql_render returns the query that defines the input table.
- query is a text string of valid database code.
- I am using SQL Server syntax. You may need to adjust this if you are using a different flavor of SQL.
- The above is a minimal example; the full function I use for this contains additional checks and ensures column names are correct (a sketch of such a check follows these notes). You can find it here along with the other support functions I use for dbplyr work.
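As a rough illustration of the kind of column check that fuller version performs, here is a sketch (my own, assuming the connection can list fields on the target table and that colnames() works on the lazy input table):

check_columns_match <- function(db_connection, db, schema, tbl_name, table_to_append) {
  # columns of the existing target table in the database
  target_cols <- DBI::dbListFields(
    db_connection,
    DBI::Id(catalog = db, schema = schema, table = tbl_name)
  )
  # columns of the (possibly processed) dbplyr table to append
  input_cols <- colnames(table_to_append)
  if (!identical(sort(target_cols), sort(input_cols))) {
    stop("Column names of the input table do not match the target database table")
  }
  invisible(TRUE)
}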
This approach works even if your input table has some processing. Consider the following example:
prepared_table <- remote_table %>%
  filter(!is.na(col1)) %>%
  group_by(col1) %>%
  summarise(col2sum = sum(col2))
show_query(prepared_table)
This will return SQL code equivalent to:
SELECT col1, SUM(col2) AS col2sum
FROM remote_table
WHERE col1 IS NOT NULL
GROUP BY col1
Appending this using:
append_database_table(db_connection, "db", "my_schema", "existing_table", prepared_table)
Will create the following SQL query and execute it on the database:
INSERT INTO db.my_schema.existing_table
SELECT col1, SUM(col2) AS col2sum
FROM remote_table
WHERE col1 IS NOT NULL
GROUP BY col1
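If you want to confirm the append behaved as expected, one option is a quick row count on the target table afterwards; this snippet is only a suggestion and reuses the placeholder names from above:

dplyr::tbl(db_connection, dbplyr::in_schema("my_schema", "existing_table")) %>%
  dplyr::count() %>%
  dplyr::collect()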