
I have a large SQLite database with many tables. I have established a connection to this database in RStudio using the RSQLite and DBI packages. (I have named this connection db.)

library(RSQLite)
library(DBI)

At the moment I have to read in all the tables and assign them a name manually. For example:

country <- dbReadTable(db, "country")
date <- dbReadTable(db, "date")
#...and so on

As you can see, this becomes very time-consuming when there are many tables.

So I was wondering if it is possible to write a new function, or use existing functions (e.g. lapply()?), to do this more efficiently and speed up the process.

Any suggestions are much appreciated :)

2 Answers


Two mindsets:

  1. All tables/data into one named-list:

    alldat <- lapply(setNames(nm = dbListTables(db)), dbReadTable, conn = db)
    

    The benefit of this is that if the tables have similar meaning, you can use lapply to apply the same function to each (see the sketch after this list). Another benefit is that all data from one database are stored together.

    See How do I make a list of data frames? for working on a list-of-frames.

  2. If you want them as actual variables in the global (or enclosing) environment, then take the previous alldat, and

    ign <- list2env(alldat, envir = .GlobalEnv)
    

    The return value from list2env is the environment we passed to it, so it's not especially useful in this context (though it can be at other times). The only reason I capture it in ign is to reduce clutter on the console, which is minor. list2env works primarily by side effect, so the return value is not critical here.
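
A minimal sketch of the first mindset in action, using the alldat list from above (nrow and summary are just illustrative choices of function to apply):

# Apply the same function across every table in the list
sapply(alldat, nrow)     # row count per table
lapply(alldat, summary)  # column summaries per table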


1 Comment

perfect! exactly what I was looking for :)

You can use dbListTables() to generate a character vector of all the table names in your SQLite database, then use lapply() to import them into R efficiently. I would first check that all the tables in your database fit into memory.

Below is a reproducible example:

library(RSQLite)
library(DBI)

# Connect to an in-memory SQLite database and add two example tables
db <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(db, "mtcars", mtcars)
dbWriteTable(db, "iris", iris)

# Character vector of all table names in the database
db_tbls <- dbListTables(db)

# Read every table into a named list of data frames
tbl_list <- lapply(db_tbls, dbReadTable, conn = db)
tbl_list <- setNames(tbl_list, db_tbls)

dbDisconnect(db)

> lapply(tbl_list, head)
$iris
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa

$mtcars
   mpg cyl disp  hp drat    wt  qsec vs am gear carb
1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
6 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

2 Comments

Is there a way to do exactly the same, but only get the first few rows of each table?
one option is, instead of using dbReadTable in lapply(...), to use dbGetQuery with a SQL statement that fetches only N rows from each table
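
A minimal sketch of that suggestion, assuming the db connection is still open and the db_tbls vector from the answer above (n is an arbitrary preview size, and LIMIT is SQLite syntax; other backends may differ):

n <- 5  # hypothetical number of rows to preview per table
head_list <- lapply(db_tbls, function(tbl) {
  # Quote the table name defensively before building the query
  qtbl <- as.character(dbQuoteIdentifier(db, tbl))
  dbGetQuery(db, paste0("SELECT * FROM ", qtbl, " LIMIT ", n))
})
head_list <- setNames(head_list, db_tbls)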
