My goal is to create a list with each element containing a dataframe.
The dataframes are created by calling sqldf iteratively.

An example of what I want to do is this:
I have a vector names containing the names of my list.

> names
[1] "hello" "world"

The list is called L and has length length(names).
Right now, L looks like this:

> L
[[1]]
[1] 0

[[2]]
[1] 0  

I want it to look like:

> L
$hello
  Year Total
1 2000   100
2 2001   200

$world
  Year Total
1 2000   150
2 2001   250 

The first element L$hello is created by calling

names(L)[1] <- "hello"

L$hello <- sqldf("select Year, sum(case when names='hello' then Nums end) as Total from Data group by Year")

Similarly, the second element L$world is created by replacing "'hello'" in that function call with "'world'".

However, this is a big problem if I have a lot of names.

My attempt to iterate this is here:

for (i in names) {

    j=j+1
    names(L)[j] <- i
    L[[j]] <- sqldf("select Year, sum(case when names='names[names == i]' then Nums end) as 'Total' from Data group by Year")

}

The problem is definitely in the third line in the for loop where I have the names='names[names == i]' argument. How would I amend this?

I think it boils down to: How do I "paste" a string into a function call?

e.g. Instead of doing:

sqldf("select Year, sum(case when names='hello' then Nums end) as 'Total' from Data group by Year")

if I have a variable x where x <- "hello", how would I "paste" x into the sqldf function?

2 Answers

The sqldf package automatically loads the gsubfn package which provides fn$ for string interpolation. Preface sqldf with fn$ and then in the SQL string use

  1. $ followed by a variable name (e.g. $a) for a straight substitution of that variable's value, or
  2. backquotes around R code to run that code and replace the backquoted text with its output.

Note that fn$ is a general facility that can preface just about any function to pre-process its arguments -- it is not specific to sqldf.
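As a small illustration of that generality, fn$ can be put in front of identity to get the interpolated string back without running any SQL. This is only a sketch: the variable names q1 and q2 are made up for the example, and gsubfn is attached explicitly so the snippet runs without sqldf.

```r
library(gsubfn)  # also attached automatically by library(sqldf)

a <- 3

# $a is replaced by the value of a
q1 <- fn$identity("select * from BOD where Time > $a")
q1
## [1] "select * from BOD where Time > 3"

# the R code between backquotes is evaluated and its result substituted
q2 <- fn$identity("select * from BOD where Time > `a+1`")
q2
## [1] "select * from BOD where Time > 4"
```

This can be a convenient way to check what a query will look like before prefacing sqldf itself with fn$.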

Here are some examples. Note that BOD and iris are built into R.

library(sqldf)

a <- 3
fn$sqldf("select * from BOD where Time > $a")
##   Time demand
## 1    4   16.0
## 2    5   15.6
## 3    7   19.8

fn$sqldf("select * from BOD where Time > `a+1`")
##   Time demand
## 1    5   15.6
## 2    7   19.8

irisType <- "setosa"
fn$sqldf("select sum([Petal.Length]) from iris where Species = '$irisType'")
##   sum([Petal.Length])
## 1                73.1

To see the final string that is passed to sqldf, add the argument verbose = TRUE to the sqldf call.
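Applied to the loop in the question, the same idea might look like the sketch below. The Data frame here is hypothetical sample data invented to match the desired output shown in the question, and nms stands in for the question's names vector; the column names Year, names and Nums are taken from the question's query.

```r
library(sqldf)

# Hypothetical sample data, invented to match the desired output in the question
Data <- data.frame(
  Year  = c(2000, 2001, 2000, 2001),
  names = c("hello", "hello", "world", "world"),
  Nums  = c(100, 200, 150, 250)
)
nms <- c("hello", "world")  # plays the role of the question's `names` vector

# setNames() makes lapply() return a list named after each element,
# so no counter or names(L)[j] <- i bookkeeping is needed
L <- lapply(setNames(nms, nms), function(nm) {
  fn$sqldf("select Year, sum(case when names = '$nm' then Nums end) as Total
            from Data group by Year")
})

names(L)   # the list elements are named after nms
L$hello    # one data frame per name, with columns Year and Total
```

Using lapply over a named vector avoids pre-allocating L and assigning names one at a time, but a for loop with fn$sqldf in its body should work in the same way.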

You can use glue to build the query string and map over your names vector:

library(sqldf)
library(glue)
library(purrr)

map(setNames(my.names, my.names), ~
    "select sum(case when a = '{.x}' then b end) as Total 
     from df" %>% 
      glue %>% 
      sqldf)

# $`hello`
#   Total
# 1    24
# 
# $world
#   Total
# 1    31

You can do this without glue or purrr, but to my eye it looks a little uglier:

lapply(setNames(my.names, my.names), function(x)
  sqldf(paste0("select sum(case when a = '", x, "' then b end) as Total 
                from df")))
# $`hello`
#   Total
# 1    24
# 
# $world
#   Total
# 1    31

Example data used in this answer:

my.names <- c("hello", "world")
set.seed(1)
df <- data.frame(a = sample(my.names, 10, replace = TRUE), b = sample(1:10))
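
A further base-R variant of the paste0 approach is to build all the query strings up front with sprintf, which is vectorised over its arguments. This is only a sketch: template and queries are names made up for the example, and the sqldf call is left as a comment so the string-building step can be inspected on its own.

```r
my.names <- c("hello", "world")

# %s marks where each name is inserted into the query
template <- "select sum(case when a = '%s' then b end) as Total from df"

# One query string per name, named after the value it was built from
queries <- setNames(sprintf(template, my.names), my.names)

queries[["hello"]]
## [1] "select sum(case when a = 'hello' then b end) as Total from df"

# Each string could then be run against df, for example:
# lapply(queries, function(q) sqldf(q))
```

Note that with any of these string-pasting approaches, a value that itself contains a single quote would produce invalid SQL, so this assumes the names are plain strings such as "hello" and "world".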
