
I'm trying to draw the same graph for multiple dataframes that share the same variables but hold different values. I have n dataframes called df_1, df_2, ..., df_n, and my code looks like this:

#Create dataframes(In this example n = 3)
df_1 <- data.frame(a1 = 1:1000,
                   b1 = 1:1000)  
df_2 <- data.frame(a1 = 1:1000,
                   b1 = 1:1000)
df_3 <- data.frame(a1 = 1:1000,
                   b1 = 1:1000)

##Store dataframes in list
example.list<-lapply(1:3, function(x) eval(parse(text=paste0("df_", x)))) #In order to store all datasets in one list using their name
names(example.list)<-lapply(1:3, function(x) paste0("df_", x))

#Graph and save for each dataframe
for (i in example.list){
  benp <-  ggplot(i, aes(x=b1)) + 
    geom_histogram(fill="steelblue", aes(y=..density.., alpha=..count..), bins=60) + 
    labs(title="Beneficios", subtitle="") + ylab("Densidad") + 
    xlab("Beneficios ($millones)") + 
    geom_vline(aes(xintercept=mean(b1)), color="red4",linetype="dashed") +
    theme(legend.position = "none") + 
    annotate("text", x= mean(b1), y=0, label=round(mean(b1), digits = 2), 
             colour="red4", size=3.5, vjust=-1.5, hjust=-0.5) 
  ggsave(benp, file=paste0(i,"_histogram.png"))
}   

I'm getting the error message "Error in mean(b1): object b1 not found". I don't know how to tell R that b1 comes from dataframe i. Does anybody know what's wrong with my code, or whether there is an easier way to plot over multiple dataframes? Thanks in advance!

  • Hi Andres, could you please create a reproducible example to share? Commented Feb 4, 2021 at 21:51
  • Hi Rex. I changed my post so it can be a reproducible example Commented Feb 4, 2021 at 22:23
  • No worries, @Andres. Please see my answer below and accept it if it works for you. Commented Feb 5, 2021 at 1:45

2 Answers


Your problem wasn't in the iteration over the list of dataframes; it was in the use of b1 within annotate(). Unlike aes(), annotate() does not look up its arguments in the plot's data, so a bare b1 is searched for in the calling environment and is not found. Here, I extract the current dataframe on each iteration and refer to the column explicitly as df_i$b1. There is probably a nicer way of doing this, though. Also, ggsave() needs the name of the list item to build the file name, not the dataframe itself.

library(tidyverse)

#Create dataframes(In this example n = 3)
df_1 <- data.frame(a1 = 1:1000,
                   b1 = 1:1000)  
df_2 <- data.frame(a1 = 1:1000,
                   b1 = 1:1000)
df_3 <- data.frame(a1 = 1:1000,
                   b1 = 1:1000)

##Store dataframes in list
example.list<-lapply(1:3, function(x) eval(parse(text=paste0("df_", x)))) #In order to store all datasets in one list using their name
names(example.list)<-lapply(1:3, function(x) paste0("df_", x))

#Graph and save for each dataframe

for (i in seq_along(example.list)){  # seq_along() is safe even if the list is empty
  df_i <- example.list[[i]]
  benp <-  
    df_i %>%
    ggplot(aes(x=b1)) + 
    geom_histogram(fill="steelblue", aes(y=..density.., alpha=..count..), bins=60) + 
    labs(title="Beneficios", subtitle="") + ylab("Densidad") + 
    xlab("Beneficios ($millones)") + 
    geom_vline(aes(xintercept=mean(b1)), color="red4",linetype="dashed") +
    theme(legend.position = "none") + 
    annotate("text", x= mean(df_i$b1), y=0, label=round(mean(df_i$b1), digits = 2), 
             colour="red4", size=3.5, vjust=-1.5, hjust=-0.5) 
  ggsave(filename=paste0(names(example.list)[i],"_histogram.png"), plot=benp)
}
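
The loop above can also be wrapped in a small helper function, which may be the "nicer way" mentioned earlier. This is only a minimal sketch, not the code from the answer: plot_hist() and out_dir are hypothetical names, mget() is swapped in for eval(parse()), and after_stat() assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

# Same toy data as above
df_1 <- data.frame(a1 = 1:1000, b1 = 1:1000)
df_2 <- data.frame(a1 = 1:1000, b1 = 1:1000)
df_3 <- data.frame(a1 = 1:1000, b1 = 1:1000)

# mget() looks up several objects by name and returns a named list
example.list <- mget(paste0("df_", 1:3))

# Hypothetical helper: builds the histogram for one dataframe
plot_hist <- function(df) {
  m <- mean(df$b1)  # computed once from the dataframe itself
  ggplot(df, aes(x = b1)) +
    geom_histogram(aes(y = after_stat(density), alpha = after_stat(count)),
                   fill = "steelblue", bins = 60) +
    geom_vline(xintercept = m, color = "red4", linetype = "dashed") +
    annotate("text", x = m, y = 0, label = round(m, digits = 2),
             colour = "red4", size = 3.5, vjust = -1.5, hjust = -0.5) +
    labs(title = "Beneficios", subtitle = "",
         x = "Beneficios ($millones)", y = "Densidad") +
    theme(legend.position = "none")
}

# One plot per dataframe; the list names carry over to the result
plots <- lapply(example.list, plot_hist)

# Save each plot under the name of its dataframe
out_dir <- tempdir()  # assumption: replace with your own output folder
for (nm in names(plots)) {
  ggsave(filename = file.path(out_dir, paste0(nm, "_histogram.png")),
         plot = plots[[nm]], width = 7, height = 5)
}
```

Because the mean is computed inside the helper and passed in as a plain number, neither geom_vline() nor annotate() has to find b1 on its own.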

The get() function is what you are looking for: it returns the object whose name is given as a string, so you can retrieve a dataframe by its name.

get() Return the Value of a Named Object

For example:

x <- "iris"
summary(get(x))
#  Sepal.Length    Sepal.Width     Petal.Length    Petal.Width          Species  
# Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100   setosa    :50  
# 1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300   versicolor:50  
# Median :5.800   Median :3.000   Median :4.350   Median :1.300   virginica :50  
# Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199                  
# 3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800                  
# Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500  

Your example:

#Store dataframes in list
graph.list<-lapply(1:10, function(x) eval(parse(text=paste0("data_new", x)))) #In order to store all datasets in one list using their name
names(graph.list)<-lapply(1:10, function(x) paste0("data_new", x))

#Graph and save for each dataframe
for (i in graph.list){
  benp <-  ggplot(get(i), aes(x=b1)) + 
    geom_histogram(fill="steelblue", aes(y=..density.., alpha=..count..), bins=60) + 
    labs(title="Beneficios", subtitle="") + ylab("Densidad") + 
    xlab("Beneficios ($millones)") + 
    geom_vline(aes(xintercept=mean(b1)), color="red4",linetype="dashed") +
    theme(legend.position = "none") + 
    annotate("text", x= mean(b1), y=0, label=round(mean(b1), digits = 2), 
             colour="red4", size=3.5, vjust=-1.5, hjust=-0.5) 
  ggsave(benp, file=paste0(i,"_histogram.png"))
}
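
Note that get() takes a character string naming an object, so it has to be called on the names, not on the dataframes stored in the list. A minimal sketch under that assumption, using the question's df_1 ... df_3 instead of data_new1 ... data_new10 (df_names and means are illustrative names only):

```r
df_1 <- data.frame(a1 = 1:1000, b1 = 1:1000)
df_2 <- data.frame(a1 = 1:1000, b1 = 1:1000)
df_3 <- data.frame(a1 = 1:1000, b1 = 1:1000)

df_names <- paste0("df_", 1:3)   # character vector of object names

means <- numeric(0)
for (nm in df_names) {
  df <- get(nm)                  # nm is a string, so get() can look it up
  means[nm] <- mean(df$b1)       # refer to the column through the dataframe
}
print(means)
```

Looping over the names also gives you a string to use in the output file name, which paste0() cannot build from a dataframe.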

4 Comments

Thanks for the response! When I run the code I get the following message: Error in get(i): invalid first argument. Do you know what could be wrong? Thanks again!
Check what the for loop is doing... for (i in graph.list){ print(i) }
Ok, it prints all the dataframes saved in graph.list
Does the ggplot work, outside of the loop, if you manually put in df_1?
