I want to loop over variables within a data frame either using a for loop or function in R. I have coded the following (which doesn't work):
y <- c(0,0,1,1,0,1,0,1,1,1)
var1 <- c("a","a","a","b","b","b","c","c","c","c")
var2 <- c("m","m","n","n","n","n","o","o","o","m")
mydata <- data.frame(y,var1,var2)
myfunction <- function(v){
regressionresult <- lm(y ~ v, data = mydata)
summary(regressionresult)
}
myfunction("var1")
When I try running this, I get the error message:
Error in model.frame.default(formula = y ~ v, data = mydata, drop.unused.levels = TRUE) :
variable lengths differ (found for 'v')
I don't think this is a problem with the data, but with how I refer to the variable name because the following code produces the desired regression results (for one variable that I wanted to loop over):
regressionresult <- lm(y ~ var1, data = mydata)
summary(regressionresult)
How can I fix the function, or put the variable names in the loop?
I also tried to loop over the variable names, but ran into the same problem as with the function:
for(v in c("var1","var2")){
regressionresult <- lm(y ~ v, data = mydata)
summary(regressionresult)
}
When running this loop, it produces the error:
Error in model.frame.default(formula = y ~ v, data = mydata, drop.unused.levels = TRUE) :
variable lengths differ (found for 'v')
Thanks for your help!
Replace the lm line with:
regressionresult <- lm(y ~ get(v), data = mydata)
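For context on why the original fails: in `y ~ v`, R looks for a variable literally named `v`. It is not a column of `mydata`, so R finds the function argument (or loop variable) instead, which is the length-1 string "var1" rather than the 10-element column. That is where "variable lengths differ" comes from. As an alternative to `get(v)`, here is a minimal sketch that builds the formula from the variable name with `reformulate()` from base R's stats package. The name `results` is just a placeholder chosen for this example, and the data is copied from the question.

```r
# Sample data from the question
y <- c(0, 0, 1, 1, 0, 1, 0, 1, 1, 1)
var1 <- c("a", "a", "a", "b", "b", "b", "c", "c", "c", "c")
var2 <- c("m", "m", "n", "n", "n", "n", "o", "o", "o", "m")
mydata <- data.frame(y, var1, var2)

myfunction <- function(v) {
  # Turn the string in v into a formula such as y ~ var1
  f <- reformulate(v, response = "y")
  regressionresult <- lm(f, data = mydata)
  summary(regressionresult)
}

# Function version: one summary per variable, collected in a named list
# ("results" is a hypothetical name for this sketch)
results <- lapply(c(var1 = "var1", var2 = "var2"), myfunction)

# Loop version: inside a for loop nothing is auto-printed,
# so the summary has to be wrapped in print()
for (v in c("var1", "var2")) {
  print(myfunction(v))
}
```

One side effect of building the formula this way is that the coefficient labels come out as `var1b`, `var1c`, and so on, the same as when typing `y ~ var1` by hand, whereas with `get(v)` the labels are based on `get(v)`.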