I want to write a loop that refers to variables by name instead of by column number, but I'm struggling with it.
My data has over 1000 variables; a small example with the same structure looks like this:
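# Reproducible data
library(dplyr)     # provides %>% and mutate()
library(lmerTest)  # assumed: loads lme4 and adds the df and p-value columns seen in the summaries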
id <- rep(c("1","2","3","4","5","6"),3)
sequence <- rep(c("1","2","1","2","1","1"),3)
treatment <- c(rep(c("A"), 6), rep(c("B"), 6),rep(c("C"), 6))
var1 <- c(rnorm(3, 1, 0.4), rnorm(3, 3, 0.5), rnorm(3, 6, 0.8), rnorm(3, 1.1, 0.4), rnorm(3, 0.8, 0.2), rnorm(3, 1, 0.6))
var1_base <- c(rnorm(3, 1, 0.4), rnorm(3, 3, 0.5), rnorm(3, 6, 0.8), rnorm(3, 1.1, 0.4), rnorm(3, 0.8, 0.2), rnorm(3, 1, 0.6))
var2 <- c(rnorm(3, 1, 0.4), rnorm(3, 3, 0.5), rnorm(3, 6, 0.8), rnorm(3, 1.1, 0.4), rnorm(3, 0.8, 0.2), rnorm(3, 1, 0.6))
var2_base <- c(rnorm(3, 1, 0.4), rnorm(3, 3, 0.5), rnorm(3, 6, 0.8), rnorm(3, 1.1, 0.4), rnorm(3, 0.8, 0.2), rnorm(3, 1, 0.6))
var3 <- c(rnorm(3, 1, 0.4), rnorm(3, 3, 0.5), rnorm(3, 6, 0.8), rnorm(3, 1.1, 0.4), rnorm(3, 0.8, 0.2), rnorm(3, 1, 0.6))
var3_base <- c(rnorm(3, 1, 0.4), rnorm(3, 3, 0.5), rnorm(3, 6, 0.8), rnorm(3, 1.1, 0.4), rnorm(3, 0.8, 0.2), rnorm(3, 1, 0.6))
DF <- data.frame(id, sequence, treatment, var1, var2, var3, var1_base, var2_base, var3_base) %>%
  mutate(id = factor(id),
         sequence = factor(sequence),
         treatment = factor(treatment, levels = c("A","B","C")))
> head(DF)
  id sequence treatment      var1      var2      var3 var1_base var2_base var3_base
1  1        1         A 0.5488589 1.3045888 0.2367363 1.2646227 1.2241417 0.1968524
2  2        2         A 1.0201801 1.3480361 0.9944096 0.3625067 0.8987885 1.5868442
3  3        1         A 0.7269204 0.7091029 1.2025266 0.1238612 1.8828400 0.8687552
4  4        2         A 3.3240269 3.3133104 3.2251780 2.4116230 2.6284785 2.6027341
5  5        1         A 3.3051822 2.4542786 2.1687379 3.5250026 3.2231797 2.9990167
6  6        1         A 2.7436715 2.7419527 3.8349072 2.9971485 3.0528477 2.6970430
I want to fit a linear mixed model for each variable, with var as the outcome; treatment, var_base (the baseline value), and sequence as fixed effects; and id as a random effect.
Written out for a single variable, it would look like this:
lm1 <- lmer(var1 ~ var1_base + treatment + sequence + (1|id), data = DF)
With over 1000 variables, fitting each model by hand isn't practical. I tried writing a for loop, but the result was not what I expected.
# Approach 1: it worked, but I want the results labelled "var1", "var2", ... instead of "[[4]]", "[[5]]", ...
# Columns 4:6 are var1..var3; columns 7:9 are the matching *_base columns.
lm_output <- list()
for (i in 4:6) {
  lm1 <- lmer(DF[[i+3]] ~ DF[[i]] + treatment + sequence + (1|id), data = DF)
  lm_output[[i]] <- summary(lm1)
}
> print(lm_output[1:6])
[[1]]
NULL

[[2]]
NULL

[[3]]
NULL

[[4]]
Fixed effects:
            Estimate Std. Error      df t value Pr(>|t|)
(Intercept)   0.8995     0.6129 13.0000   1.468  0.16598
DF[[i]]       0.6772     0.1860 13.0000   3.641  0.00299 **
TreatmentB    0.1621     0.6885 13.0000   0.235  0.81751
TreatmentC   -0.3112     0.7049 13.0000  -0.441  0.66611
sequence2    -0.1001     0.5715 13.0000  -0.175  0.86367

[[5]]
Fixed effects:
             Estimate Std. Error        df t value Pr(>|t|)
(Intercept)  0.137752   0.365302 11.104560   0.377    0.713
DF[[i]]      0.729762   0.071874  9.810327  10.153 1.61e-06 ***
TreatmentB   0.531048   0.332585  9.144490   1.597    0.144
TreatmentC   0.060414   0.343280  9.185060   0.176    0.864
sequence2   -0.001702   0.440920  4.000881  -0.004    0.997

[[6]]
Fixed effects:
             Estimate Std. Error        df t value Pr(>|t|)
(Intercept)  0.765739   0.446747 13.000000   1.714    0.110
DF[[i]]      0.783985   0.132198 13.000000   5.930 4.98e-05 ***
TreatmentB   0.006516   0.554550 13.000000   0.012    0.991
TreatmentC  -0.312968   0.515562 13.000000  -0.607    0.554
sequence2   -0.762799   0.436095 13.000000  -1.749    0.104
Is there a way to label the list elements by variable name (var1 instead of [[4]], var2 instead of [[5]], and so on), so that the results are more intuitive and easier to manage?
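One possible way to get named results, sketched with base R only: assign names by position, then drop the empty slots. The list below is a hypothetical stand-in for lm_output (placeholder strings instead of model summaries), and col_names is assumed to match names(DF).

```r
# Hypothetical stand-in for the list built in Approach 1:
# slots 1-3 stay NULL, slots 4-6 hold a result.
col_names <- c("id", "sequence", "treatment", "var1", "var2", "var3",
               "var1_base", "var2_base", "var3_base")  # assumed equal to names(DF)
lm_output <- list()
for (i in 4:6) {
  lm_output[[i]] <- paste("model summary for column", i)  # placeholder result
}

# Label each slot with the column name at the same position ...
names(lm_output) <- col_names[seq_along(lm_output)]
# ... then remove the NULL slots; Filter() keeps the names of what remains.
lm_output <- Filter(Negate(is.null), lm_output)

names(lm_output)
```

Assigning with lm_output[[names(DF)[i]]] inside the loop would likely avoid the empty slots altogether, since a list element can be created directly by name.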
# Approach 2: store the variable names in a vector first and loop over them. This did not work.
responseList <- names(DF)[4:6]
lm_output2 <- list()
for (n in responseList) {
  lm2 <- lmer(get(n+3) ~ get(n) + treatment + sequence + (1|id), data = DF)
  lm_output2[[n]] <- summary(lm2)
}
> Error in n + 3 : non-numeric argument to binary operator
I understand the error: n is a character string here, not a number, so n + 3 fails. What I don't know is how to refer to both var and the matching var_base within the same loop.
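A minimal sketch of one way this might be done, assuming every outcome column has a baseline column with the same name plus "_base": build the formula as a string from the variable name and loop over the names. The data frame is a small random stand-in with the same layout as DF, and the grep pattern "^var[0-9]+$" is an assumption about how the real columns are named.

```r
library(lme4)  # provides lmer(); loading lmerTest instead adds df and p-values

# Small random stand-in with the same column layout as DF in the question
set.seed(42)
DF <- data.frame(
  id        = factor(rep(1:6, 3)),
  sequence  = factor(rep(c(1, 2, 1, 2, 1, 1), 3)),
  treatment = factor(rep(c("A", "B", "C"), each = 6))
)
for (v in c("var1", "var2", "var3")) {
  DF[[v]] <- rnorm(18, mean = 2)
  DF[[paste0(v, "_base")]] <- rnorm(18, mean = 2)
}

# Outcome names: columns called var<number>, which excludes the *_base columns
vars <- grep("^var[0-9]+$", names(DF), value = TRUE)

# A named input vector makes lapply() return a list named by variable
fits <- lapply(setNames(vars, vars), function(v) {
  # e.g. var1 ~ var1_base + treatment + sequence + (1 | id)
  f <- as.formula(paste0(v, " ~ ", v, "_base + treatment + sequence + (1 | id)"))
  lmer(f, data = DF)
})

names(fits)               # "var1" "var2" "var3"
coef(summary(fits$var2))  # fixed-effects table for var2
```

Because the formula uses real column names rather than DF[[i]], the coefficient rows are labelled var1_base, var2_base, and so on. With random data this small, lmer may report a singular fit; that is a message, not an error.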
Any suggestions are appreciated, thank you!