I want to create a loop using variable names instead of numbers but I'm struggling with it.

I have over 1000 variables in my data but the structure looks like this:

#Reproducible data
library(dplyr)  # provides %>% and mutate()
id <- rep(c("1","2","3","4","5","6"),3)
sequence <- rep(c("1","2","1","2","1","1"),3)
treatment <- c(rep(c("A"), 6), rep(c("B"), 6),rep(c("C"), 6))
var1 <- c(rnorm(3, 1, 0.4), rnorm(3, 3, 0.5), rnorm(3, 6, 0.8), rnorm(3, 1.1, 0.4), rnorm(3, 0.8, 0.2), rnorm(3, 1, 0.6))
var1_base <- c(rnorm(3, 1, 0.4), rnorm(3, 3, 0.5), rnorm(3, 6, 0.8), rnorm(3, 1.1, 0.4), rnorm(3, 0.8, 0.2), rnorm(3, 1, 0.6))
var2 <- c(rnorm(3, 1, 0.4), rnorm(3, 3, 0.5), rnorm(3, 6, 0.8), rnorm(3, 1.1, 0.4), rnorm(3, 0.8, 0.2), rnorm(3, 1, 0.6))
var2_base <- c(rnorm(3, 1, 0.4), rnorm(3, 3, 0.5), rnorm(3, 6, 0.8), rnorm(3, 1.1, 0.4), rnorm(3, 0.8, 0.2), rnorm(3, 1, 0.6))
var3 <- c(rnorm(3, 1, 0.4), rnorm(3, 3, 0.5), rnorm(3, 6, 0.8), rnorm(3, 1.1, 0.4), rnorm(3, 0.8, 0.2), rnorm(3, 1, 0.6))
var3_base <- c(rnorm(3, 1, 0.4), rnorm(3, 3, 0.5), rnorm(3, 6, 0.8), rnorm(3, 1.1, 0.4), rnorm(3, 0.8, 0.2), rnorm(3, 1, 0.6))
DF <- data.frame(id,sequence,treatment, var1, var2, var3, var1_base, var2_base, var3_base) %>%
  mutate(id = factor(id),
         sequence = factor(sequence),
         treatment = factor(treatment, levels = c("A","B","C")))

> head(DF)
  id sequence treatment      var1      var2      var3 var1_base var2_base var3_base
1  1        1         A 0.5488589 1.3045888 0.2367363 1.2646227 1.2241417 0.1968524
2  2        2         A 1.0201801 1.3480361 0.9944096 0.3625067 0.8987885 1.5868442
3  3        1         A 0.7269204 0.7091029 1.2025266 0.1238612 1.8828400 0.8687552
4  4        2         A 3.3240269 3.3133104 3.2251780 2.4116230 2.6284785 2.6027341
5  5        1         A 3.3051822 2.4542786 2.1687379 3.5250026 3.2231797 2.9990167
6  6        1         A 2.7436715 2.7419527 3.8349072 2.9971485 3.0528477 2.6970430

I want to fit a linear mixed model for each var, with var as the outcome; treatment, var_base (baseline), and sequence as fixed effects; and id as a random effect.

To code it one by one, it would look like this:

lm1 <- lmer(var1 ~ var1_base + treatment + sequence + (1|id), data = DF)

But since I have over 1000 vars, it doesn't make sense to do this one at a time. I tried writing a for loop, but it did not turn out the way I expected.

#Approach 1: it worked, but I want the results labeled by variable name (e.g. "var1") instead of by position ("[[4]]")

lm_output <- list()

for(i in 4:6){
  lm1 <-lmer(DF[[i+3]] ~ DF[[i]] + Treatment+  sequence + (1|id), data = DF)
  summary(lm1)
  lm_output[[i]] <- summary(lm1)
}
> print(lm_output[1:6])

[[1]]
NULL

[[2]]
NULL

[[3]]
NULL

[[4]]
Fixed effects:
            Estimate Std. Error      df t value Pr(>|t|)   
(Intercept)   0.8995     0.6129 13.0000   1.468  0.16598   
DF[[i]]       0.6772     0.1860 13.0000   3.641  0.00299 **
TreatmentB    0.1621     0.6885 13.0000   0.235  0.81751   
TreatmentC   -0.3112     0.7049 13.0000  -0.441  0.66611   
sequence2    -0.1001     0.5715 13.0000  -0.175  0.86367   


[[5]]
Fixed effects:
             Estimate Std. Error        df t value Pr(>|t|)    
(Intercept)  0.137752   0.365302 11.104560   0.377    0.713    
DF[[i]]      0.729762   0.071874  9.810327  10.153 1.61e-06 ***
TreatmentB   0.531048   0.332585  9.144490   1.597    0.144    
TreatmentC   0.060414   0.343280  9.185060   0.176    0.864    
sequence2   -0.001702   0.440920  4.000881  -0.004    0.997    


[[6]]
Fixed effects:
             Estimate Std. Error        df t value Pr(>|t|)
(Intercept)  0.765739   0.446747 13.000000   1.714    0.110
DF[[i]]      0.783985   0.132198 13.000000   5.930 4.98e-05 ***
TreatmentB   0.006516   0.554550 13.000000   0.012    0.991
TreatmentC  -0.312968   0.515562 13.000000  -0.607    0.554
sequence2   -0.762799   0.436095 13.000000  -1.749    0.104

Is there a way to transform [[4]] --> var1, [[5]] --> var2..., so it's more intuitive and easier to manage the data?

#Approach 2: I tried storing the variable names in a vector first and looping over them. It did not work.

responseList <- names(DF)[c(4:6)]

lm_output2 <- list()

for(i in n){
  lm2<-lmer(get(n+3) ~ get(n) + Treatment+  sequence + (1|id), data = DF)
  summary(lm2)
  lm_output2[[i]] <- summary(lm2)
}

> Error in n + 3 : non-numeric argument to binary operator

I understand this error: n is not numeric here, so get(n+3) fails. But I don't know how to specify var and var_base in the same loop.

Any suggestion is appreciated, thank you!

1 Answer

You can build the formula for lmer from a string. Loop over the variable indices (1, 2, 3, etc.), paste the desired variable names into a formula string, and convert it with as.formula, like this:

library(lme4)

lm_output <- list()
for (i in 1:3) {
  # Column names for this iteration, e.g. "var1" and "var1_base"
  outcome_var <- paste0("var", i)
  base_var <- paste0(outcome_var, "_base")
  # Assemble the model formula from a string
  form <- as.formula(paste0(outcome_var,
                            " ~ ",
                            base_var,
                            " + treatment + sequence + (1 | id)"))
  lm1 <- lmer(form, data = DF)
  # Keep only the summary of each fit
  lm_output[[i]] <- summary(lm1)
}
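
To get the results labeled by variable name rather than by position, as the question asks, one option is to loop over the names themselves and store each summary in a named list. The following is a minimal sketch, not a definitive implementation. It assumes every outcome column has a matching column with the suffix "_base", and it builds its own toy data frame with the same layout as DF (the simulated values and the helper sim are made up for illustration). Note that plain lme4 does not print the df and Pr(>|t|) columns shown in the question; those presumably come from loading lmerTest as well.

```r
library(lme4)

# Toy data with the same layout as DF in the question (values are arbitrary).
set.seed(1)
sim <- function() rnorm(18, mean = 2) + rep(rnorm(6), 3)  # hypothetical helper: noise plus a per-id shift
DF <- data.frame(
  id        = factor(rep(1:6, 3)),
  sequence  = factor(rep(c(1, 2, 1, 2, 1, 1), 3)),
  treatment = factor(rep(c("A", "B", "C"), each = 6)),
  var1 = sim(), var2 = sim(), var3 = sim(),
  var1_base = sim(), var2_base = sim(), var3_base = sim()
)

# Derive the outcome names from the "_base" columns instead of hard-coding 1:3.
base_vars    <- grep("_base$", names(DF), value = TRUE)
outcome_vars <- sub("_base$", "", base_vars)
stopifnot(all(outcome_vars %in% names(DF)))

# One formula per outcome, kept in a list named after the outcome.
formulas <- lapply(outcome_vars, function(v) {
  as.formula(paste0(v, " ~ ", v, "_base + treatment + sequence + (1 | id)"))
})
names(formulas) <- outcome_vars

# lapply() carries the names over, so the results print as $var1, $var2, ...
lm_output <- lapply(formulas, function(f) summary(lmer(f, data = DF)))

names(lm_output)
lm_output[["var2"]]$coefficients  # fixed-effect table for var2
```

Because the coefficient names now read var2_base instead of DF[[i]], each table is also easier to interpret on its own. With simulated data this small, lmer may report a singular fit; that is a message about the toy data, not a problem with the loop.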

1 Comment

Thank you! This is very helpful.