I'm using a for loop to create 100 datasets according to some specifications. My end goal is a single dataset containing every iterated dataset (i.e., datasets 1 through 100).

My current solution is inelegant: I export each data frame (called Dataset) to a CSV file and then merge the files outside R, because the data frame is overwritten on each iteration i of the for loop.

library(MASS) # provides mvrnorm()
Trackfile <- 1:100
for (i in Trackfile){
  d.cor <- .10 # Desired correlation
  Dataset <- as.data.frame(mvrnorm(20, mu = c(0,0), 
                                   Sigma = matrix(c(1,d.cor,d.cor,1), ncol = 2), 
                                   empirical = TRUE))
  write.csv(Dataset, paste0("C:/",d.cor," ",i,".csv"))
}

I believe the solution is to name each data frame dynamically according to the iteration (i), so that the data frames are called dataset1, dataset2, ..., dataset100, and then merge them. But I've struggled to find a way to dynamically name data frames inside a for loop. I'm a novice in R, so any help is appreciated!

  • "I believe the solution is to dynamically name the data frame": No! Please don't! Initialize an empty list of length 100 and place each data set in the list in turn. They can then be combined into one object right in R. How you do that depends on what you mean by "merge" though. Commented Mar 9, 2018 at 17:26
  • Why don't you create a data frame and keep adding rows to it from the for loop? Alternatively, you can create a list/array/vector of data frames. Commented Mar 9, 2018 at 17:27
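
The list approach suggested in the first comment could look roughly like the sketch below. It reuses the settings from the question; the names `n_sets`, `Sigma`, and `datasets` are illustrative and do not come from the original post.

```r
library(MASS)  # provides mvrnorm()

n_sets <- 100   # number of datasets, as in the question
d.cor  <- 0.10  # desired correlation
Sigma  <- matrix(c(1, d.cor, d.cor, 1), ncol = 2)

# Preallocate an empty list with one slot per dataset.
datasets <- vector("list", n_sets)

for (i in seq_len(n_sets)) {
  # Store each data frame in slot i instead of overwriting one variable.
  datasets[[i]] <- as.data.frame(
    mvrnorm(20, mu = c(0, 0), Sigma = Sigma, empirical = TRUE)
  )
}

length(datasets)     # 100
dim(datasets[[37]])  # 20 rows, 2 columns
```

Each dataset stays addressable by its position, so there is no need to create 100 separately named variables.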

2 Answers

R handles this easily. Here is an approach, but it may need modification depending on what you want to do with all these random data sets. This builds a list with 100 matrices labeled "data001" to "data100":

library(MASS)
d.cor <- .10
# Generate 100 matrices; simplify=FALSE keeps them as a list
DATA <- replicate(100,
                  mvrnorm(n=20, mu=c(0, 0),
                          Sigma=matrix(c(1, d.cor, d.cor, 1), ncol=2),
                          empirical=TRUE),
                  simplify=FALSE)
# Zero-padded labels "data001" ... "data100"
names(DATA) <- sprintf("data%03d", 1:100)
head(DATA[["data099"]])
#              [,1]         [,2]
#  [1,]  1.94086111  1.570299681
#  [2,] -0.74071651 -0.664948968
#  [3,] -1.02952487 -0.704650191
#  [4,]  0.85203916  0.698703243
#  [5,] -0.08673212  1.668412324
#  [6,]  0.88828524  0.001039757
save(DATA, file="AllData.RData")

This code creates a list containing 100 matrices and names each matrix. You can access a particular matrix by name or by position: DATA[["data099"]] or DATA[[99]]. The list is saved as "AllData.RData" so that you can retrieve it later with load("AllData.RData"). Depending on what you plan to do with this data, a list is probably more flexible than 100 separate files.
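
As a rough illustration of that flexibility, the sketch below applies one function to every matrix in the list and then round-trips the list through save() and load(). It uses only 5 datasets and a temporary file to stay small. Both choices, and the names `cors` and `path`, are assumptions made for the example rather than part of the answer above.

```r
library(MASS)  # provides mvrnorm()

d.cor <- 0.10
# Smaller list than in the answer (5 instead of 100), for illustration only
DATA <- replicate(5,
                  mvrnorm(n = 20, mu = c(0, 0),
                          Sigma = matrix(c(1, d.cor, d.cor, 1), ncol = 2),
                          empirical = TRUE),
                  simplify = FALSE)
names(DATA) <- sprintf("data%03d", seq_along(DATA))

# One summary value per dataset: the sample correlation of the two columns.
# With empirical = TRUE, this matches d.cor up to floating-point error.
cors <- sapply(DATA, function(m) cor(m[, 1], m[, 2]))
round(cors, 10)

# Save to a temporary file (hypothetical path), remove the object, restore it.
path <- tempfile(fileext = ".RData")
save(DATA, file = path)
rm(DATA)
load(path)   # recreates DATA in the current environment
names(DATA)
```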

1 Comment

That worked dcarlson, I really appreciate it thank you! :) And thank you joran and rnso for the comments. rnso, you pointed me in the right direction and I believe I have an alternative solution, posted below. Cheers everyone!

Thanks rnso for pointing me towards an alternative solution:

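library(MASS) # provides mvrnorm()

trial <- NULL
Trackfile <- 1:10
d.cor <- .10 # Desired correlation
for (i in Trackfile){
  Dataset <- as.data.frame(mvrnorm(20, mu = c(0,0), 
                                   Sigma = matrix(c(1,d.cor,d.cor,1), ncol = 2), 
                                   empirical = TRUE))
  # Append this iteration's 20 rows to the accumulated data frame
  trial <- rbind(trial, data.frame(Dataset$V1, Dataset$V2))
}
print(trial)   # all 200 rows (10 iterations x 20 rows)
print(Dataset) # only the data frame from the last iteration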

Thanks stackoverflow community. I really appreciate it.
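
One limitation of stacking rows this way is that the combined data frame no longer records which iteration each row came from. A possible variation, sketched below, adds an `iteration` column and binds everything in a single call instead of growing the data frame inside the loop. The names `n_sets`, `pieces`, and `iteration` are illustrative and not part of the answer above.

```r
library(MASS)  # provides mvrnorm()

d.cor  <- 0.10  # desired correlation
n_sets <- 10    # same number of iterations as the loop above

# Build one data frame per iteration and tag it with its iteration number.
pieces <- lapply(seq_len(n_sets), function(i) {
  d <- as.data.frame(
    mvrnorm(20, mu = c(0, 0),
            Sigma = matrix(c(1, d.cor, d.cor, 1), ncol = 2),
            empirical = TRUE)
  )
  d$iteration <- i
  d
})

# Stack all pieces with a single rbind() call.
trial <- do.call(rbind, pieces)

dim(trial)              # 200 rows, 3 columns (V1, V2, iteration)
table(trial$iteration)  # 20 rows per iteration
```

Binding once at the end avoids copying the growing data frame on every pass, which matters more as the number of iterations increases.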
