How to create multiple dataframes using nested for loops

Question

The dataset is called marks; it has a name column and a subject column.

X <- c("vijay","raj","joy")

Y <- c("maths","eng","science","social","hindi","physical","sanskrit")    

df <- list()

for (i in X){
  for (j in Y)
  {

    df <- data.frame(subset(marks, name == i & subject == j))
  }
}

Here I want to create one subset per student and subject, holding that student's marks for that subject. So there should be 3 x 7 = 21 subsets. But the code I wrote gives me only a single subset. How can I solve this problem?

Because you are overwriting the same object, df, in each iteration of the loop

– akrun, Commented May 23, 2018 at 7:11
You can use outer() ... or a double lapply()

– jogo, Commented May 23, 2018 at 7:16

jogo · Accepted Answer · 2018-05-23 11:36:34Z

3

You can use outer() but you have to vectorize the inner function:

X <- c("vijay","raj","joy")
Y <- c("maths","eng","science","social","hindi","physical","sanskrit")
set.seed(24)
marks <- data.frame(name = sample(X, 100, replace = TRUE), 
                    subject = sample(Y, 100, replace = TRUE), stringsAsFactors = FALSE)

sset <- function(x,y) subset(marks, name == x & subject == y)    
L <- outer(X, Y, FUN=Vectorize(sset, SIMPLIFY=FALSE))
L[1,1]

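The object L is a matrix of dataframes, that is, a list with dimensions 3 x 7. L[1,1] returns a one-element list; use L[[1,1]] to get the dataframe itself.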
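As a minimal sketch of how such a list-matrix can be indexed, the block below rebuilds L from the example data and adds dimnames so that cells can be looked up by student and subject. The dimnames step and the variable names one_cell, one_df and same_df are additions for illustration, not part of the original answer.

```r
X <- c("vijay", "raj", "joy")
Y <- c("maths", "eng", "science", "social", "hindi", "physical", "sanskrit")
set.seed(24)
marks <- data.frame(name = sample(X, 100, replace = TRUE),
                    subject = sample(Y, 100, replace = TRUE),
                    stringsAsFactors = FALSE)

sset <- function(x, y) subset(marks, name == x & subject == y)
L <- outer(X, Y, FUN = Vectorize(sset, SIMPLIFY = FALSE))

dim(L)                            # 3 7: students in rows, subjects in columns
dimnames(L) <- list(X, Y)         # illustrative: label rows and columns

one_cell <- L[1, 1]               # single bracket: a list of length 1
one_df   <- L[[1, 1]]             # double bracket: the data frame itself
same_df  <- L[["vijay", "maths"]] # the same cell, addressed by name
```

With the dimnames in place, a whole row such as L["raj", ] gives the list of all subject subsets for one student.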
Here is another solution using a double lapply():

L2 <- lapply(X, function(x) lapply(Y, function(y) subset(marks, name == x & subject == y)))

The object L2 is a list of lists.
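A small sketch of working with the nested list follows. Naming both levels and flattening with unlist(recursive = FALSE) are optional extras assumed here for convenience; raj_eng and flat are illustrative names.

```r
X <- c("vijay", "raj", "joy")
Y <- c("maths", "eng", "science", "social", "hindi", "physical", "sanskrit")
set.seed(24)
marks <- data.frame(name = sample(X, 100, replace = TRUE),
                    subject = sample(Y, 100, replace = TRUE),
                    stringsAsFactors = FALSE)

L2 <- lapply(X, function(x) lapply(Y, function(y) subset(marks, name == x & subject == y)))

names(L2) <- X                  # outer level: one entry per student
L2 <- lapply(L2, setNames, Y)   # inner level: one entry per subject

raj_eng <- L2[["raj"]][["eng"]] # a single data frame

# Flatten one level to get a plain list of 21 data frames,
# with names of the form "student.subject"
flat <- unlist(L2, recursive = FALSE)
```

The flattened list has the same shape as the result of the for-loop and split() variants, which makes the approaches easier to compare.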
Here is a variant with for-loops:

# preallocate one list slot per (student, subject) pair
df <- vector("list", length(X) * length(Y))
l <- 1  # running index into df

for (i in X) {
  for (j in Y) {
    df[[l]] <- subset(marks, name == i & subject == j)
    l <- l + 1
  }
}
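The loop in the question fails because df <- ... replaces the whole object on every pass. As a hedged variant of the loop above, the sketch below stores each subset under a name built from the student and subject instead of a running counter; the "student.subject" key format is an assumption chosen for illustration.

```r
X <- c("vijay", "raj", "joy")
Y <- c("maths", "eng", "science", "social", "hindi", "physical", "sanskrit")
set.seed(24)
marks <- data.frame(name = sample(X, 100, replace = TRUE),
                    subject = sample(Y, 100, replace = TRUE),
                    stringsAsFactors = FALSE)

df <- list()
for (i in X) {
  for (j in Y) {
    key <- paste(i, j, sep = ".")   # e.g. "vijay.maths" (illustrative key format)
    # assign into one list element rather than overwriting df itself
    df[[key]] <- subset(marks, name == i & subject == j)
  }
}

length(df)  # 21
```

A single subset can then be retrieved by name, for example df[["joy.hindi"]].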

To subset by only the levels that actually occur in the data, you can simply use split():

L3 <- split(marks, list(marks$name, marks$subject))

The object L3 is a list of dataframes.
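A brief sketch of inspecting the split() result follows. By default split() keeps an entry, possibly with zero rows, for every combination of the levels found in the data; passing drop = TRUE discards the empty ones. The names counts and L3_nonempty are illustrative.

```r
X <- c("vijay", "raj", "joy")
Y <- c("maths", "eng", "science", "social", "hindi", "physical", "sanskrit")
set.seed(24)
marks <- data.frame(name = sample(X, 100, replace = TRUE),
                    subject = sample(Y, 100, replace = TRUE),
                    stringsAsFactors = FALSE)

L3 <- split(marks, list(marks$name, marks$subject))

# List names have the form "name.subject", e.g. "vijay.maths"
counts <- sapply(L3, nrow)   # number of rows in each subset

# drop = TRUE removes combinations that have no rows at all
L3_nonempty <- split(marks, list(marks$name, marks$subject), drop = TRUE)
```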

edited May 23, 2018 at 11:36

answered May 23, 2018 at 7:20

jogo

3 Comments

ATUL MEHRA Over a year ago

It works, but how can we use for loops in this case?

ATUL MEHRA Over a year ago

Hi jogo, what if I want to apply an aggregate function to the columns of the dataframes in the list?

jogo Over a year ago

@ATULMEHRA That is another question: "Aggregating in a list of dataframes". Possibly you can use split() or by().

akrun · Accepted Answer · 2018-05-23 07:12:55Z

3

We could do this with expand.grid to create all the combinations, then loop through the rows of that grid and subset 'marks' to get a list of data.frames:

dat <- expand.grid(X, Y, stringsAsFactors = FALSE)
lst <- apply(dat, 1, function(x) subset(marks, name == x[1] & subject == x[2]))
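As a possible variant of the same idea, the sketch below walks the grid with Map() instead of apply(), so each row is not first coerced to a character vector, and it names the resulting list. The column names name and subject given to expand.grid and the "name.subject" list names are assumptions made for illustration.

```r
X <- c("vijay", "raj", "joy")
Y <- c("maths", "eng", "science", "social", "hindi", "physical", "sanskrit")
set.seed(24)
marks <- data.frame(name = sample(X, 100, replace = TRUE),
                    subject = sample(Y, 100, replace = TRUE),
                    stringsAsFactors = FALSE)

# All 21 student/subject combinations, one per row
dat <- expand.grid(name = X, subject = Y, stringsAsFactors = FALSE)

# Map() steps through the two columns in parallel, one pair per call
lst <- Map(function(n, s) marks[marks$name == n & marks$subject == s, ],
           dat$name, dat$subject)

# Label each element so it can be retrieved by name
names(lst) <- paste(dat$name, dat$subject, sep = ".")
```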

Or using tidyverse:

library(tidyverse)
crossing(X, Y) %>%
   pmap(~ marks %>%
             filter(name == ..1, subject == ..2))

data

set.seed(24)
marks <- data.frame(name = sample(X, 100, replace = TRUE), 
  subject = sample(Y, 100, replace = TRUE), stringsAsFactors = FALSE)

answered May 23, 2018 at 7:12

akrun
