
I have a dataset named marks.

X <- c("vijay","raj","joy")

Y <- c("maths","eng","science","social","hindi","physical","sanskrit")    

df <- list()

for (i in X){
  for (j in Y)
  {

    df <- data.frame(subset(marks, name == i & subject == j))
  }
}

Here I want to create subsets holding the marks of every subject for each student, so there should be 3 x 7 = 21 subsets. But the code I wrote gives me only a single subset. How can I solve this problem?

  • Because you are updating the same object in each loop. Commented May 23, 2018 at 7:11
  • You can use outer() ... or a double lapply() Commented May 23, 2018 at 7:16
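
To illustrate the first comment: assigning to the same name inside a loop keeps only the value from the last iteration, while assigning into a list element keeps one result per iteration. A minimal sketch with toy values (the names last_only and kept are made up for illustration):

```r
# Overwriting: the same object is replaced on every pass
last_only <- NULL
for (i in 1:3) last_only <- i * 10
last_only          # 30, only the final iteration survives

# Collecting: each pass writes to its own list slot
kept <- vector("list", 3)
for (i in 1:3) kept[[i]] <- i * 10
length(kept)       # 3
```

The same applies to the question's loop: df ends up holding only the last name/subject combination.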

2 Answers


You can use outer(), but outer() calls FUN once with the full expanded vectors, so you have to vectorize the inner function with Vectorize():

X <- c("vijay","raj","joy")
Y <- c("maths","eng","science","social","hindi","physical","sanskrit")
set.seed(24)
marks <- data.frame(name = sample(X, 100, replace = TRUE), 
                    subject = sample(Y, 100, replace = TRUE), stringsAsFactors = FALSE)

sset <- function(x,y) subset(marks, name == x & subject == y)    
L <- outer(X, Y, FUN=Vectorize(sset, SIMPLIFY=FALSE))
L[1,1]

The object L is a matrix of dataframes.
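
Because L is a list with dimensions, single brackets return a one-element list, while double brackets return the data frame itself. A small sketch of how the cells might be labelled and accessed (adding dimnames is my own suggestion, and raj_eng is an illustrative name):

```r
X <- c("vijay", "raj", "joy")
Y <- c("maths", "eng", "science", "social", "hindi", "physical", "sanskrit")
set.seed(24)
marks <- data.frame(name = sample(X, 100, replace = TRUE),
                    subject = sample(Y, 100, replace = TRUE),
                    stringsAsFactors = FALSE)

sset <- function(x, y) subset(marks, name == x & subject == y)
L <- outer(X, Y, FUN = Vectorize(sset, SIMPLIFY = FALSE))

dim(L)                        # 3 7
dimnames(L) <- list(X, Y)     # label rows by student, columns by subject
class(L[1, 1])                # "list": single brackets keep the list wrapper
raj_eng <- L[["raj", "eng"]]  # double brackets return the data frame itself
```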
Here is another solution using a double lapply():

L2 <- lapply(X, function(x) lapply(Y, function(y) subset(marks, name == x & subject == y)))

The object L2 is a list of lists.
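
If access by name is more convenient than by position, one option (my assumption, not part of the solution above) is to feed named vectors to lapply(), which carries the names over to the result:

```r
X <- c("vijay", "raj", "joy")
Y <- c("maths", "eng", "science", "social", "hindi", "physical", "sanskrit")
set.seed(24)
marks <- data.frame(name = sample(X, 100, replace = TRUE),
                    subject = sample(Y, 100, replace = TRUE),
                    stringsAsFactors = FALSE)

# setNames(X, X) makes each element its own name, so the result is named too
L2 <- lapply(setNames(X, X), function(x)
  lapply(setNames(Y, Y), function(y) subset(marks, name == x & subject == y)))

L2$raj$eng                 # rows for student "raj" in subject "eng"
L2[["joy"]][["maths"]]     # the same kind of lookup with double brackets
```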
Here is a variant with for-loops:

df <- vector("list", length(X)*length(Y))
l <- 1

for (i in X)  for (j in Y) {
  df[[l]] <- subset(marks, name == i & subject == j)
  l <- l+1
}
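
A possible variation of the loop above, sketched under the assumption that named elements are preferred over a running counter: build a key such as "vijay.maths" from the two loop variables and use it as the list name.

```r
X <- c("vijay", "raj", "joy")
Y <- c("maths", "eng", "science", "social", "hindi", "physical", "sanskrit")
set.seed(24)
marks <- data.frame(name = sample(X, 100, replace = TRUE),
                    subject = sample(Y, 100, replace = TRUE),
                    stringsAsFactors = FALSE)

df <- list()
for (i in X) for (j in Y) {
  # the key "<name>.<subject>" identifies the subset stored in this slot
  df[[paste(i, j, sep = ".")]] <- subset(marks, name == i & subject == j)
}

length(df)        # 21
names(df)[1:2]    # "vijay.maths" "vijay.eng"
```

Growing a list element by element is fine for 21 entries; for many more combinations, preallocating as in the loop above is usually the safer habit.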

To subset only by the levels that actually occur in the data, you can simply use split():

L3 <- split(marks, list(marks$name, marks$subject))

The object L3 is a list of dataframes.
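
A short sketch of what split() returns here. The elements are named after the two grouping values joined by a dot, and by default a combination that has no rows still appears as an empty data frame; passing drop = TRUE leaves those out. L3_nonempty is an illustrative name.

```r
X <- c("vijay", "raj", "joy")
Y <- c("maths", "eng", "science", "social", "hindi", "physical", "sanskrit")
set.seed(24)
marks <- data.frame(name = sample(X, 100, replace = TRUE),
                    subject = sample(Y, 100, replace = TRUE),
                    stringsAsFactors = FALSE)

L3 <- split(marks, list(marks$name, marks$subject))
head(names(L3), 3)          # names have the form "<name>.<subject>"

# drop = TRUE omits name/subject combinations that have no rows
L3_nonempty <- split(marks, list(marks$name, marks$subject), drop = TRUE)
sum(sapply(L3_nonempty, nrow)) == nrow(marks)   # every row lands in exactly one subset
```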


3 Comments

It works, but how can we use for loops in this case?
Hi jogo, what if I want to apply an aggregate function to the columns of the dataframes in the list?
@ATULMEHRA That is another question: "Aggregating in a list of dataframes". Possibly you can use split() or by().
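
As a rough sketch of that follow-up: the example data has no numeric column, so the score column below is purely hypothetical and added only for illustration. With it, a summary per subset can be computed either from the split list or directly with aggregate().

```r
X <- c("vijay", "raj", "joy")
Y <- c("maths", "eng", "science", "social", "hindi", "physical", "sanskrit")
set.seed(24)
marks <- data.frame(name = sample(X, 100, replace = TRUE),
                    subject = sample(Y, 100, replace = TRUE),
                    stringsAsFactors = FALSE)
marks$score <- sample(0:100, nrow(marks), replace = TRUE)  # hypothetical column

# Option 1: apply a function to every data frame in the list
L3 <- split(marks, list(marks$name, marks$subject), drop = TRUE)
means_from_list <- sapply(L3, function(d) mean(d$score))

# Option 2: skip the list and aggregate by both grouping columns
means_direct <- aggregate(score ~ name + subject, data = marks, FUN = mean)
```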

We could use expand.grid() to create all the combinations, then loop through the rows of that data frame and subset 'marks' to get a list of data.frames:

dat <- expand.grid(X, Y, stringsAsFactors = FALSE)
lst <- apply(dat, 1, function(x) subset(marks, name == x[1] & subject == x[2]))
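
A variant of the same idea, offered as a sketch rather than a correction: Map() walks the two columns of the grid in parallel, which avoids the conversion of the data frame to a matrix that apply() performs, and the result can then be named. The column names Var1 and Var2 are the defaults expand.grid() assigns to unnamed arguments; lst2 is an illustrative name.

```r
X <- c("vijay", "raj", "joy")
Y <- c("maths", "eng", "science", "social", "hindi", "physical", "sanskrit")
set.seed(24)
marks <- data.frame(name = sample(X, 100, replace = TRUE),
                    subject = sample(Y, 100, replace = TRUE),
                    stringsAsFactors = FALSE)

dat <- expand.grid(X, Y, stringsAsFactors = FALSE)

# one subset per row of the grid, pairing Var1 (name) with Var2 (subject)
lst2 <- Map(function(x, y) subset(marks, name == x & subject == y),
            dat$Var1, dat$Var2)
names(lst2) <- paste(dat$Var1, dat$Var2, sep = ".")

length(lst2)     # 21
```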

Or using the tidyverse:

library(tidyverse)
crossing(X, Y) %>%
   pmap(~ marks %>%
             filter(name == ..1, subject == ..2))

data

set.seed(24)
marks <- data.frame(name = sample(X, 100, replace = TRUE), 
  subject = sample(Y, 100, replace = TRUE), stringsAsFactors = FALSE)
