0

I'm trying to write this piece of code using a for loop.

#Took Quiz X and 1
TookQuizX[1,1] <- nrow(Q1[Q1$anon_user_id %in% Q1$anon_user_id,])
TookQuizX[2,1] <- nrow(Q2[Q2$anon_user_id %in% Q1$anon_user_id,])
TookQuizX[3,1] <- nrow(Q3[Q3$anon_user_id %in% Q1$anon_user_id,])
TookQuizX[4,1] <- nrow(Q4[Q4$anon_user_id %in% Q1$anon_user_id,])
TookQuizX[5,1] <- nrow(Q5[Q5$anon_user_id %in% Q1$anon_user_id,])
TookQuizX[6,1] <- nrow(Q6[Q6$anon_user_id %in% Q1$anon_user_id,])

What I tried is the following:

for(i in 1:6){
  Qx<-paste("Q",i,"[Q",i,"$anon_user_id",sep="")
  TookQuizX[i,1] <- nrow(Qx %in% Q1$anon_user_id,])
}

When I run my loop I get the following error:

Error: unexpected ']' in:
"  Qx<-paste("Q",i,"[Q",i,"$anon_user_id",sep="")
  TookQuizX[i,1] <- nrow(Qx %in% Q1$anon_user_id,]"
> }
Error: unexpected '}' in "}

What am I doing wrong?

Thanks!


This very simple example hopefully illustrates what I'm trying to do:

TookQuizX <- matrix(data=NA,nrow=3,ncol=1)
Q1 <- data.frame(anon_user_id = c("A123", "A111", "A134", "A156"), other_stuf=999)
Q2 <- data.frame(anon_user_id = c("A123", "A234", "A111", "A256", "C521"), other_stuf=999)
Q3 <- data.frame(anon_user_id = c("A123", "A234", "A111", "A356", "B356"), other_stuf=999)

TookQuizX[1,1] <- nrow(Q1[Q1$anon_user_id %in% Q1$anon_user_id,])
TookQuizX[2,1] <- nrow(Q2[Q2$anon_user_id %in% Q1$anon_user_id,])
TookQuizX[3,1] <- nrow(Q3[Q3$anon_user_id %in% Q1$anon_user_id,])
5
  • 3
    Your main mistake is using a for loop. Please make your question reproducible to enable us to show you alternatives. Commented Nov 21, 2013 at 19:38
  • 1
    Of course the first comment @Roland would be condescension rather than something helpful. Ignacio, set i <- 1 and go through your loop line-by-line to see what it is actually doing. Commented Nov 21, 2013 at 19:41
  • 2
    @rawr It's not condescension, but honest advice. I might even have shown a far better alternative, if the OP provided example data. But without data I cannot test ... Commented Nov 21, 2013 at 19:44
  • 2
    Hi, this part does not make sense: Q1$anon_user_id %in% Q1$anon_user_id. Also, you are looking for eval(parse(text=...)), BUT I would advise against using that. Instead, use lists. Search through SO, as there are plenty of examples. Commented Nov 21, 2013 at 19:55
  • @RicardoSaporta he's collecting the number of rows in each matrix that contain the anon_user_id values found in the first matrix. So the first line looks funny, but it makes sense when applied to the following matrices. Still better as a list of matrices though. Commented Nov 21, 2013 at 20:04

3 Answers

3

As with many operations in R, it is easier to wrap your data frames in a list.

Q_all <- list(Q1,Q2,Q3)

First, instead of using nrow, why not directly count how many TRUE values there are in your %in% vector?

TookQuizX[1,1] <- length(which(Q1$anon_user_id %in% Q1$anon_user_id))
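Because `%in%` returns a logical vector, the same count can also be obtained with `sum`, which treats each TRUE as 1. A minimal sketch, reusing the first two example data frames from the question (the names `n_which` and `n_sum` are only for illustration):

```r
# Example data from the question
Q1 <- data.frame(anon_user_id = c("A123", "A111", "A134", "A156"), other_stuf = 999)
Q2 <- data.frame(anon_user_id = c("A123", "A234", "A111", "A256", "C521"), other_stuf = 999)

# %in% gives one TRUE/FALSE per row of Q2
in_q1 <- Q2$anon_user_id %in% Q1$anon_user_id

n_which <- length(which(in_q1))  # count via the indices of the TRUE values
n_sum   <- sum(in_q1)            # count by summing the logical vector directly
```

Both variables should hold the same number; `sum` just skips the intermediate index vector.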

To replace your loop, here is an example of lapply:

TookQuizX[,1] <- unlist(lapply(Q_all, function(x) length(which(x$anon_user_id %in% Q_all[[1]]$anon_user_id))))
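If you would still prefer an explicit for loop, as in the question, the list makes that straightforward too, because each data frame can be reached by position with `[[i]]`. A sketch under the assumption that the three example data frames from the question are the whole input:

```r
# Example data from the question
Q1 <- data.frame(anon_user_id = c("A123", "A111", "A134", "A156"), other_stuf = 999)
Q2 <- data.frame(anon_user_id = c("A123", "A234", "A111", "A256", "C521"), other_stuf = 999)
Q3 <- data.frame(anon_user_id = c("A123", "A234", "A111", "A356", "B356"), other_stuf = 999)

Q_all <- list(Q1, Q2, Q3)
TookQuizX <- matrix(NA_integer_, nrow = length(Q_all), ncol = 1)

# Row i: number of users in quiz i who also appear in quiz 1
for (i in seq_along(Q_all)) {
  # sum() over the logical %in% result counts the TRUE values
  TookQuizX[i, 1] <- sum(Q_all[[i]]$anon_user_id %in% Q_all[[1]]$anon_user_id)
}
```

No variable names are pasted together here; the loop index selects the data frame directly.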

I assume that in the end, you want TookQuizX to be a matrix where entry i,j is the number of people who took Quiz i and also took Quiz j. Additionally, I assume that your user IDs are unique, and no two rows in the data frame have the same user ID. Then let's extract just the user IDs from your data frames.

anon_user_ids <- lapply(Q_all, `[[`, "anon_user_id")

One way of putting this together (and there are more efficient ways, but this is what came to mind first) would be to use Map:

tmp <- Map(function(x,y) length(which(x %in% y)),
  anon_user_ids[rep(seq_along(anon_user_ids),times = length(anon_user_ids))] ,
  anon_user_ids[rep(seq_along(anon_user_ids),each = length(anon_user_ids))] )

This compares the intersection of i and j iteratively, so 1,1, 2,1, 3,1, 1,2, 2,2 and so forth. Now I can put this into a matrix. By default in matrices and arrays in R, vectors are assumed to be in column-major order (the first dimension varies quickest, and the last dimension varies slowest).

TookQuizX <- matrix(unlist(tmp), nrow = length(anon_user_ids))
#      [,1] [,2] [,3]
# [1,]    4    2    2
# [2,]    2    5    3
# [3,]    2    3    5
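
The same pairwise table can probably be built more compactly with two nested `sapply` calls instead of `Map` plus `matrix`. This is only a sketch of one alternative, with the ID vectors typed in directly from the question's example data and `ids` and `overlap` as illustrative names:

```r
# User IDs per quiz, taken from the example data in the question
ids <- list(
  c("A123", "A111", "A134", "A156"),
  c("A123", "A234", "A111", "A256", "C521"),
  c("A123", "A234", "A111", "A356", "B356")
)

# The outer sapply walks over the columns (quiz j), the inner one over the
# rows (quiz i), so overlap[i, j] counts IDs of quiz i that appear in quiz j
overlap <- sapply(ids, function(y) sapply(ids, function(x) sum(x %in% y)))
```

The inner call returns one column at a time, and `sapply` binds those columns into a matrix, which matches the column-major layout described above.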

1

You need to do two things. First, you need to recreate the commands you want to run:

for(i in 1:6){
  # Build the assignment for row i, column 1, as in the original code
  Qx <- paste("TookQuizX[", i, ",1] <- nrow(Q", i, "[Q", i,
              "$anon_user_id %in% Q1$anon_user_id,])", sep = "")
  print(Qx)
}

This loop will produce the strings you want to evaluate as code. To do that, you need to tell R to interpret the character strings as actual code. That involves parsing the text into code, and then evaluating the code. Modifying the first loop we get:

for(i in 1:6){
  # Build the assignment for row i, column 1, as in the original code
  Qx <- paste("TookQuizX[", i, ",1] <- nrow(Q", i, "[Q", i,
              "$anon_user_id %in% Q1$anon_user_id,])", sep = "")
  eval(parse(text = Qx))
}
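
If the data frames have to stay as separate objects named Q1, Q2 and so on, the lookup by name can also be done with `get()`, which fetches an object from its name without building and parsing code strings. A sketch using the three example data frames from the question (`Qi` is just an illustrative name):

```r
# Example data from the question
TookQuizX <- matrix(data = NA, nrow = 3, ncol = 1)
Q1 <- data.frame(anon_user_id = c("A123", "A111", "A134", "A156"), other_stuf = 999)
Q2 <- data.frame(anon_user_id = c("A123", "A234", "A111", "A256", "C521"), other_stuf = 999)
Q3 <- data.frame(anon_user_id = c("A123", "A234", "A111", "A356", "B356"), other_stuf = 999)

for (i in 1:3) {
  # Fetch the data frame whose name is "Q1", "Q2", ...
  Qi <- get(paste0("Q", i))
  # Keep the rows whose ID also occurs in Q1, then count them
  TookQuizX[i, 1] <- nrow(Qi[Qi$anon_user_id %in% Q1$anon_user_id, ])
}
```

Only the object name is assembled from a string here; the subsetting and counting stay ordinary R code.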

2 Comments

Thanks! eval(parse(text = Qx)) is new to me, and it will be useful in the future.
@Ignacio No, don't use it. It's a sign of inadequate understanding of the R language and its use usually indicates badly written code.
0

Here's an example that solves a simplified version of what I think you're trying to accomplish.

x1 = 34
x2 = 65
x3 = 87
x4 = 298
x5 = 384
x6 = 234

var.names = sapply(1:6, function(i){
    paste0("x", i)
})

var.values = sapply(var.names, get)

 #x1  x2  x3  x4  x5  x6 
 #34  65  87 298 384 234 
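
A related option is `mget()`, which looks up several names at once and returns a named list, so it also works when the objects are data frames rather than single numbers. A small sketch with three of the variables above (`var.list` is an illustrative name):

```r
x1 <- 34
x2 <- 65
x3 <- 87

# paste0 is vectorised, so the names can be built in one call
var.names <- paste0("x", 1:3)

# mget returns a list with one element per name, named after the variables
var.list <- mget(var.names)
```

The resulting list can then be passed to `lapply` or indexed with `[[`, as in the list-based answer.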
