I have a df that has three columns and several rows. I am trying to build a list with the loop below, but it fails at the step where I need to store more than one result per list element. Can somebody point out where I am going wrong?

Here is a subset of my df:

       Gene      kaks chr
1 Bra011025 0.5909820 A01
2 Bra011027 0.3684600 A01
3 Bra011028 0.2126320 A01
4 Bra011030 0.0910217 A01
5 Bra011033 0.2412330 A01
6 Bra011034 0.1092790 A01 

And here is my loop

results <- list()
chro <- c("A01", "A02", "A03", "A04", "A05", "A06", "A07", "A08", "A09", "A10")

for(i in 1:10) {
  for(j in c("A01", "A02", "A03", "A04", "A05", "A06", "A07", "A08", "A09", "A10")) {
      simulated.index <- sample(1:nrow(allData),sum(allData$chr==j))
      simulated.kaks <- allData$kaks[simulated.index]
      simulatedNot.kaks <- allData$kaks[-simulated.index]
      results[[j]] <- mean(simulated.kaks)-mean(simulatedNot.kaks)
  } 
}

The output contains only one value per chromosome, instead of ten:

> head(results)
$A01
[1] 0.003432181

$A02
[1] -0.03501376

$A03
[1] -0.0003581717

$A04
[1] -0.01792963

$A05
[1] -0.01241799

$A06
[1] 0.002551261
  • Is results[[j]] <- mean(simulated.kaks)-mean(simulated.kaks) supposed to be results[[j]] <- mean(simulated.kaks)-mean(simulatedNot.kaks)? Commented May 12, 2014 at 23:13
  • I corrected my mistake and edited my question. Can you help me with this now? Commented May 12, 2014 at 23:21
  • What is the question? It looks like you have a difference of means for each of your j values. Are these results not correct? Commented May 12, 2014 at 23:27
  • My question is that I am trying to loop from 1 to 10 but I am getting only one value. I guess each result is being overwritten by the next one. Commented May 12, 2014 at 23:28
  • I got this error: Error in *tmp*[[i]] : subscript out of bounds Commented May 12, 2014 at 23:34

2 Answers

You can run the simulation for each chr value j using the lapply function, with an inner sapply over the 10 repetitions (the example data below has more than one chr value):

allData <- read.table(text="       Gene      kaks chr
1 Bra011025 0.5909820 A01
2 Bra011027 0.3684600 A01
3 Bra011028 0.2126320 A01
4 Bra011030 0.0910217 A02
5 Bra011033 0.2412330 A02
6 Bra011034 0.1092790 A02", header=TRUE)
# Outer lapply: one list element per chromosome, named by setNames
setNames(lapply(unique(allData$chr), function(j) {
  # Inner sapply: 10 simulations, collected into one numeric vector
  sapply(1:10, function(i) {
      # Draw as many random rows as chromosome j has genes
      simulated.index <- sample(1:nrow(allData),sum(allData$chr==j))
      simulated.kaks <- allData$kaks[simulated.index]
      simulatedNot.kaks <- allData$kaks[-simulated.index]
      # Difference of means between sampled and non-sampled rows
      mean(simulated.kaks)-mean(simulatedNot.kaks)
  })
}), unique(allData$chr))
# $A01
#  [1]  0.15869543 -0.16243990 -0.08979343  0.15869543 -0.07762190 -0.07072610
#  [7] -0.05855457 -0.26258077 -0.15869543 -0.07072610
# 
# $A02
#  [1]  0.15869543 -0.07762190  0.01034743  0.17461143 -0.24351343  0.24351343
#  [7]  0.24351343 -0.17461143  0.05855457  0.17461143
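
For comparison, here is a minimal sketch of how the original nested for loop could be repaired instead of being replaced by lapply/sapply. In the question, results[[j]] is assigned a single number on every pass of the outer loop, so each pass overwrites the previous value. The sketch pre-allocates one numeric vector per chromosome and writes into slot i. The names n_sims and chromosomes are introduced here for illustration, and the data frame is the six-row toy example from this answer, not the full data set.

```r
# Toy data: the six example rows, split across two chromosomes (illustration only).
allData <- data.frame(
  Gene = c("Bra011025", "Bra011027", "Bra011028",
           "Bra011030", "Bra011033", "Bra011034"),
  kaks = c(0.5909820, 0.3684600, 0.2126320, 0.0910217, 0.2412330, 0.1092790),
  chr  = c("A01", "A01", "A01", "A02", "A02", "A02"),
  stringsAsFactors = FALSE
)

n_sims <- 10                          # number of simulations per chromosome
chromosomes <- unique(allData$chr)    # "A01", "A02" in this toy data

# One pre-allocated numeric vector per chromosome, so every simulation
# has its own slot instead of overwriting a single value.
results <- setNames(
  lapply(chromosomes, function(j) numeric(n_sims)),
  chromosomes
)

for (j in chromosomes) {
  n_j <- sum(allData$chr == j)        # number of genes on chromosome j
  for (i in seq_len(n_sims)) {
    simulated.index <- sample(nrow(allData), n_j)
    # Store the i-th simulated difference of means for chromosome j
    results[[j]][i] <- mean(allData$kaks[simulated.index]) -
      mean(allData$kaks[-simulated.index])
  }
}

str(results)   # a list with one numeric vector of length n_sims per chromosome
```

The values differ from run to run because sample() is random; calling set.seed() beforehand would make them reproducible.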

1 Comment

It worked great...wonderful use of lapply and sapply. Thanks..@josilber

I think you have to make two lists.

results_A <- list()
results_B <- list()
for(i in 1:5) {
  for(j in 1:5) {
    results_A[[j]] <- i*j
  }
  results_B[[i]] <- results_A
}

results_B

1 Comment

The OP is running multiple simulations for each j value, so the best output structure is probably a list (indexed by j) containing vectors of simulated differences of means. "List of list of single values" is harder to process.
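
To illustrate the point about output structure, here is a small sketch comparing the two layouts. The numbers are hypothetical stand-ins generated with rnorm, not real simulation output, and the names sims and nested are made up for this example.

```r
set.seed(42)  # arbitrary seed, only so the toy numbers are reproducible

# Hypothetical simulated differences of means: a list of numeric vectors.
sims <- list(A01 = rnorm(10), A02 = rnorm(10))

# With a list of vectors, a summary per chromosome is a single call.
means_flat <- sapply(sims, mean)

# The same numbers stored as a list of lists of single values.
nested <- lapply(sims, as.list)

# Here each inner list has to be flattened with unlist() before summarising.
means_nested <- sapply(nested, function(x) mean(unlist(x)))

# Both layouts hold the same information; only the access cost differs.
stopifnot(isTRUE(all.equal(means_flat, means_nested)))
```

The extra unlist() step is needed for every summary computed on the nested layout, which is why a list of vectors is usually more convenient here.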
