My data set currently looks like this:

Contract number    FA       NAAR     q
CM300              9746     47000    0.5010
UL350              80000    0        0.01234
RAD3421            50000    10000    0.9431

I would like to add a column (called trial) holding a randomly generated number between 0 and 1 for each row, then compare that number with the value in column q, with another column saying 'l' if q < trial and 'd' if q > trial.

This is my code, which accomplishes the task once.

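library(dplyr)

# One uniform draw per row
trial <- runif(nrow(data), min = 0, max = 1)
data2 <- mutate(data, trial = trial)
# Build on data2 (not data) so the trial column is kept
data2 <- mutate(data2, qresult = ifelse(q <= trial, 'l', 'd'))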

My struggle is getting this to repeat for several trials, adding new columns to the table with each repetition. I have tried several types of loops and looked through several questions, but cannot figure it out. I am fairly new to R, so any help would be appreciated!

  • How many trials will you need to add? Are you looking to create a summary from all of the trials? Is adding all of those columns necessary, or are you just looking to perform a simulation? Commented Jun 2, 2017 at 15:19
  • Thank you for your feedback. I am trying to perform a simulation. I want to see which rows are marked 'd', and calculate the total FA and total NAAR fields for those rows. I am trying to run many trials in order to see how these values change. Is there a better way to do this? Commented Jun 2, 2017 at 17:08

2 Answers

8

One way to approach this is a for loop that adds a trial column and a qresult column on each pass:

df <- data.frame(contract = c("CM300", "UL350", "RAD3421"), 
                 FA = c(9746, 80000, 50000), 
                 NAAR = c(47000, 0, 10000), 
                 q = c(0.5010, 0.01234, 0.9431))

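trialmax <- 10
for (i in seq_len(trialmax)) {
  # Draw one uniform random number per row
  trial <- runif(nrow(df), min = 0, max = 1)
  df[, paste0("trial", i)]   <- trial
  # "l" if the draw is at least q, otherwise "d"
  df[, paste0("qresult", i)] <- ifelse(trial >= df$q, "l", "d")
}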

Here I assumed you want 10 trials, but you can change trialmax to whatever you want.
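
Since the comments say the real goal is a simulation (total FA and total NAAR over the rows marked 'd' in each trial), the wide trial columns may not be needed at all. The following is only a sketch of that idea, not part of the original answer: the `totals` data frame, its column names `FA_d` and `NAAR_d`, and the fixed seed are assumptions made for illustration.

```r
# Same example data as in the answer above
df <- data.frame(contract = c("CM300", "UL350", "RAD3421"),
                 FA = c(9746, 80000, 50000),
                 NAAR = c(47000, 0, 10000),
                 q = c(0.5010, 0.01234, 0.9431))

set.seed(42)     # assumption: fixed seed so the run can be repeated
trialmax <- 10

# One row of totals per trial (hypothetical column names)
totals <- data.frame(trial = seq_len(trialmax),
                     FA_d = NA_real_,
                     NAAR_d = NA_real_)

for (i in seq_len(trialmax)) {
  trial <- runif(nrow(df))
  is_d <- trial < df$q                     # a row is "d" when its draw falls below q
  totals$FA_d[i]   <- sum(df$FA[is_d])     # total FA over the "d" rows
  totals$NAAR_d[i] <- sum(df$NAAR[is_d])   # total NAAR over the "d" rows
}

totals
```

Each row of `totals` then describes one trial, so something like `summary(totals$FA_d)` shows how the totals vary across trials.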

0

I'd keep things in a separate matrix for efficiency, only binding them on at the end. In fact, using vector recycling, this can be done very efficiently:

n_trials = 20
# One column per trial, one row per row of data
trials = matrix(runif(n_trials * nrow(data)), ncol = n_trials)
# data$q is recycled down each column; "l" when trial >= q, otherwise "d"
q_result = matrix(c("d", "l")[(trials >= data$q) + 1], ncol = n_trials)
colnames(trials) = paste0("trial", seq_len(n_trials))
colnames(q_result) = paste0("qresult", seq_len(n_trials))
data = cbind(data, trials, q_result)
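
For the simulation described in the comments (total FA and NAAR over the 'd' rows in each trial), the same matrix idea can produce the per-trial totals directly, without binding any columns onto the data. This is a rough sketch under stated assumptions: the example data is copied from the question, and `is_d`, `fa_d`, `naar_d`, the seed, and the trial count are illustrative choices.

```r
# Example data from the question
data <- data.frame(contract = c("CM300", "UL350", "RAD3421"),
                   FA = c(9746, 80000, 50000),
                   NAAR = c(47000, 0, 10000),
                   q = c(0.5010, 0.01234, 0.9431))

set.seed(1)        # assumption: fixed seed for repeatability
n_trials <- 1000   # assumption: any trial count works

# Rows are contracts, columns are trials
trials <- matrix(runif(n_trials * nrow(data)), ncol = n_trials)

# TRUE where a row would be marked "d" (draw below q);
# data$q is recycled down each column
is_d <- trials < data$q

# Multiplying by FA / NAAR also recycles down columns, so each
# column sum is the total over that trial's "d" rows
fa_d   <- colSums(is_d * data$FA)
naar_d <- colSums(is_d * data$NAAR)

summary(fa_d)
summary(naar_d)
```

Both `fa_d` and `naar_d` have one element per trial, so they can be passed straight to `hist()`, `quantile()`, or similar to study how the totals vary.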
