I'm trying to create a variable, Var, with 50,000 observations that takes the value 0 60% of the time and 1 otherwise. For a normal distribution, I remember doing the following to set n:

Var <- rnorm(50000, 0, 1)

Is there a way I could combine an ifelse command with the above to specify n as well as the probability of Var being 0?

3 Answers

I would use rbinom like this:

n_ <- 50000
p_ <- 0.4 # probability of a 1

Var <- rbinom(n=n_, size=1, prob=p_)

By using variables, you can change the sample size and/or the probability just by changing their values. Hope that's what you are looking for.
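As a quick sanity check on the call above, here is a minimal sketch that inspects the result and also shows an ifelse-based variant of the kind the question asks about. The seed value and the name Var_ifelse are assumptions made for illustration; they are not part of the original answer.

```r
set.seed(42)   # arbitrary seed (an assumption), only to make the draw reproducible

n_ <- 50000
p_ <- 0.4      # probability of a 1

Var <- rbinom(n = n_, size = 1, prob = p_)

length(Var)       # number of observations
table(Var) / n_   # observed proportions of 0s and 1s, near 0.6 and 0.4
mean(Var)         # observed share of 1s, near p_

# Hypothetical ifelse variant: a uniform draw below 0.6 becomes 0, otherwise 1
Var_ifelse <- ifelse(runif(n_) < 1 - p_, 0, 1)
mean(Var_ifelse == 0)   # near 0.6
```

Both vectors are random, so the observed proportions will be close to, but usually not exactly, 0.6 and 0.4.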

If by 60% you mean a probability equal to 0.6 (rather than an empirical frequency), then

Var <- sample(0:1, 50000, prob = c(6, 4), replace = TRUE)

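gives the desired sequence of independent Bernoulli realizations, each equal to 0 with probability 0.6 and to 1 with probability 0.4. (sample() rescales the prob weights so that they sum to 1, so c(6, 4) is equivalent to c(0.6, 0.4).)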
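If instead 60% is meant as an exact empirical frequency, one possible approach is to build a vector with exactly the right counts and shuffle it. This is only a sketch of that alternative reading; the names n, n_zero and Var_exact are assumptions introduced here.

```r
n <- 50000
n_zero <- round(0.6 * n)   # number of 0s needed for an exact 60% share

# Build 30000 zeros followed by 20000 ones, then shuffle;
# sample() without a size argument returns a random permutation of its input
Var_exact <- sample(rep(c(0, 1), times = c(n_zero, n - n_zero)))

table(Var_exact)   # counts are fixed by construction; only the order is random
```

Here the observations are no longer independent, since the counts are fixed in advance.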

I'm picking nits here, but it actually isn't completely clear exactly what you want.

Do you want to simulate a sample of 50000 from the distribution you describe?

Or, do you want 50000 replications of simulating an observation from the distribution you describe?

These are different things that, in my opinion, should be approached differently.

To simulate a sample of size 50000 from that distribution you would use:

sample(c(0,1), size = 50000, prob = c(0.6, 0.4), replace = TRUE)

To replicate 50000 simulations of sampling from the distribution you describe I would recommend:

replicate(50000, sample(c(0,1), size = 1, prob = c(0.6, 0.4)))

This might seem silly, since in this case both lines of code produce the same kind of result: 50000 independent draws from the same distribution.

But suppose your goal was to investigate properties of samples of size 50000? Then you would run a bunch (say, 1000) of replications of that first line of code above, wrapped inside replicate:

replicate(1000, sample(c(0,1), size = 50000, prob = c(0.6, 0.4), replace = TRUE))

I hope I haven't been too pedantic about this. Having seen simulations go awry it has become my belief that one should keep separate the thing being simulated from the number of simulations you decide to do. The former is fundamental to your problem, while the latter only affects the accuracy of the simulation study and how long it takes.
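To illustrate keeping the two numbers separate, here is a minimal sketch that summarises each simulated sample inside replicate instead of storing every draw. The names n_obs, n_sims and prop_ones, the seed, and the choice of the proportion of 1s as the statistic are all assumptions made for this example.

```r
set.seed(1)      # arbitrary seed (an assumption), for reproducibility

n_obs  <- 50000  # size of each sample: part of the problem being simulated
n_sims <- 1000   # number of replications: only affects accuracy and run time

# For each replication, draw one sample and keep only its proportion of 1s
prop_ones <- replicate(
  n_sims,
  mean(sample(c(0, 1), size = n_obs, prob = c(0.6, 0.4), replace = TRUE))
)

mean(prop_ones)             # centre of the simulated proportions, near 0.4
sd(prop_ones)               # spread of the proportion across samples
sqrt(0.4 * 0.6 / n_obs)     # theoretical standard error, for comparison
```

Changing n_sims alters only how precisely the spread is estimated, while changing n_obs alters the quantity being studied.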
