0

How could I do something like this but in an optimal (vectorized) way in R?

N=10000

f <- 1.005
S0 <- 100
p <- 1/10

n <- seq(3,N)
S <- c(f*S0, f^2*S0, f^3*S0)
P <- c(0, 0, p*(f-1)*f^2*S0)
for(i in n){
  R <- tail(S,1)-tail(P,1)
  S <- c(S, f*R)
  P <- c(P, p*(f-1)*R)
}

the final desired output being of course S and P (all the way up to row N+1). This computes a sequential time series row by row (each row is a function of the previous row values, above row 3).

I tried to use lapply but it's difficult to get a function to return two changes in the global environment... (and the resulting table is also badly formatted)

2
  • Could you describe in words what your code is doing? Commented Jun 17, 2020 at 14:25
  • @GregorThomas done Commented Jun 17, 2020 at 14:28

1 Answer 1

2

The simplest step to speed up your code is to pre-allocate the vectors. Start S and P at their final lengths, rather than "growing" them each iteration of the loop. This results in a more than 100x speed-up of your code:

N <- 10000
f <- 1.005
S0 <- 100
p <- 1/10

original = function(N, f, S0, p) {
  n <- seq(3,N)
  S <- c(f*S0, f^2*S0, f^3*S0)
  P <- c(0, 0, p*(f-1)*f^2*S0)
  for(i in n){
    R <- tail(S,1)-tail(P,1)
    S <- c(S, f*R)
    P <- c(P, p*(f-1)*R)
  }
  return(list(S, P))
}

pre_allocated = function(N, f, S0, p) {
  n <- seq(3,N)
  S <- c(f*S0, f^2*S0, f^3*S0, rep(NA, N - 3))
  P <- c(0, 0, p*(f-1)*f^2*S0, rep(NA, N - 3))
  for(i in n){
    R <- S[i] - P[i]
    S[i + 1] <- f*R
    P[i + 1] <- p*(f-1)*R
  }
  return(list(S, P))
}


## Check that we get the same result
identical(original(N, f, S0, p), pre_allocated(N, f, S0, p))
# [1] TRUE

## See how fast it is
microbenchmark::microbenchmark(original(N, f, S0, p), pre_allocated(N, f, S0, p), times = 10)
# Unit: milliseconds
#                        expr      min       lq      mean    median       uq      max neval
#       original(N, f, S0, p) 414.3610 419.9241 441.26030 426.01610 454.6002 538.0523    10
#  pre_allocated(N, f, S0, p)   2.3306   2.6478   2.92908   3.05785   3.1198   3.2885    10

It's possible that a vectorized solution, perhaps using a function like cumprod, would be even faster, but I don't see a clear way to do it. If you can write out your result mathematically as a cumulative sum or product, that would make it clearer and possibly reveal a solution.

Sign up to request clarification or add additional context in comments.

1 Comment

Thank you. I'm afraid I doubt the general formula for nth term exists, though I have asked: math.stackexchange.com/questions/3723391/…

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.