For loop vectorized

Question

How could I do something like this but in an optimal (vectorized) way in R?

N=10000

f <- 1.005
S0 <- 100
p <- 1/10

n <- seq(3,N)
S <- c(f*S0, f^2*S0, f^3*S0)
P <- c(0, 0, p*(f-1)*f^2*S0)
for(i in n){
  R <- tail(S,1)-tail(P,1)
  S <- c(S, f*R)
  P <- c(P, p*(f-1)*R)
}

the final desired output being of course S and P (all the way up to row N+1). This computes a sequential time series row by row (each row is a function of the previous row values, above row 3).

I tried to use lapply but it's difficult to get a function to return two changes in the global environment... (and the resulting table is also badly formatted)

Could you describe in words what your code is doing?

Gregor Thomas
– Gregor Thomas

2020-06-17 14:25:41 +00:00
Commented Jun 17, 2020 at 14:25 — Gregor Thomas
– Gregor Thomas, Commented Jun 17, 2020 at 14:25
@GregorThomas done

Mobeus Zoom
– Mobeus Zoom

2020-06-17 14:28:14 +00:00
Commented Jun 17, 2020 at 14:28 — Mobeus Zoom
– Mobeus Zoom, Commented Jun 17, 2020 at 14:28

Gregor Thomas · Accepted Answer · 2020-06-17 14:37:22Z

2

The simplest step to speed up your code is to pre-allocate the vectors. Start S and P at their final lengths, rather than "growing" them each iteration of the loop. This results in a more than 100x speed-up of your code:

N <- 10000
f <- 1.005
S0 <- 100
p <- 1/10

original = function(N, f, S0, p) {
  n <- seq(3,N)
  S <- c(f*S0, f^2*S0, f^3*S0)
  P <- c(0, 0, p*(f-1)*f^2*S0)
  for(i in n){
    R <- tail(S,1)-tail(P,1)
    S <- c(S, f*R)
    P <- c(P, p*(f-1)*R)
  }
  return(list(S, P))
}

pre_allocated = function(N, f, S0, p) {
  n <- seq(3,N)
  S <- c(f*S0, f^2*S0, f^3*S0, rep(NA, N - 3))
  P <- c(0, 0, p*(f-1)*f^2*S0, rep(NA, N - 3))
  for(i in n){
    R <- S[i] - P[i]
    S[i + 1] <- f*R
    P[i + 1] <- p*(f-1)*R
  }
  return(list(S, P))
}


## Check that we get the same result
identical(original(N, f, S0, p), pre_allocated(N, f, S0, p))
# [1] TRUE

## See how fast it is
microbenchmark::microbenchmark(original(N, f, S0, p), pre_allocated(N, f, S0, p), times = 10)
# Unit: milliseconds
#                        expr      min       lq      mean    median       uq      max neval
#       original(N, f, S0, p) 414.3610 419.9241 441.26030 426.01610 454.6002 538.0523    10
#  pre_allocated(N, f, S0, p)   2.3306   2.6478   2.92908   3.05785   3.1198   3.2885    10

It's possible that a vectorized solution, perhaps using a function like cumprod, would be even faster, but I don't see a clear way to do it. If you can write out your result mathematically as a cumulative sum or product, that would make it clearer and possibly reveal a solution.

answered Jun 17, 2020 at 14:37

Gregor Thomas

147k22 gold badges185 silver badges320 bronze badges

Sign up to request clarification or add additional context in comments.

1 Comment

Mobeus Zoom Over a year ago

Thank you. I'm afraid I doubt the general formula for nth term exists, though I have asked: math.stackexchange.com/questions/3723391/…

Collectives™ on Stack Overflow

For loop vectorized

1 Answer 1

1 Comment

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

1 Comment

Your Answer

Sign up or log in

Post as a guest

Related