I'd like to analyse time-series data along multiple rolling windows. As a first step, I'd like to extract the start and end times of said windows. Sadly, zoo::rollapply seems to be missing a crucial parameter, namely the overlap, which seems to be fixed at 1?

Example:

library(zoo)
a <- c(0:10)
output <- rollapply(a, 4, print)

output:

      [,1] [,2] [,3] [,4]
[1,]    0    1    2    3
[2,]    1    2    3    4
[3,]    2    3    4    5
[4,]    3    4    5    6
[5,]    4    5    6    7
[6,]    5    6    7    8
[7,]    6    7    8    9
[8,]    7    8    9   10

desired output (having an overlap of 2, for example):

     [,1] [,2] [,3] [,4]
[1,]    0    1    2    3
[2,]    2    3    4    5
[3,]    4    5    6    7
[4,]    6    7    8    9
[5,]    8    9   10

Any idea how to get there?

  • row 5 column 4 is then NaN? Commented Mar 12 at 15:11
  • You can use by = 2. Or is that not what you expect? Commented Mar 12 at 15:13
  • by = 2 sounds very good! Thanks! I did overlook it... I'll try later Commented Mar 12 at 15:17

1 Answer

Combining by=2 (advance two elements per row) with a few NAs appended at the end does it:

> rollapply(c(v, NA, NA), width=4, FUN=`c`, by=2)
     [,1] [,2] [,3] [,4]
[1,]    0    1    2    3
[2,]    2    3    4    5
[3,]    4    5    6    7
[4,]    6    7    8    9
[5,]    8    9   10   NA

(given that v <- 0:10).
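Since the question asks for the start and end of each window, here is a minimal base-R sketch of that step. It assumes left-aligned windows that begin at the first element, which is what the output above shows. `window_bounds` is a hypothetical helper name, not part of zoo.

```r
# Hypothetical helper: start/end positions of every complete window
# of `width` elements, advancing `by` elements per step.
window_bounds <- function(n, width, by) {
  if (n < width) {
    # no complete window fits
    return(data.frame(start = integer(0), end = integer(0)))
  }
  starts <- seq(1, n - width + 1, by = by)
  data.frame(start = starts, end = starts + width - 1)
}

v <- 0:10
b <- window_bounds(length(v), width = 4, by = 2)

# first and last value of each complete window
bounds <- data.frame(first = v[b$start], last = v[b$end])
print(bounds)
```

Passing the length of the padded vector instead, e.g. `window_bounds(length(v) + 2, 4, 2)`, also yields the fifth, partly padded window (positions 9 to 12).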

I thought fill=NA would do it, but it does not.

In c(v, NA, NA), the number of NAs should presumably equal by.

If so, you can use these functions:

fill_up <- function(vec, by, default=NA) c(vec, rep(default, by))

rolling_window <- function(vec, width, by, FUN=`c`, default=NA) {
  rollapply(fill_up(vec, by, default=default), width=width, by=by, FUN=FUN)
}

And call it by:

rolling_window(vec=v, width=4, by=2)
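Padding with `by` NAs works for the example, but it can add more NAs than strictly needed (a single NA already completes the last window here). As a hedged alternative, the sketch below computes the smallest padding that completes the last window. `pad_to_window` is a hypothetical helper, and the sketch assumes `by <= width` and windows that start at the first element, as in the output above.

```r
# Hypothetical helper: append just enough `default` values so that the
# last window (size `width`, step `by`) reaching the final element is complete.
pad_to_window <- function(vec, width, by, default = NA) {
  stopifnot(by >= 1, by <= width)   # assumption: overlapping or adjacent windows
  n <- length(vec)
  # smallest number of windows k with (k - 1) * by + width >= n
  k <- max(0, ceiling((n - width) / by)) + 1
  n_pad <- (k - 1) * by + width - n
  c(vec, rep(default, n_pad))
}

v <- 0:10
padded <- pad_to_window(v, width = 4, by = 2)
n_added <- length(padded) - length(v)   # number of NAs appended

# Guarded so the snippet also runs where zoo is not installed.
if (requireNamespace("zoo", quietly = TRUE)) {
  print(zoo::rollapply(padded, width = 4, by = 2, FUN = c))
}
```

If the vector already ends exactly on a window boundary, nothing is appended.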

Value-Window

If you want a window defined by values instead of by a number of elements, the output will be a list of vectors, because the number of elements per group/"row" is no longer constant.

I would write a new function for that.

require(zoo)

v <- c(100, 200, 300, 500, 600, 900, 1000, 1200, 1300, 1500)

rolling_absolute_window <- function(vec, window_width=300) {
  start_value <- vec[[1]]
  end_value <- vec[[length(vec)]]
  limits <- seq(from=start_value, to=end_value, by=window_width)
  limits_df <- rollapply(c(limits, Inf), width=2, FUN=`c`, by=1)
  lapply(as.data.frame(t(limits_df)), function(pair) {
    vec[pair[1] <= vec & vec <= pair[2]]
  })
}

rolling_absolute_window(v)

## this would return:

$V1
[1] 100 200 300

$V2
[1] 500 600

$V3
[1]  900 1000

$V4
[1] 1000 1200 1300

$V5
[1] 1300 1500

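# Variant with half-open windows: a value that sits exactly on an upper
# limit is left out of that window and only appears in the next one.
rolling_absolute_window_excluding_last <- function(vec, window_width=300) {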
  start_value <- vec[[1]]
  end_value <- vec[[length(vec)]]
  limits <- seq(from=start_value, to=end_value, by=window_width)
  limits_df <- rollapply(c(limits, Inf), width=2, FUN=`c`, by=1)
  lapply(as.data.frame(t(limits_df)), function(pair) {
    vec[pair[1] <= vec & vec < pair[2]]   # excludes last
  })
}

rolling_absolute_window_excluding_last(v)

## this returns:

$V1
[1] 100 200 300

$V2
[1] 500 600

$V3
[1] 900

$V4
[1] 1000 1200

$V5
[1] 1300 1500
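The half-open variant puts every value into exactly one window, so it can also be written without rollapply. The following is a sketch of a swapped-in technique (integer division plus base R's `split`), not the code above. It assumes `vec` is numeric and sorted ascending, so that windows are anchored at the first value, and `split_by_value_window` is a hypothetical name. Unlike the function above, it omits windows that contain no values.

```r
v <- c(100, 200, 300, 500, 600, 900, 1000, 1200, 1300, 1500)

# Swapped-in technique: assign each value to bin k, covering
# [start + (k - 1) * width, start + k * width), then split by bin number.
# Assumes `vec` is sorted ascending, so vec[1] is the smallest value.
split_by_value_window <- function(vec, window_width = 300) {
  bin <- (vec - vec[1]) %/% window_width + 1
  split(vec, bin)
}

windows <- split_by_value_window(v)
print(windows)
```

For the example vector this gives the same five groups as rolling_absolute_window_excluding_last(v), named "1" to "5" instead of V1 to V5.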


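# Sliding variant: one window per element, starting at that element's value.
# Note: this redefines rolling_window() from above with a different signature.
rolling_window <- function(vec, window_width=300) {
  result <- list()
  for (i in seq_along(vec)) {
    result[[i]] <- vec[ vec[i] <= vec & vec <= vec[i] + window_width]
  }
  result[-length(vec)]   # drop the window that starts at the last element
}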

rolling_window(v)

## this would return:

[[1]]
[1] 100 200 300

[[2]]
[1] 200 300 500

[[3]]
[1] 300 500 600

[[4]]
[1] 500 600

[[5]]
[1] 600 900

[[6]]
[1]  900 1000 1200

[[7]]
[1] 1000 1200 1300

[[8]]
[1] 1200 1300 1500

[[9]]
[1] 1300 1500



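# Same sliding variant, but the upper limit itself is excluded.
rolling_window_excluding_last <- function(vec, window_width=300) {
  result <- list()
  for (i in seq_along(vec)) {
    result[[i]] <- vec[ vec[i] <= vec & vec < vec[i] + window_width]
  }
  result[-length(vec)]   # drop the window that starts at the last element
}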

rolling_window_excluding_last(v)

## which returns:

[[1]]
[1] 100 200 300

[[2]]
[1] 200 300

[[3]]
[1] 300 500

[[4]]
[1] 500 600

[[5]]
[1] 600

[[6]]
[1]  900 1000

[[7]]
[1] 1000 1200

[[8]]
[1] 1200 1300

[[9]]
[1] 1300 1500


10 Comments

Is it generalizable? I'm confused by c(v, NA, NA).
I guess the number of NAs should be the number of by?
Works great! But I am wondering, what if the values in the vector that I am cutting rolling windows from mean something? To clarify: let's say a <- c(100, 200, 300, 500, 600, 900, 1000, 1200, 1300, 1500) and I'd like to make the rolling window 300 wide, i.e. width would not be steps in the vector, but a fixed distance between values, so that the result would be (100, 200, 300), (200,500), (500, 600), (600,900), (900, 1200)
In that case, you can't use rollapply with a fixed width - that is only for windows with a constant number of elements. You then need a function that works through the vector step by step.
@Beres, you need to define more precisely what 300 wide means, but if we assume that for each i = 1, 2, 3, ... we want the x values that satisfy a[i]-300 < x <= a[i] then w <- seq_along(a) - findInterval(a - 300, a); rollapplyr(a, w, toString) . There are examples of this in ?rollapply
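To make the last comment concrete, here is a small sketch of the variable-width idea. `findInterval` (base R) counts, for each element, how many elements are at or below `a[i] - 300`. Subtracting that count from the position gives one window width per element. Widths computed this way select the values with `a[i] - 300 < x <= a[i]`, so the left end is excluded. The `rollapplyr` call is guarded so the snippet also runs where zoo is not installed, and the base-R `lapply` line is only an illustration of what the widths mean.

```r
a <- c(100, 200, 300, 500, 600, 900, 1000, 1200, 1300, 1500)

# findInterval(a - 300, a)[i] is the number of elements <= a[i] - 300,
# so w[i] is the number of elements x with a[i] - 300 < x <= a[i].
w <- seq_along(a) - findInterval(a - 300, a)
print(w)

# Base-R view of the same windows: the w[i] elements ending at position i.
windows <- lapply(seq_along(a), function(i) a[(i - w[i] + 1):i])

if (requireNamespace("zoo", quietly = TRUE)) {
  # rollapplyr() takes one width per element (right-aligned windows)
  print(zoo::rollapplyr(a, w, toString))
}
```

This requires `a` to be sorted in ascending order, which `findInterval` expects of its second argument.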