I'm trying to create a vector of dates (formatted as character strings not as dates) using a for loop. I've reviewed a few other SO questions such as (How to create a vector of character strings using a loop?), but they weren't helpful. I've created the following for loop:

start_dates <- c("1993-12-01")
j <- 1
start_dates <- for(i in 1994:as.numeric(format(Sys.Date(), "%Y"))){
                   date <- sprintf("%s-01-01", i)
                   j <- j + 1
                   start_dates[j] <- date  
               }

However, it returns a NULL (empty) vector start_dates. When I increment the i index manually it works. For example:

> years <- 1994:as.numeric(format(Sys.Date(), "%Y"))
> start_dates <- c("1993-12-01")
> j <- 1
> i <- years[1]
> date <- sprintf("%s-01-01", i)
> j <- j + 1
> start_dates[j] <- date
> start_dates
[1] "1993-12-01" "1994-01-01"
> i <- years[2]
> date <- sprintf("%s-01-01", i)
> j <- j + 1
> start_dates[j] <- date
> start_dates
[1] "1993-12-01" "1994-01-01" "1995-01-01"

It must have something to do with the construction of my for() statement, but I can't figure it out. I'm sure it's super simple. Thanks in advance.

2 Answers

What is wrong with:

sprintf("%s-01-01", 1994:2015)

> sprintf("%s-01-01", 1994:2015)
 [1] "1994-01-01" "1995-01-01" "1996-01-01" "1997-01-01" "1998-01-01"
 [6] "1999-01-01" "2000-01-01" "2001-01-01" "2002-01-01" "2003-01-01"
[11] "2004-01-01" "2005-01-01" "2006-01-01" "2007-01-01" "2008-01-01"
[16] "2009-01-01" "2010-01-01" "2011-01-01" "2012-01-01" "2013-01-01"
[21] "2014-01-01" "2015-01-01"

sprintf() is fully vectorised, take advantage of this.
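
As a small illustration of that vectorisation, sprintf() also recycles shorter arguments against longer ones. The year and months below are made up for the example and are not from the question:

```r
# Hypothetical values: one year, three months.
# The length-1 year is recycled to match the length-3 months vector.
months <- 1:3
first_of_month <- sprintf("%d-%02d-01", 2015L, months)
first_of_month
# [1] "2015-01-01" "2015-02-01" "2015-03-01"
```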

Problems with your loop

The main problem is that you are assigning the value returned by for() to start_dates once the loop has finished, which overwrites all the hard work your loop did. This is effectively what is happening:

j <- 1
foo <- for (i in 1:10) {
  j <- j + 1
}
foo

> foo
NULL

And reading ?'for' we see that this behaviour is by design:

Value:

     ....

     ‘for’, ‘while’ and ‘repeat’ return ‘NULL’ invisibly.

Solution: Don't assign the returned value of for(). Hence the template might be:

for(i in foo) {
  # ... do stuff
  start_dates[j] <- bar
}

Fix that and you still have a problem; j will be 2 by the time you assign the first date to the output as you start with j <- 1 and increment it before assigning in the loop.

This would be easier if you made i take values from a sequence 1, 2, ..., n rather than the actual years you want. You can use i to index the years vector and as an index for the elements of start_dates too.

Not that you should do the loop this way, but, if you wanted to...

years <- seq.int(1994, 2015)
start_dates <- character(length = length(years))
for (i in seq_along(years)) {
  start_dates[i] <- sprintf("%s-01-01", years[i])
}

which would give:

> start_dates
 [1] "1994-01-01" "1995-01-01" "1996-01-01" "1997-01-01" "1998-01-01"
 [6] "1999-01-01" "2000-01-01" "2001-01-01" "2002-01-01" "2003-01-01"
[11] "2004-01-01" "2005-01-01" "2006-01-01" "2007-01-01" "2008-01-01"
[16] "2009-01-01" "2010-01-01" "2011-01-01" "2012-01-01" "2013-01-01"
[21] "2014-01-01" "2015-01-01"

Sometimes it is helpful to loop over the actual values in a vector (as you did) rather than its indices (as I just did), but only in specific cases. For general operations like you have here, it is just an additional complication you need to work around. That said, think about doing vectorised operations in R before resorting to a loop.
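
For the original question, a loop-free version might look like the sketch below. It assumes the goal is the fixed "1993-12-01" followed by 1 January of every year from 1994 to the current year; current_year is a name introduced here for illustration.

```r
# Assumption: the first element is the fixed date from the question,
# followed by one entry per year up to the current year.
current_year <- as.integer(format(Sys.Date(), "%Y"))
start_dates <- c("1993-12-01", sprintf("%d-01-01", 1994:current_year))
head(start_dates, 3)
# [1] "1993-12-01" "1994-01-01" "1995-01-01"
```

The result is still a character vector, as the question asks, and no index bookkeeping is needed.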

2 Comments

Great suggestion. Made it harder than it needed to be. @LyzandeR addresses directly my programming mistake, but yours is a better solution for my particular problem. Thanks.
I've also noticed the real error; see my edit. The main thing is: don't assign the result of the for() call to start_dates, as this just wipes out everything you did while the loop was running.

You shouldn't assign the loop to a variable. Do:

start_dates <- c("1993-12-01")
j <- 1
for(i in 1994:as.numeric(format(Sys.Date(), "%Y"))){ #use the for-loop on its own. Don't assign it to a variable
  date <- sprintf("%s-01-01", i )
  j <- j + 1
  start_dates[j] <- date  
}

and you are fine:

> start_dates
 [1] "1993-12-01" "1994-01-01" "1995-01-01" "1996-01-01" "1997-01-01" "1998-01-01" "1999-01-01" "2000-01-01" "2001-01-01"
[10] "2002-01-01" "2003-01-01" "2004-01-01" "2005-01-01" "2006-01-01" "2007-01-01" "2008-01-01" "2009-01-01" "2010-01-01"
[19] "2011-01-01" "2012-01-01" "2013-01-01" "2014-01-01" "2015-01-01"

4 Comments

Note j will still be off by 1 because the first iteration will use j = 2 due to incrementing j before the assignment is done. Setting j <- 0 outside the loop or swapping the last two lines of the loop code would rectify this.
@GavinSimpson I think this is done on purpose by the OP because he wants the value c("1993-12-01") to be the first element of his list unless I am mistaken.
I see; well, there is another problem then: never grow objects in an R loop :-) (and yes, I missed the initial assignment, sorry.)
@GavinSimpson That's ok. I wasn't sure about it either. Yours is an excellent and informative answer anyway.
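
On the point about not growing objects in a loop: if a loop is kept, one possible approach is to preallocate the result and fill it by index. This is only a sketch under the assumption that "1993-12-01" should stay as the first element; the index name k is introduced here for illustration.

```r
# Preallocate the full character vector up front instead of growing it.
years <- 1994:as.numeric(format(Sys.Date(), "%Y"))
start_dates <- character(length(years) + 1)  # one extra slot for the fixed first date
start_dates[1] <- "1993-12-01"

# The loop is not assigned to anything; it only fills the existing vector.
for (k in seq_along(years)) {
  start_dates[k + 1] <- sprintf("%s-01-01", years[k])
}
```

Because the vector already has its final length, R does not need to copy and extend it on each iteration.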
