
I am brand new to R, so if I'm thinking about this completely wrong, feel free to tell me. I have a series of imported data frames on power plants, one for each year (Plant1987, Plant1988, etc.), that I am ultimately trying to combine into one data frame. Before doing so, I'd like to add a "year" variable to each data frame. I could do this for each data frame individually, but I would like to do it in one step. I know how to do it in Stata, but I'm struggling here.

I was thinking something along the lines of:

for (y in 1987:2008) {
     paste("Plant",y,sep="")$year <- y
}

which doesn't work because paste is obviously not the right function. Is there a smart, quick way to do this? Thanks

  • Store your data.frames in a list; don't create a bunch of variables that have data in their names. Then you can apply functions over that list. paste() creates character vectors. Character vectors are not the same things as names/symbols. Commented May 26, 2017 at 19:34
  • @MrFlick can you help me think a little more about what such a function would look like? I understand the concept of applying a function to the data frames within a list, but I don't understand how I can reference part of the data frame name to make a new variable. Commented May 26, 2017 at 19:40
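
For what it's worth, here is a minimal sketch of what the list approach from the comments might look like when the year has to come out of the object name. The two small data frames and the `output` column are hypothetical stand-ins for the imported data, and `plants` and `plant_names` are names made up for illustration. It assumes every name has the form "Plant" followed by the year.

```r
# Hypothetical stand-ins for the imported data frames
Plant1987 <- data.frame(plantID = 1:2, output = c(10, 20))
Plant1988 <- data.frame(plantID = 1:2, output = c(11, 21))

# Collect the existing objects into a named list
plant_names <- paste0("Plant", 1987:1988)
plants <- mget(plant_names)

# Recover the year from each list name and add it as a column
plants <- Map(function(df, nm) {
  df$year <- as.integer(sub("Plant", "", nm, fixed = TRUE))
  df
}, plants, names(plants))
```

Because `mget()` returns a named list, the names travel with the data frames, so the function passed to `Map()` can read the year from the name rather than relying on position.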

3 Answers

Try this:

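year <- 1987:2008
# Names of the existing data frames: "Plant1987", ..., "Plant2008"
list_object_names <- sprintf("Plant%s", year)

# Look up each data frame by name and collect them in a list
list_DataFrame <- lapply(list_object_names, get)

# Add a Year column to each data frame in the list
for (i in seq_along(list_DataFrame)) {
    list_DataFrame[[i]][, "Year"] <- year[i]
}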

2 Comments

Nice, I like it.
This is great thank you. My friend gave me a solution too, which I'll post in a bit.

Here is some code to give you some ideas. I used the mtcars data frame as an example to create a list with three data frames. After that, I used two solutions to add the year (2000 to 2002) to each data frame. You will need to modify the code for your data.

# Load the mtcars data frame
data(mtcars)

# Create a list with three data frames
ex_list <- list(mtcars, mtcars, mtcars)

# Create a vector with three years: 2000 to 2002
year_list <- 2000:2002

Solution 1: Use lapply from base R

ex_list2 <- lapply(1:3, function(i) {

  dt <- ex_list[[i]]

  dt[["Year"]] <- year_list[[i]]

  return(dt)
})

Solution 2: Use map2 from purrr

library(purrr)    

ex_list3 <- map2(ex_list, year_list, .f = function(dt, year){

  dt$Year <- year

  return(dt)
})

ex_list2 and ex_list3 are the final output.
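
Since the end goal in the question is a single data frame, a possible last step is to stack the list row-wise. The sketch below is only an illustration: it uses two-row slices of mtcars to keep things small, and `with_year` and `combined` are hypothetical names. It assumes all data frames share the same columns.

```r
# Small example list: three two-row slices of mtcars
ex_list <- list(head(mtcars, 2), head(mtcars, 2), head(mtcars, 2))
year_list <- 2000:2002

# Add the Year column to each data frame
with_year <- Map(function(dt, year) {
  dt$Year <- year
  dt
}, ex_list, year_list)

# Stack all data frames row-wise into one
combined <- do.call(rbind, with_year)
```

`do.call(rbind, ...)` passes every element of the list to `rbind()` at once, so it works for any number of years without a loop.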


Let's say you have data.frames

Plant1987 <- data.frame(plantID=1:4, x=rnorm(4))
Plant1988 <- data.frame(plantID=1:4, x=rnorm(4))
Plant1989 <- data.frame(plantID=1:4, x=rnorm(4))

You could put a $year column in each with

year <- 1987:1989
for(yeari in year) {
  eval(parse(text=paste0("Plant",yeari,"$year<-",yeari)))
}

Plant1987
#   plantID           x year
# 1       1  0.67724230 1987
# 2       2 -1.74773250 1987
# 3       3  0.67982621 1987
# 4       4  0.04731677 1987
# ...etc for other years...

...and either bind them together into one data.frame with

df <- Plant1987
for(yeari in year[-1]) {
  df <- rbind(df, eval(parse(text=paste0("Plant",yeari))))
}

df
#    plantID            x year
# 1        1  0.677242300 1987
# 2        2 -1.747732498 1987
# 3        3  0.679826213 1987
# 4        4  0.047316768 1987
# 5        1  1.043299473 1988
# 6        2  0.003758675 1988
# 7        3  0.601255190 1988
# 8        4  0.904374498 1988
# 9        1  0.082030356 1989
# 10       2 -1.409670456 1989
# 11       3 -0.064881722 1989
# 12       4  1.312507736 1989

...or in a list as

itsalist <- list()
for(yeari in year) {
  eval(parse(text=paste0("itsalist$Plant",yeari,"<-Plant",yeari)))
}

itsalist
# $Plant1987
#   plantID           x year
# 1       1  0.67724230 1987
# 2       2 -1.74773250 1987
# 3       3  0.67982621 1987
# 4       4  0.04731677 1987
# 
# $Plant1988
#   plantID           x year
# 1       1 1.043299473 1988
# 2       2 0.003758675 1988
# 3       3 0.601255190 1988
# 4       4 0.904374498 1988
# 
# $Plant1989
#   plantID           x year
# 1       1  0.08203036 1989
# 2       2 -1.40967046 1989
# 3       3 -0.06488172 1989
# 4       4  1.31250774 1989

2 Comments

That's not quite what I was suggesting. If the data were in a list like xx<-list(Plant1987, Plant1988, Plant1989 ), then you could do Map(function(d,y) {d$year<-y; d}, xx, 1987:1989). Best to avoid eval-parse
@MrFlick Thanks for the tip and the reference! Always learning.
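
If the data frames really do have to stay as separate variables, one way to follow the advice about avoiding eval-parse might be `get()` and `assign()`, which look up and set an object by its name as a string. This is just a sketch with two example data frames; the list-based approach from the comment is still the tidier option.

```r
Plant1987 <- data.frame(plantID = 1:4, x = rnorm(4))
Plant1988 <- data.frame(plantID = 1:4, x = rnorm(4))

for (yr in 1987:1988) {
  nm <- paste0("Plant", yr)  # object name as a character string
  df <- get(nm)              # fetch the data frame by name
  df$year <- yr              # add the year column
  assign(nm, df)             # write it back under the same name
}
```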
