Is it possible to create and assign a name to an object "by reference"? For example, I have a large data.frame and I need to do some basic operations to some of the columns in it. I put the columns, grouping and operations I need to do in lists:

exec_group_list = c("nbhd", "state", "use")
exec_var_list   = c("land", "imp", "assmt", "landp", "impp", "assmtp")
exec_func_list  = c("sum", "mean", "median", "max", "min", "sd")

So the "land" column will be grouped by "nbhd", and then "sum", "mean", "median", etc. will be applied to it. Then the same will be done to the "imp" column, and so on. Then I will repeat the same, but this time grouping by "state"... rinse, lather, repeat, as follows:

for (eachg in exec_group_list){
  group_by_field = eachg
  group_by = eval(parse(text=paste("sales$",group_by_field)))
  group_by_lst = list(group_by)
  print(paste("Grouping by:", eachg))
  #CREATE DATA.FRAME FOR GROUP HERE
  for (eachv in exec_var_list){
    var = eval(parse(text=paste("sales$",eachv)))
    print(paste("On column:", eachv))
    for (eachf in exec_func_list){
      print(paste("Calculating:", eachf))
      tempt = (aggregate(var, group_by_lst, eachf))
      colnames(tempt) = c(eachg, paste(eachv,".",eachf, sep=""))
      print(tempt)
      #APPEND COLUMNS TO GROUP DATA.FRAME
    }
  }
}

I figured out how to use references from a list using eval() so I can loop thru the grouping list and the column list and do the same operations using the values in the list.

But I'd like to store the info in a data.frame named after the grouping field. So for example, if I am grouping by "nbhd" I'd like to create an empty data.frame named "by_nbhd".

I tried something similar to eval(parse(text=paste("by_","nbhd", sep=""))) = data.frame("nbhd"=NA) but I get an error.

Does anyone know if this is possible? Any help will be appreciated. Thank you in advance.

  • Use package data.table instead. Commented Sep 15, 2013 at 15:50
  • Use Map or mapply, which take multiple arguments. Commented Sep 15, 2013 at 16:12
  • Thanks. I'll look those up. I managed to create data.frames using assign() as shown here link Commented Sep 15, 2013 at 16:22
  • You are really going down the wrong path for R. Instead of creating multiple named dataframes, you should instead be creating one dataframe with multiple columns. Describe your file structure and get help using read.table. Commented Sep 15, 2013 at 16:50
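
For readers following the assign() route mentioned in the comments, here is a minimal sketch of what it might look like, next to a named-list alternative that avoids creating variables from strings. The toy `sales` data and the `by_group` list name are assumptions made for illustration; only the "by_" prefix comes from the question.

```r
# Toy data standing in for the real `sales` data frame (values are made up).
sales <- data.frame(
  nbhd  = c("A", "A", "B"),
  state = c("NY", "NJ", "NY"),
  land  = c(10, 20, 30)
)
exec_group_list <- c("nbhd", "state")

# Option 1: assign() creates a variable whose name is built from a string,
# e.g. by_nbhd and by_state, each holding the distinct group values.
for (eachg in exec_group_list) {
  assign(paste0("by_", eachg), unique(sales[eachg]))
}

# Option 2: keep the data frames together in one named list (hypothetical
# name `by_group`), so they can be looped over without get()/assign().
by_group <- lapply(exec_group_list, function(g) unique(sales[g]))
names(by_group) <- paste0("by_", exec_group_list)

print(by_nbhd)
print(by_group[["by_state"]])
```

The list version tends to be easier to work with later, since `by_group[["by_nbhd"]]` can be indexed with a computed name directly.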

1 Answer

Rather than asking about "creating an object by reference", which brings up all sorts of extraneous associations with the distinction between "calling by value" and "calling by reference", you should be asking for help on "computing on/with the language". Presumably you have a dataset (which you have not described very well) with a set of columns named "nbhd", "state", and "use", and also columns named "land", "imp", "assmt", "landp", "impp", and "assmtp". You want to serially examine 6 kinds of summary statistics for each of the 6 numeric columns of the second group, within the categories of each of the 3 columns of the first group (3 x 6 x 6 = 108 tables).

Write a prototype of a function that delivers one summary table for a particular function, a particular numeric column, and a particular categorical column.

 tabfn <- function(dfrm, numcol, catcol, fn){
                         tapply(dfrm[[numcol]], dfrm[[catcol]], fn) }
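
As a quick check of how the prototype behaves, here is a small sketch that calls it on made-up data. The toy `sales` data frame is an assumption; the question does not show the real data.

```r
# The prototype from the answer, repeated so this example is self-contained.
tabfn <- function(dfrm, numcol, catcol, fn) {
  tapply(dfrm[[numcol]], dfrm[[catcol]], fn)
}

# Toy data (made up): two neighbourhoods with two sales each.
sales <- data.frame(
  nbhd = c("A", "A", "B", "B"),
  land = c(10, 20, 30, 50)
)

# Column names are passed as strings and the function as an object,
# so no eval(parse()) is needed.
res <- tabfn(sales, "land", "nbhd", sum)
print(res)
```

The result is a named array with one entry per category of the grouping column.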

It's easiest to create a list of first-class functions rather than calling eval(parse(text = ...)) on character strings:

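# A named list lets us print each function's name and still call the function itself.
exec_func_list = list(sum = sum, mean = mean, median = median,
                      max = max, min = min, sd = sd)
for (eachg in exec_group_list){
  print(paste("Grouping by:", eachg))
  for (eachv in exec_var_list){
    print(paste("On column:", eachv))
    for (eachf in names(exec_func_list)){
      print(paste("Calculating:", eachf))
      # dfrm is the data frame (called `sales` in the question);
      # pass the current column names, not the whole lists.
      print(tabfn(dfrm, eachv, eachg, exec_func_list[[eachf]]))
    }
  }
}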
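
The question also asks for one data frame per grouping field, with a column per variable and statistic. One way this might be done is sketched below; the helper name `summarise_by`, the list name `by_group`, and the toy data are all hypothetical, and the column naming ("land.sum" and so on) follows the question's paste() call.

```r
# Toy data (made up) standing in for `sales`.
sales <- data.frame(
  nbhd = c("A", "A", "B", "B"),
  use  = c("res", "com", "res", "res"),
  land = c(10, 20, 30, 50),
  imp  = c(1, 3, 5, 7)
)
exec_group_list <- c("nbhd", "use")
exec_var_list   <- c("land", "imp")
exec_func_list  <- list(sum = sum, mean = mean)

# Hypothetical helper: one row per category of `catcol`,
# one column per (numeric column, function) pair.
summarise_by <- function(dfrm, catcol, numcols, fns) {
  groups <- sort(unique(as.character(dfrm[[catcol]])))
  out <- data.frame(groups, stringsAsFactors = FALSE)
  names(out) <- catcol
  for (v in numcols) {
    for (f in names(fns)) {
      stat <- tapply(dfrm[[v]], dfrm[[catcol]], fns[[f]])
      # Index by name so rows line up with `groups` regardless of order.
      out[[paste(v, f, sep = ".")]] <- as.vector(stat[groups])
    }
  }
  out
}

# One summary data frame per grouping column, kept in a named list.
by_group <- lapply(exec_group_list, function(g) {
  summarise_by(sales, g, exec_var_list, exec_func_list)
})
names(by_group) <- paste0("by_", exec_group_list)

print(by_group[["by_nbhd"]])
```

Keeping the results in a named list means `by_group[["by_nbhd"]]` plays the role of the `by_nbhd` variable the question wanted, without assign().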

Unfortunately this is mostly untested guesswork since you have not produced a minimal reproducible example.


2 Comments

I know it's been a while. I used your suggestion and it worked great! calcDF <- function(DFsource,VARname,BYname,FUNname){ pVARname = eval(parse(text=paste(DFsource,"$",VARname,sep=""))) pBYname = eval(parse(text=paste(DFsource,"$",BYname, sep=""))) calcDF = data.frame(aggregate(pVARname,list(pBYname),FUNname)) return(calcDF)} Thanks for the help.
I would have avoided: eval(parse(text=paste(DFsource,"$",VARname,sep=""))). That's what dfrm[[VARname]] was supposed to do. Much safer.
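
A sketch of what that safer version of calcDF might look like is below. Note one deliberate change from the comment above: this variant takes the data frame itself as its first argument rather than its name as a string. The parameter names `dfrm` and `fn`, the column renaming, and the toy data are assumptions for illustration.

```r
# Variant of calcDF that indexes columns with [[ ]] instead of eval(parse()).
calcDF <- function(dfrm, VARname, BYname, fn) {
  out <- aggregate(dfrm[[VARname]], list(dfrm[[BYname]]), fn)
  # aggregate() names the columns Group.1 and x; use the real names instead.
  names(out) <- c(BYname, VARname)
  out
}

# Toy data (made up).
sales <- data.frame(
  nbhd = c("A", "A", "B"),
  land = c(10, 20, 30)
)

res <- calcDF(sales, "land", "nbhd", mean)
print(res)
```

Because nothing is parsed from text, a column name containing spaces or other special characters cannot break the call or be evaluated as code.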
