Background
This routine is used in a package that calculates tree (as in Christmas tree) volumes for various species codes (spcd) and geographic regions. The equation forms and coefficients vary by species and region, so I have a data frame of functions, along with their respective species and region, that calculate the volume from the tree's height (ht) and diameter (dbh).
Data Setup
Note: In my package, this part is handled by other functions; the code here only exists to create a reproducible example (please ignore the sloppiness).
I have a data frame that includes a column of functions, along with some information about "where" to apply those functions in another data frame.
The functions (in reality these are more complex):
func1 <- function(dbh,ht){dbh^2 + ht}
func2 <- function(dbh,ht){dbh^2 - ht}
The data frame (in reality this data frame is much longer):
spcd <- c(122, 122, 141, 141)
region <- c('OR_W', 'OR_E', 'OR_W', 'OR_E')
funcs <- c("func1", "func2", "func1", "func2")
funcs_df <- data.frame(spcd, region, funcs)
Then, I have another data frame that has some information, including the spcd and region that should match the values in funcs_df:
spcd <- c(122, 141, 141, 122, 141, 122)
region <- c('OR_W', 'OR_E', 'OR_W', 'OR_E', 'OR_W', 'OR_W')
dbh <- c(12, 13, 15, 11, 10, 21)
ht <- c(101, 121, 100, 99, 88, 76)
tree_df <- data.frame(spcd, region, dbh, ht)
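Since every spcd/region pair in tree_df is expected to have a row in funcs_df, a quick check can confirm that before any function is applied. This is only a minimal sketch on the toy data above; the names checked and unmatched are hypothetical and not part of the package.

```r
# Toy copies of the two data frames from this example
funcs_df <- data.frame(
  spcd   = c(122, 122, 141, 141),
  region = c("OR_W", "OR_E", "OR_W", "OR_E"),
  funcs  = c("func1", "func2", "func1", "func2")
)
tree_df <- data.frame(
  spcd   = c(122, 141, 141, 122, 141, 122),
  region = c("OR_W", "OR_E", "OR_W", "OR_E", "OR_W", "OR_W"),
  dbh    = c(12, 13, 15, 11, 10, 21),
  ht     = c(101, 121, 100, 99, 88, 76)
)

# Left join: any tree without a matching function ends up with NA in `funcs`
checked <- merge(tree_df, funcs_df, by = c("spcd", "region"), all.x = TRUE)

# Rows that would have no equation to apply (none in this toy data)
unmatched <- checked[is.na(checked$funcs), ]
nrow(unmatched)
```

With real data, any rows in unmatched would point at a species/region combination that has no equation yet.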
Applying the Functions
This is the part I would prefer feedback on.
First, I split the tree_df into distinct groups based on spcd and region so I can apply the functions that correspond to these distinct groups.
tree_split <- split(tree_df, list(tree_df$region, tree_df$spcd))
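For reference, split() already labels each group with its region and spcd, so the group key is available from the list names as well as from the first row. A minimal sketch on the toy data (the variable group_names is hypothetical):

```r
tree_df <- data.frame(
  spcd   = c(122, 141, 141, 122, 141, 122),
  region = c("OR_W", "OR_E", "OR_W", "OR_E", "OR_W", "OR_W"),
  dbh    = c(12, 13, 15, 11, 10, 21),
  ht     = c(101, 121, 100, 99, 88, 76)
)

# Same split as above: one data frame per region/spcd combination
tree_split <- split(tree_df, list(tree_df$region, tree_df$spcd))

# Each element is named "<region>.<spcd>"
group_names <- names(tree_split)
group_names
# "OR_E.122" "OR_W.122" "OR_E.141" "OR_W.141"
```

Note that split() keeps empty combinations by default; passing drop = TRUE removes groups that contain no rows.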
Then, I create an empty data frame to append to.
new_tree <- data.frame()
Next (and this is where things get messy), I loop through each group, grab the top-left cells that act as a "key" to look up the equation in funcs_df, and use mapply on each group (with some conditionals to handle NA values).
for (group in tree_split) {
  # Get the 'group key'
  region <- group$region[1]
  spcd <- group$spcd[1]
  # Get the equation name from funcs_df
  eq <- funcs_df$funcs[which(funcs_df$spcd == spcd & funcs_df$region == region)]
  # Convert func string into actual function
  eq <- eq[[1]]
  eq <- eval(parse(text = eq))
  # Apply the equation to each record in the group
  group$cvts <- mapply(eq, group$dbh, group$ht)
  # Append to new_tree
  new_tree <- rbind(new_tree, group)
}
Discussion
This results in the desired output, with the new cvts column computed by the function assigned to each group in the data frame:
  spcd region dbh  ht cvts
4  122   OR_E  11  99   22
1  122   OR_W  12 101  245
6  122   OR_W  21  76  517
2  141   OR_E  13 121   48
3  141   OR_W  15 100  325
5  141   OR_W  10  88  188
I have a few concerns with this approach:
The old adage "if you write a for-loop you are doing it wrong" seems to apply here. Is there some way I could reduce this for-loop to some sort of apply or mapply type function?
Grabbing the key from a cell (see "# Get the 'group key'" comment above) seems sloppy. Is there a way to get this 'group key' in a more formal fashion?
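To make both concerns concrete, here is one possible loop-free direction, given as a minimal sketch on the toy data rather than a settled solution. It joins the function name onto each tree with merge() and looks the function up in a named list instead of using eval(parse()). The names fun_lookup and merged are hypothetical and do not exist in the package.

```r
func1 <- function(dbh, ht) dbh^2 + ht
func2 <- function(dbh, ht) dbh^2 - ht

# Hypothetical lookup table: function name -> function object
fun_lookup <- list(func1 = func1, func2 = func2)

funcs_df <- data.frame(
  spcd   = c(122, 122, 141, 141),
  region = c("OR_W", "OR_E", "OR_W", "OR_E"),
  funcs  = c("func1", "func2", "func1", "func2")
)
tree_df <- data.frame(
  spcd   = c(122, 141, 141, 122, 141, 122),
  region = c("OR_W", "OR_E", "OR_W", "OR_E", "OR_W", "OR_W"),
  dbh    = c(12, 13, 15, 11, 10, 21),
  ht     = c(101, 121, 100, 99, 88, 76)
)

# The join replaces the manual 'group key' lookup: each tree row
# receives the function name for its spcd/region pair
merged <- merge(tree_df, funcs_df, by = c("spcd", "region"))

# Call the matching function once per row
merged$cvts <- mapply(
  function(f, dbh, ht) fun_lookup[[f]](dbh, ht),
  as.character(merged$funcs), merged$dbh, merged$ht,
  USE.NAMES = FALSE
)
merged
```

This calls each function row by row, as the mapply call in the loop does. Whether it is actually cleaner or faster than the loop on the real, longer data frames is an open question.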
Other advice is, of course, welcome.