I'm trying to write a function in R that takes a list of data frames, performs operations on various columns, and returns a data frame of the results, with each column named after the data frame it was computed from. A simplified example is as follows:
df1 <- data.frame(
  a = c(1, 2, 3),
  b = c(2, 3, 4),
  c = c(5, 6, 7))
df2 <- data.frame(
  a = c(9, 8, 7),
  b = c(5, 1, 1),
  c = c(6, 6, 7))
myfunct <- function(listofdfs){
  results_df <- data.frame(rownames = 'meanA', 'maxA', 'maxB', 'sum')
  for (i in 1:length(listofdfs)) {
    mymean <- mean(listofdfs[i]$a)
    mymaxA <- max(listofdfs[i]$a)
    mymaxB <- max(listofdfs[i]$b)
    mysum <- mymean + mymaxA
    newcol <- c(mymean, mymaxA, mymaxB, mysum)
    results_df[, ncol(results_df) + 1] <- newcol
    colnames(results_df)[ncol(results_df)] <- listofdfs[i]
  }
  results_df
}
Ideally, calling
myfunct(list(df1, df2))
would give this output:
| | df1 | df2 |
|---|---|---|
| meanA | 2 | 8 |
| maxA | 3 | 9 |
| maxB | 4 | 5 |
| sum | 5 | 17 |
I get errors every time I try to make it work. Right now the error says that the replacement has 4 rows and the data has 1.
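To narrow down where that message likely comes from, here is a small sketch that isolates two lines of the function. It assumes the error is raised by the column assignment inside the loop; the names `err` and `dfs` are made up for this illustration.

```r
# Each unnamed string becomes its own column, giving 1 row and 4 columns.
results_df <- data.frame(rownames = 'meanA', 'maxA', 'maxB', 'sum')
dim(results_df)

# Adding a length-4 column to a 1-row data frame fails; capture the message.
err <- tryCatch({
  results_df[, ncol(results_df) + 1] <- c(2, 3, 4, 5)
  "no error"
}, error = function(e) conditionMessage(e))
err

# Single brackets return a one-element list, so $a finds nothing there.
dfs <- list(data.frame(a = c(1, 2, 3)))
is.null(dfs[1]$a)   # TRUE
dfs[[1]]$a          # double brackets return the data frame itself
```

So the row labels never become rows, and `listofdfs[i]` would need to be `listofdfs[[i]]` for the column lookups to see the data.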
Is there a better way to build this type of function than with a for loop? The real function I'm building is more complex than just taking the mean, max, and sum of a few numbers, but this dummy example should get the point across.
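One possible loop-free shape, offered as a minimal sketch and not the only way to do it: compute a named vector per data frame and let `sapply()` bind the vectors as columns. The helper names `summarise_one` and `summarise_all` are hypothetical, and the sketch assumes the list is passed with names (`list(df1 = df1, df2 = df2)`), since `list(df1, df2)` does not carry the variable names along.

```r
df1 <- data.frame(a = c(1, 2, 3), b = c(2, 3, 4), c = c(5, 6, 7))
df2 <- data.frame(a = c(9, 8, 7), b = c(5, 1, 1), c = c(6, 6, 7))

# Hypothetical helper: summarise one data frame as a named numeric vector.
summarise_one <- function(df) {
  mean_a <- mean(df$a)
  max_a  <- max(df$a)
  c(meanA = mean_a, maxA = max_a, maxB = max(df$b), sum = mean_a + max_a)
}

# Hypothetical wrapper: one column per list element, one row per statistic.
summarise_all <- function(listofdfs) {
  # sapply() returns a matrix whose columns take the names of the list.
  as.data.frame(sapply(listofdfs, summarise_one))
}

results <- summarise_all(list(df1 = df1, df2 = df2))
results
```

The per-data-frame logic lives in one small function, so a more complex real calculation only changes `summarise_one`, as long as it keeps returning a vector of the same length and names for every data frame.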