Apply function with multiple parameters and output variable

Question

I am trying to apply a function with two arguments. The first argument is a dataframe, the second is an integer that defines a row of the df.

    col_1 <- c("A", "B", "C")
    col_2 <- c("red", "blue", "black")
    df <- data.frame(col_1, col_2)
    f <- function(x, arg1) {
      x[arg1, 1]
      x[arg1, 2]
    }
    apply(df, 1, f)

Looks like the second argument is not passed to the function. Here is the error:

    Error in x[arg1, 1] : incorrect number of dimensions

When I pass arg1=1 like this:

    apply(df, arg1=1, f)

it gives me a FUN error:

    Error in match.fun(FUN) : argument "FUN" is missing, with no default

The desired output is "A" and "red"; in my real code I need to operate on the values of each row.

I also want to add an output variable to be able to save a plot that I am making in my real analysis in a file. Can I just add an "output" variable in function(x, arg1) and then do apply(df, arg1=1, f, output="output_file")?

Can I ask what precisely is the intended behavior of f()? As it stands, f() itself will return only the value x[arg1, 2]: the value of the final statement in the function (in lieu of a return() statement). Furthermore, this seems to misuse the apply() function. If you want to simply subset rows and columns of your df, subscripting is the way to go: df[1, ] or df[1, 1:2] or df[1, c("col_1", "col_2")]; or more generally df[vector_of_row_indices_or_names, vector_of_column_indices_or_names]. In this
– Greg, Commented Jun 24, 2021 at 19:53
This particular function is supposed to return the values of the arg1 row. In my real analysis the function makes more sense: it builds a plot using the values of each row. But this is where I have a problem: getting those values. I am trying to understand why apply() doesn't pass arg1 or doesn't see the function.
– Yulia Kentieva, Commented Jun 24, 2021 at 20:06

Jay Achar · Accepted Answer · 2021-06-24 21:29:18Z

As @Greg mentions, the purpose of this code isn't clear. However, the question seems to relate to how apply() works so here goes:

Basically, when any of the apply family of functions is used, the user-supplied function (f(), in this case) is applied to each subset of the data that apply() produces. Here, apply(df, 1, f) converts the data frame to a matrix and calls f() once per row, so the first argument to f() is a character vector holding that row's values rather than the data frame your function expects. That is why the two-index subscript x[arg1, 1] fails with "incorrect number of dimensions".
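To make this concrete, here is a small sketch that inspects what apply() hands to the function for each row. The variable names (seen_class, has_no_dims, failed) are made up for illustration and are not part of the original question.

```r
df <- data.frame(col_1 = c("A", "B", "C"),
                 col_2 = c("red", "blue", "black"))

# What does the function receive for each row?
seen_class  <- apply(df, 1, function(x) class(x))         # class of each row
has_no_dims <- apply(df, 1, function(x) is.null(dim(x)))  # vectors have no dim

# A two-index subscript on a plain vector is an error,
# which is what x[arg1, 1] runs into inside f().
x <- c(col_1 = "A", col_2 = "red")
failed <- inherits(try(x[1, 1], silent = TRUE), "try-error")

print(seen_class)
print(has_no_dims)
print(failed)
```

Each row arrives as a character vector without dimensions, so single-index subscripts such as x[1] or x[["col_1"]] are the ones that work inside the function.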

Here's some functioning code:

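    col_1 <- c("A", "B", "C")
    col_2 <- c("red", "blue", "black")
    df <- data.frame(col_1, col_2)
    # x is a single row, passed in as a vector
    f <- function(x) {
      x[1]  # evaluated, but the result is discarded
      x[2]  # last expression, so this is the return value
    }
    apply(df, 1, f)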

This returns the values of the second column as a vector: x[2] is the last expression evaluated in f(), so it is the function's return value, and for each row x[2] is that row's second-column value. The x[1] line is evaluated but its result is discarded.
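If both values of each row are needed, they have to be combined into a single return value. The following is a minimal sketch of one way to do that; f_both is a hypothetical name, and pasting the values together is just one possible choice.

```r
df <- data.frame(col_1 = c("A", "B", "C"),
                 col_2 = c("red", "blue", "black"))

# Use both values of the row in the returned result so neither is discarded.
f_both <- function(x) {
  paste(x[["col_1"]], x[["col_2"]])
}

combined <- apply(df, 1, f_both)
print(combined)  # "A red" "B blue" "C black"
```

Indexing by name (x[["col_1"]]) works because each row vector keeps the column names of the data frame.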

If you want the arg1 row of results, you could simply use the following:

    # Return the requested row of df as a one-row data frame
    find_row <- function(df, row) {
      df[row, ]
    }
    find_row(df, 1)

apply() isn't required. Using a single function makes the code simpler to read and should be faster too.
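On the second part of the question: extra arguments can be forwarded to the function through the `...` of apply(X, MARGIN, FUN, ...), as long as MARGIN and FUN are still supplied. In apply(df, arg1=1, f), the name arg1 matches none of apply()'s own parameters, so it lands in `...`, f is matched to MARGIN, and FUN is left missing. The sketch below assumes a hypothetical plot_row() function and an out_dir argument, writes to a temporary directory, and uses pdf() only as an example device.

```r
df <- data.frame(col_1 = c("A", "B", "C"),
                 col_2 = c("red", "blue", "black"))

# Hypothetical per-row function: x is one row (a named character vector),
# out_dir is an extra argument supplied through apply()'s `...`.
plot_row <- function(x, out_dir) {
  path <- file.path(out_dir, paste0(x[["col_1"]], ".pdf"))
  pdf(path)             # open a file device
  on.exit(dev.off())    # close it even if plotting fails
  plot(1, 1, col = x[["col_2"]], pch = 19, main = x[["col_1"]])
  path                  # return the name of the file that was written
}

out_dir <- tempdir()

# MARGIN (1) and FUN (plot_row) are given positionally;
# out_dir = ... is forwarded unchanged to every call of plot_row().
paths <- apply(df, 1, plot_row, out_dir = out_dir)
print(basename(paths))
```

Returning the file path from the function is a design choice made here so that the caller gets back a vector of the files created, one per row.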

1 Comment

Yulia Kentieva Over a year ago

Thank you for a thorough explanation. This is exactly what I was looking for.
