
I am trying to apply a function with two arguments. The first argument is a dataframe, the second is an integer that defines a row of the df.

    col_1 <- c("A", "B", "C")
    col_2 <- c("red", "blue", "black")
    df <- data.frame(col_1, col_2)
    f <- function(x, arg1) {
      x[arg1, 1]
      x[arg1, 2]
    }
    apply(df, 1, f)

It looks like the second argument is not passed to the function. Here is the error:

    Error in x[arg1, 1] : incorrect number of dimensions

When I pass arg1=1 like this:

    apply(df, arg1=1, f)

it gives me a FUN error:

    Error in match.fun(FUN) : argument "FUN" is missing, with no default

The desired output is "A" and "red"; in my real code I need to operate on the values of each row.

I also want to add an output variable to be able to save a plot that I am making in my real analysis in a file. Can I just add an "output" variable in function(x, arg1) and then do apply(df, arg1=1, f, output="output_file")?

  • Can I ask what precisely is the intended behavior of f()? As it stands, f() itself will return only the value x[arg1, 2]: the value of the final statement in the function (in lieu of a return() statement). Furthermore, this seems to misuse the apply() function. If you want to simply subset rows and columns of your df, subscripting is the way to go: df[1, ] or df[1, 1:2] or df[1, c("col_1", "col_2")]; or more generally df[vector_of_row_indices_or_names, vector_of_column_indices_or_names]. In this Commented Jun 24, 2021 at 19:53
  • this particular function is supposed to return the values of the arg1 row. In my real analysis, the function makes more sense, it builds plot using values of each row. But this is where I have a problem, in getting those values. I am trying to understand why apply() doesn't pass arg1 or doesn't see function. Commented Jun 24, 2021 at 20:06

1 Answer


As @Greg mentions, the purpose of this code isn't clear. However, the question seems to relate to how apply() works so here goes:

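Basically, when any of the apply family of functions is used, the user-entered function (f(), in this case) is applied to each piece of the data that apply() extracts, not to the whole object. Here, apply(df, 1, f) converts df to a character matrix and calls f() once per row, so the first argument to f() is a character vector rather than the data frame your function expects. Indexing that vector with two subscripts, as in x[arg1, 1], is what triggers the "incorrect number of dimensions" error.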

Here's some functioning code:

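    col_1 <- c("A", "B", "C")
    col_2 <- c("red", "blue", "black")
    df <- data.frame(col_1, col_2)
    # x is one row of df, passed in as a character vector
    f <- function(x) {
      x[1]  # evaluated but discarded
      x[2]  # last expression, so this is the return value
    }
    apply(df, 1, f)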

This returns the values of the second column as a character vector: f() returns its last expression, x[2], which for each row is the value in the second column.
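To check what f() actually receives, and to get both values back per row, something like the following sketch may help. The helper name describe_row and the variables row_info and both_values are made up for illustration; they are not part of the original code.

```r
col_1 <- c("A", "B", "C")
col_2 <- c("red", "blue", "black")
df <- data.frame(col_1, col_2)

# Hypothetical helper: report the class and length of what apply() passes in.
describe_row <- function(x) {
  paste(class(x), length(x))
}

# One entry per row; each row arrives as a character vector of length 2.
row_info <- apply(df, 1, describe_row)
print(row_info)

# Returning both elements keeps both values instead of only x[2].
# The result is a matrix with one column per row of df.
both_values <- apply(df, 1, function(x) c(x[1], x[2]))
print(both_values)
```

Here both_values[, 1] holds the values from the first row of df, since apply() stacks each row's result as a column.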

If you want the arg1 row of results, you could simply use the following:

    find_row <- function(df, row) {
      df[row, ]
    }
    find_row(df, 1)

apply() isn't required here. Subsetting the data frame directly is simpler to read and should be faster too.
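On the question of extra arguments: the signature is apply(X, MARGIN, FUN, ...), and anything apply() does not recognise is forwarded to FUN through `...`. In apply(df, arg1=1, f), arg1 is absorbed by `...`, f then fills MARGIN by position, and FUN is left empty, which matches the "FUN is missing" error. A minimal sketch of passing an extra argument while still supplying MARGIN follows; label_row and prefix are hypothetical names standing in for a real function and something like an output file name.

```r
col_1 <- c("A", "B", "C")
col_2 <- c("red", "blue", "black")
df <- data.frame(col_1, col_2)

# Hypothetical function: x is one row (a character vector), and `prefix`
# stands in for any extra argument, such as an output file name.
label_row <- function(x, prefix) {
  paste0(prefix, "_", x[1], "_", x[2])
}

# X, MARGIN and FUN are matched first; prefix = "plot" travels through `...`.
labels <- apply(df, 1, label_row, prefix = "plot")
print(labels)
```

The same pattern should extend to several extra arguments, as long as MARGIN and FUN are still supplied.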


1 Comment

thank you for a thorough explanation. This is exactly what I was looking for
