When I use this with an atomic vector, it works:

x = data.frame(myvar = 1:10)

test_atom = function(var, maxvalue = max(var)) { 
    return(maxvalue)
}

test_atom(x$myvar)
# [1] 10

But when I try to evaluate a column in a data frame, there is a problem:

test_df = function(data, var, maxvalue = max(var)) { 
    params = as.list(match.call()[-1])
    data$var = eval(params$var, data)
    return(maxvalue)
}

test_df(x, myvar)
# Error in test_df(x, myvar) : object 'myvar' not found

Note however that the following works ok, so evaluation seems fine:

test_df2 = function(data, var, maxvalue = max(var)) { 
    params = as.list(match.call()[-1])
    data$var = eval(params$var, data)
    return(data$var)
}

test_df2(x, myvar)
# [1]  1  2  3  4  5  6  7  8  9 10

How do I properly evaluate the argument so that it detects the maximum value of x$myvar?

EDIT: To spell out my intention: I want to be able to set maxvalue manually, but if I leave it blank, it should default to the maximum value of myvar. This could be achieved with a conditional statement inside the function that checks whether the argument is NULL and, if so, sets maxvalue to the maximum of myvar. However, I wanted to do it in a simpler way, i.e. not within the function body.

Example - both should be possible:

test_df(x, myvar, 5) # I set the value of the last argument manually
test_df(x, myvar) # I leave it blank - and it sets itself to the max value of `myvar`
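
For reference, the conditional approach mentioned in the edit could be sketched roughly as follows. This is only an illustration: the name test_df_null is hypothetical, and it assumes the column is passed unquoted and captured with substitute().

```r
# Sketch only: test_df_null is a hypothetical name, not from the question.
x <- data.frame(myvar = 1:10)

test_df_null <- function(data, var, maxvalue = NULL) {
  # Capture the unevaluated argument and look it up in the data frame first,
  # falling back to the caller's environment for anything not found there.
  col <- eval(substitute(var), data, parent.frame())
  # Use the column maximum only when no value was supplied.
  if (is.null(maxvalue)) maxvalue <- max(col)
  maxvalue
}

test_df_null(x, myvar)     # 10
test_df_null(x, myvar, 5)  # 5
```

This works, but it moves the default out of the function signature, which is what the question is trying to avoid.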
  • Try test_df(x, x$myvar). This returns 10. Commented Jan 13, 2016 at 8:34
  • @symbolrush Of course. But I want this function to accept a data frame as its first argument and then individual variables without using the $ operator. This format is customary in many R functions, such as subset(x, myvar > 5). Commented Jan 13, 2016 at 8:40

1 Answer

You can't use a variable with $ subsetting; you have to tackle it a little differently.

test_df3 = function(data, var, maxvalue = max(data[,var])) { 
  return(maxvalue)
}

Pass data (your data frame) and subset it by var inside the call to max; otherwise max has no way of knowing which column you want the maximum of.

If var is a column name, it has to be a string; otherwise R will try, when the argument is evaluated, to find a variable or object with that name and pass its value to the function.

This gives:

> test_df3(x,'myvar')
[1] 10
> test_df3(x,'myvar',5)
[1] 5
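
If unquoted column names are wanted, one possible sketch relies on R evaluating default arguments lazily, inside the function's own environment. In the question's test_df, the line data$var = ... adds a column to data but leaves the local var untouched, so max(var) still tries to look up myvar in the caller's environment. Overwriting the local var before maxvalue is first used avoids that. The name test_df4 is hypothetical and this is not the only way to do it.

```r
# Sketch only: test_df4 is a hypothetical name, not from the answer above.
x <- data.frame(myvar = 1:10)

test_df4 <- function(data, var, maxvalue = max(var)) {
  # Replace the local `var` with the column values. substitute() returns the
  # unevaluated argument (here the symbol myvar), which eval() then looks up
  # in `data` first and in the caller's environment second.
  var <- eval(substitute(var), data, parent.frame())
  # The default max(var) is only evaluated here, so it sees the new `var`.
  maxvalue
}

test_df4(x, myvar)     # 10
test_df4(x, myvar, 5)  # 5
```

The default stays in the signature, at the cost of one line in the body that must run before maxvalue is touched.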

4 Comments

Interesting. Actually, my intention (which I will add to my question) was that this would allow the user to either set the value manually, or, if she leaves the argument blank, it would set itself to max value of myvar. I know I could write some conditional statement within the function to check whether the argument is NULL and if yes, set it to max value of myvar, but I thought there would be a simpler way.
@jakub I'll wait for your update to the question. Please add some example calls with expected output so it's clear what you wish to achieve.
well, I could do test_df5 = function(data, var, maxvalue = max(data[, deparse(substitute(var))])) {return(maxvalue)} which would allow me to use unquoted variable name. But it looks ugly... isn't there a better way?
@jakub I don't think so, and the deparse trick prevents you from using a numeric index such as test_df3(x, 1).
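
To illustrate the trade-off raised in the last two comments, here is a rough sketch of a variant that accepts an unquoted name, a quoted name, or a numeric index. The names test_df6 and col are hypothetical, and the rule "treat a bare symbol as a column name only if it matches a column" is an assumption made for this example.

```r
# Sketch only: test_df6 and col are hypothetical names.
x <- data.frame(myvar = 1:10)

test_df6 <- function(data, var, maxvalue = max(col)) {
  expr <- substitute(var)  # the argument as written by the caller
  col <- if (is.symbol(expr) && as.character(expr) %in% names(data)) {
    # Bare symbol that matches a column name: use that column.
    data[[as.character(expr)]]
  } else {
    # Anything else (string, number, variable holding either) is evaluated
    # in the caller's environment and used as a column selector.
    data[[eval(expr, parent.frame())]]
  }
  # The default max(col) is evaluated lazily, after `col` exists.
  maxvalue
}

test_df6(x, myvar)     # 10
test_df6(x, "myvar")   # 10
test_df6(x, 1)         # 10
test_df6(x, myvar, 5)  # 5
```

The ambiguity is deliberate here: a bare name that matches a column always wins over a variable of the same name in the calling environment.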
