I have the following function (updated with an example):

myfun <- function(DT, var){  
  for(i in 1:length(var)){
    s = substitute(!(is.na(x) | is.nan(x)), list(x=as.symbol(eval(var[i]))))
    DT = DT[eval(s)]
  }
  return(DT)
}

input:

> dt = data.table(id=c(1,2,3,4,5), x=c(1,2,NA,4,5), y=c(1,NA,3,4,NA))
> dt
   id  x  y
1:  1  1  1
2:  2  2 NA
3:  3 NA  3
4:  4  4  4
5:  5  5 NA

runs:

> myfun(dt, var=c("x", "y"))
   id x y
1:  1 1 1
2:  4 4 4
> myfun(dt, var=c("x"))
   id x  y
1:  1 1  1
2:  2 2 NA
3:  4 4  4
4:  5 5 NA

var is a character vector of column names in DT. The goal is to keep only the rows of DT that have no NA or NaN in any of the columns named in var.

I do not want the for loop. I want to construct a single query s containing all the conditions and then evaluate it once against DT. For the first case I want:

s = !(is.na(x) | is.nan(x) | is.na(y) | is.nan(y))

and for the second case I want:

s = !(is.na(x) | is.nan(x))

How can I construct the query s dynamically and run it just once as the i (where) argument of the data.table?

More generally, how can I create a dynamic expression based on input? Using expression(paste()) did not help. If I had such an expression, I could then use substitute.

  • Maybe give us example input and its associated expected output? Commented Sep 4, 2013 at 13:56
  • @Frank: Updated with an example. Commented Sep 4, 2013 at 14:07
  • Never mind. I found it in: stackoverflow.com/questions/11677424/… str=paste0("is.na(",var,") |", " is.nan(",var,")", collapse="|"); s = parse(text=paste("!(",str,")")); dt[eval(s)] Commented Sep 4, 2013 at 14:27
  • You should probably move your answer to the "Your Answer" box below. Answering your own question is actually encouraged: stackoverflow.com/help/self-answer Commented Sep 4, 2013 at 14:34
  • OK, I have added it. I need to understand more about parse. Commented Sep 4, 2013 at 14:41

1 Answer

Ans:

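var = c("x","y")
# One "is.na(col) | is.nan(col)" string per column, joined with "|"
str = paste0("is.na(",var,") |", " is.nan(",var,")", collapse="|")
# Wrap the conditions in !( ... ) and parse the text into an expression
s = parse(text=paste("!(",str,")"))
# Evaluate the expression once as the i argument
DT[eval(s)]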

source: How to use an unknown number of key columns in a data.table
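
An alternative sketch, not part of the original answer: the same condition can be assembled as a call object instead of pasted text, which avoids parse(). The helper name build_filter is hypothetical, and the example uses a plain data.frame so that it runs with base R alone. With a data.table, the resulting s should be usable as DT[eval(s)] in the same way as above.

```r
# build_filter is a hypothetical helper, not a data.table function.
build_filter <- function(vars) {
  # One `is.na(col) | is.nan(col)` call per column name.
  terms <- lapply(vars, function(v) {
    col <- as.name(v)
    bquote(is.na(.(col)) | is.nan(.(col)))
  })
  # Chain the per-column terms with `|`, then negate the whole thing.
  any_missing <- Reduce(function(a, b) call("|", a, b), terms)
  call("!", call("(", any_missing))
}

# Same data as in the question, as a base data.frame.
df <- data.frame(id = c(1, 2, 3, 4, 5),
                 x  = c(1, 2, NA, 4, 5),
                 y  = c(1, NA, 3, 4, NA))

s <- build_filter(c("x", "y"))
df[eval(s, df), ]   # keeps the rows with id 1 and 4
```

Note that is.na() already returns TRUE for NaN, so the is.nan() term is kept only to mirror the query in the question.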

4 Comments

+1 This works as long as "s" isn't a column name. If "s" might ever be a column name in future, use ".s" instead of "s", or build the entire query (including the "DT[" bit) and eval that as a whole.
@MatthewDowle - Thanks! That is a very important point! Do you have any pointers about the detailed structure of an expression in R? There is some mention of it in the Writing R Extensions doc, but any detailed info would be appreciated.
NP. Not sure what you mean by detailed structure. .Internal(inspect(expression(1+2+3))) ?
In 5.11 Evaluating R expressions from C (cran.r-project.org/doc/manuals/…), it says when constructing an expression: "There are three steps: the call is constructed as a pairlist of length 3, the list is filled in, and the expression represented by the pairlist is evaluated." The example then shows the list being filled in. It is kind of vague for beginners, but the .Internal with inspect gives me some hope :) Thanks!
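
As a rough illustration of the structure discussed in these comments, the sketch below pokes at a quoted expression from the R level using only base R. The variable names (e, inner, p, e2) are made up for the example.

```r
# A quoted expression is a "call": a list-like object whose first
# element is the function and whose remaining elements are its arguments.
e <- quote(!(is.na(x) | is.nan(x)))
class(e)               # "call"
as.list(e)             # the function `!` followed by its single argument
inner <- e[[2]][[2]]   # strip `!` and `(` to reach the `|` call
inner[[1]]             # the symbol `|`

# parse() returns an "expression" vector; each element is a call like e.
p <- parse(text = "!(is.na(x) | is.nan(x))")
class(p)               # "expression"
class(p[[1]])          # "call"

# Calls can be edited like nested lists, e.g. replacing x with y in is.na().
e2 <- e
e2[[2]][[2]][[2]][[2]] <- as.name("y")
eval(e2, list(x = 1, y = NA))   # FALSE, because is.na(y) is TRUE
```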
