I am new to data.table. I only know how to replace values column by column. Is there a way to do it with one command? Here is my sample code:

library(data.table)

DT1 = data.table(A=sample(3, 10, TRUE),
                 B=sample(3, 10, TRUE),
                 C=sample(3, 10, TRUE))

DT1[,A:=ifelse(A>1,1,0),]
DT1[,B:=ifelse(B>1,1,0),]
DT1[,C:=ifelse(C>1,1,0),]

Ideally there is a way to merge the last 3 commands into 1. Thanks in advance.

  • With dplyr, you could do: library(dplyr); DT1 <- mutate_each(DT1, funs(as.integer(. > 1L))) Commented Feb 4, 2015 at 17:09

2 Answers

The most efficient (and idiomatic) way is to use set() along with a for-loop here. set() is a low-overhead version of := designed to handle repetitive cases like this.

for (cols in c("A", "B", "C")) {
    # update by reference only the rows where this column exceeds 1
    set(DT1, i=which(DT1[[cols]] > 1L), j=cols, value=0L)
}

Note that @ColonelBeauvel's solution returns an entirely new data set just to replace some rows for those columns, which is what data.table tries to avoid!

2 Comments

The original command uses ifelse to set two values, 0 and 1. If I use set, is it possible to do this with 1 set?
Yes, but with 2 set() calls, and it doesn't lose efficiency. Evaluate the condition first (it depends on the original values), assign the first value to the entire column, then assign the 2nd value to the rows where the condition held. It's the condition check that's the bottleneck; assigning the same value to the entire column is incredibly fast. Side note: anything will be faster than ifelse()...
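
A minimal sketch of the two-step set() approach described in this comment, assuming the goal is the 0/1 recoding from the question. The seed and the names `col`, `idx` and `expected` are illustrative, not part of the original answer. The condition is evaluated before the column is overwritten, because it depends on the original values.

```r
library(data.table)

set.seed(42)  # illustrative seed, only for reproducibility
DT1 <- data.table(A = sample(3, 10, TRUE),
                  B = sample(3, 10, TRUE),
                  C = sample(3, 10, TRUE))

# reference result computed on a fresh table, used only for comparison
expected <- DT1[, lapply(.SD, function(x) as.integer(x > 1L))]

for (col in names(DT1)) {
    idx <- which(DT1[[col]] > 1L)            # condition on the original values
    set(DT1, j = col, value = 0L)            # assign 0 to the entire column
    set(DT1, i = idx, j = col, value = 1L)   # assign 1 where the condition held
}
```

Both set() calls update DT1 by reference, so no copy of the table is made inside the loop.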

Like this:

DT1[,lapply(.SD, function(u) ifelse(u>1,1,0))]

1 Comment

You may want to consider the marginally more efficient DT1[, lapply(.SD, function(x) as.integer(x > 1))].
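
If the goal is a single command that also updates DT1 in place, one common data.table idiom is to combine := with lapply(.SD, ...) and .SDcols. This is a sketch, not something proposed in either answer; the `cols` vector, the `before` copy and the seed are illustrative.

```r
library(data.table)

set.seed(1)  # illustrative seed, only for reproducibility
DT1 <- data.table(A = sample(3, 10, TRUE),
                  B = sample(3, 10, TRUE),
                  C = sample(3, 10, TRUE))
before <- copy(DT1)  # kept only to compare against the original values

cols <- c("A", "B", "C")
# recode the selected columns to 0/1 and assign them back by reference
DT1[, (cols) := lapply(.SD, function(x) as.integer(x > 1L)), .SDcols = cols]
```

Unlike the lapply(.SD, ...) call on its own, this modifies DT1 rather than returning a new table.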
