What is the best practice to delete columns programmatically in data.table?

The following works:

DT[, c("a", "b") := NULL]

But when I try to do the same using a variable that stores the column names:

cols.to.del <- c("a", "b")
DT[, cols.to.del := NULL]

it fails, because cols.to.del is treated as a literal column name instead of being evaluated as a variable.

1 Answer

We can wrap the variable in parentheses and then assign NULL with := (the preferred way):

DT[, (cols.to.del) := NULL]
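
As a minimal sketch of why the parentheses matter, the example below uses a made-up three-column table (the column names and values are illustrative only, not from the question). Wrapping the variable in parentheses makes data.table evaluate it and use its value as the column names, rather than reading cols.to.del as a column name itself.

```r
library(data.table)

# Hypothetical example table; names and values are illustrative only.
DT <- data.table(a = 1:3, b = 4:6, c = 7:9)
cols.to.del <- c("a", "b")

# Parentheses force evaluation of the variable on the left-hand side of :=,
# so columns "a" and "b" are removed from DT by reference (no copy, no reassignment).
DT[, (cols.to.del) := NULL]

names(DT)
# [1] "c"
```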

Another option (if we don't want to wrap it in parentheses) is to loop over cols.to.del in a for loop and assign NULL to each column in turn:

for(j in seq_along(cols.to.del)){
    DT[, cols.to.del[j] := NULL]
}

Or, to subset the columns instead of deleting them, we can use setdiff along with with=FALSE. This returns a new data.table and leaves DT unchanged.

DT[, setdiff(names(DT), cols.to.del), with=FALSE]
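
To illustrate the difference from the := approaches, here is a small sketch under the same assumptions as before (a made-up table; the name kept is hypothetical). The subset has to be stored somewhere, because DT itself is not modified.

```r
library(data.table)

# Hypothetical example table; names and values are illustrative only.
DT <- data.table(a = 1:3, b = 4:6, c = 7:9)
cols.to.del <- c("a", "b")

# Subsetting builds a new data.table containing only the remaining columns.
kept <- DT[, setdiff(names(DT), cols.to.del), with = FALSE]

names(kept)
# [1] "c"

# The original table still has every column.
names(DT)
# [1] "a" "b" "c"
```

If the goal is to drop the columns from DT itself, the result would need to be assigned back to DT, whereas := NULL removes them by reference.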

3 Comments

The third version is definitely my favorite. Thanks!
@paljenczy note that the third version doesn't remove the columns from DT; rather the output of that command is a new data.table which you must assign (by copying) to DT, which is likely inefficient.
@MichaelChirico noted, thanks!
