I have a function that, given an input vector, returns a data.frame or data.table; the number and names of the returned columns depend on the input. I want to add these columns to an existing data.table, using one of its columns as input for the function. What is the easiest/cleanest way of doing this in data.table?
library(data.table)

# Example function; in this case the number of columns the function
# returns is fixed, but in practice the number of columns and the
# names of the columns depend on x
my_function <- function(x) {
  # Name of the argument as written by the caller, e.g. "a"
  name <- deparse1(substitute(x))
  res <- data.table(x == 1, x == 2)
  names(res) <- paste0(name, "==", 1:2)
  res
}
# Example data set
dta <- data.table(a = sample(1:10, 10, replace = TRUE), b = letters[1:10])
I can create new columns using this function:
> dta[, my_function(a)]
a==1 a==2
1: FALSE FALSE
2: FALSE FALSE
3: FALSE FALSE
4: FALSE FALSE
5: FALSE FALSE
6: TRUE FALSE
7: FALSE FALSE
8: TRUE FALSE
9: FALSE TRUE
10: TRUE FALSE
However, I also want to keep the existing columns. The following does what I want, but I expect there is a simpler/better solution. I also expect that cbind will copy the data, which is another reason I want to avoid it, as the data sets are quite large.
> dta <- cbind(dta, dta[, my_function(a)])
> dta
a b a==1 a==2
1: 1 a TRUE FALSE
2: 8 b FALSE FALSE
3: 2 c FALSE TRUE
4: 4 d FALSE FALSE
5: 10 e FALSE FALSE
6: 4 f FALSE FALSE
7: 8 g FALSE FALSE
8: 10 h FALSE FALSE
9: 8 i FALSE FALSE
10: 4 j FALSE FALSE
cbind. You could do dta[, (LETTERS[2:3]) := my_function(a)] if you know number of columns that would be returned beforehand but in your case unfortunately you don't.

my_function <- function(x) {
  name <- deparse1(substitute(x))
  res <- data.table(x = x, x == 1, x == 2)
  names(res)[2:3] <- paste0(name, "==", 1:2)
  names(res)[1] <- paste0(name)
  res
}

a, but in practice this is one column of a large set of columns (I will change the example).

dta <- cbind(dta, dta[...]) introduces a copy of my data. I know the syntax of the example you show, but in that case, as you mention, you have to know the number of columns, and also fix the names of columns.

1. tmp <- dta[, my_function(a)]
2. cols <- paste0('cols', seq_along(tmp))
3. dta[, (cols) := tmp]
Not sure if it qualifies as an answer.
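A possible variation on that three-step idea, as a minimal sketch rather than a definitive answer: instead of inventing names like 'cols1', 'cols2', reuse whatever names the function returned, so neither the number of columns nor their names has to be known in advance. The small fixed data set below is an assumption made for reproducibility (the question uses a random sample), and the sketch assumes dta has no column that is itself called tmp.

```r
library(data.table)

# Example function from the question (deparse1 needs R >= 4.0.0)
my_function <- function(x) {
  name <- deparse1(substitute(x))
  res <- data.table(x == 1, x == 2)
  setnames(res, paste0(name, "==", 1:2))
  res
}

# Small fixed data set (an assumption; the question samples at random)
dta <- data.table(a = c(1L, 8L, 2L, 4L), b = letters[1:4])

# Compute the new columns once; their number and names are only known here
tmp <- dta[, my_function(a)]

# Assign them by reference, reusing the names the function produced
dta[, (names(tmp)) := tmp]

names(dta)
```

Because := assigns by reference, this should avoid the full copy of dta that cbind is suspected of making; the only extra allocation is the temporary tmp holding the new columns.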