So, here is my sample data:
library(data.table)
mydata <- fread(
"sample,neg1,neg2,neg3,gen1,gen2
sample1, 0, 1, 2, 30, 60
sample2, 1, 0, 1, 15, 30
sample3, 2, 1, 0, 10, 20
")
In each row I want to subtract the background (the mean of the "neg" columns). My current code is the following:
negatives <- names(mydata)[grep("^neg", names(mydata))] # "neg1" "neg2" "neg3"
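mydata[, names(mydata)[-1] := {
  bg <- mean(unlist(.SD[, negatives, with = FALSE]))  # background of this row, from the "neg" columns
  .SD - as.integer(bg)                                 # subtract the truncated background from every column
}, by = sample]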
# mydata
# sample neg1 neg2 neg3 gen1 gen2
#1: sample1 -1 0 1 29 59
#2: sample2 1 0 1 15 30
#3: sample3 1 0 -1 9 19
It does the job, but it is quite slow on my real, much bigger table. I assume that is because of using .SD. Is there a better way to do this task, perhaps using set?
(This question is very similar to my previous one, but the source data is in a different form here, so I could not find a way to apply the same solution with set. I hope it will not be considered a duplicate.)
mydata1 <- mydata[, V1 := list(as.integer(rowMeans(.SD))), .SDcols = indx]; mydata1[, names(mydata1)[-c(1, 7)] := .SD - mydata1[['V1']], .SDcols = 2:6][, V1 := NULL][]
Compute rowMeans on the selected columns separately and then use set to update all the columns. I updated the solution.
Why not reshape to long format first (with melt from reshape2 or gather from tidyr), after which the problem becomes trivial?
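The rowMeans-then-set idea from the comments could look roughly like the sketch below. It is only one possible reading of that suggestion, not the code from the updated answer: the variable names bg and j are my own, and it assumes the background should be truncated with as.integer, as in the question. The background is computed once for all rows before any column is modified, so there is no per-group .SD work.

```r
library(data.table)

mydata <- fread("sample,neg1,neg2,neg3,gen1,gen2
sample1, 0, 1, 2, 30, 60
sample2, 1, 0, 1, 15, 30
sample3, 2, 1, 0, 10, 20
")

# Columns whose row-wise mean defines the background
negatives <- grep("^neg", names(mydata), value = TRUE)

# One background value per row, truncated to integer as in the question;
# computed up front, before any column is changed
bg <- as.integer(rowMeans(mydata[, negatives, with = FALSE]))

# Subtract the background from every column except "sample", by reference
for (j in setdiff(names(mydata), "sample")) {
  set(mydata, j = j, value = mydata[[j]] - bg)
}

print(mydata)
```

On the sample data this should give the same table as the by = sample version in the question. Because set updates one column at a time by reference, it avoids both the grouping overhead and a copy of the whole table.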