I read all the 'string to variable name' posts, but none of them covered my particular problem. I have a list of vectors (DNA sequence data) created with 'read.fasta' from the seqinr package. I also have a data frame of variants and their locations, and I want to replace the list's vector elements at the locations given in the data frame with their alternate values. For a single variant this can be done using
list$name[number] <- alternate.character
# I tried
for (i in 1:length(df$CHROM))
if (is.na(df$Call[i])) {next} else {get(paste("test$",df$CHROM[i],"[",df$POS[i],"]",sep="")) <- df$Call[i]}
# example data
test <- list("One" = c("a","t","a","g","c"),
"Two" = c("g","a","t","t","a","c","a"))
df <- data.frame("CHROM"=c(rep("One",2),rep("Two",3)),
"POS" = c(2,4,1,3,6),
"REF" = c("t","g","g","t","c"),
"ALT" = c("a","a","t","g","t"),
"Call" = c("T","A","G",NA,"T"))
But 'get' returns the vector element from the list and doesn't allow me to assign it as the variant in the parent list.
What I want is the list to go from
$One
[1] "a" "t" "a" "g" "c"
$Two
[1] "g" "a" "t" "t" "a" "c" "a"
to
$One
[1] "a" "T" "a" "A" "c"
$Two
[1] "G" "a" "t" "t" "a" "T" "a"
For the test data this isn't a problem because you can just do it individually, but the real data has over 10,000 sequences and over 100,000 variants. Bonus points if you can vectorize it: I don't have enough experience nesting apply functions to make them work with information from a list and a data frame at the same time.
sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
[5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
[7] LC_PAPER=en_GB.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] seqinr_3.0-7
loaded via a namespace (and not attached):
[1] tools_3.0.2
for (i in seq_len(nrow(df))) {
  # as.character() guards against factor columns in the data frame
  new_base <- as.character(df$Call[i])
  if (!is.na(new_base)) {
    test[[as.character(df$CHROM[i])]][as.numeric(as.character(df$POS[i]))] <- new_base
  }
}
test
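
Since the question asks about vectorizing, here is a minimal sketch of one possible approach. It is not the only way to do this and it has not been timed on data of the size described. It groups the variants by sequence name and does one vectorized assignment per sequence, rather than one assignment per variant. The names `called`, `by_chrom` and `v` are made up for illustration. The sketch assumes every value in CHROM matches a name in the list, and it builds the data frame with stringsAsFactors = FALSE so that no factor conversion is needed.

```r
# Example data from the question (REF and ALT omitted, they are not used here)
test <- list("One" = c("a","t","a","g","c"),
             "Two" = c("g","a","t","t","a","c","a"))
df <- data.frame(CHROM = c(rep("One",2), rep("Two",3)),
                 POS   = c(2,4,1,3,6),
                 Call  = c("T","A","G",NA,"T"),
                 stringsAsFactors = FALSE)

# Drop rows without a call, then group the remaining rows by sequence name
called   <- df[!is.na(df$Call), ]
by_chrom <- split(called, called$CHROM)

# One vectorized assignment per sequence instead of one per variant.
# Assumption: each name in by_chrom already exists in 'test'.
for (chrom in names(by_chrom)) {
  v <- by_chrom[[chrom]]
  test[[chrom]][v$POS] <- v$Call
}
test
```

The loop now runs once per sequence that has at least one called variant, so with roughly 10,000 sequences and 100,000 variants it should do far fewer iterations than a per-variant loop.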