I want to loop through a sequence of letters, 'ABCDEFGHIJK', but a for loop in R processes one value at a time. Is there a way to loop over 3 characters at a time? In this case, 'ABCDEFGHIJK' would be processed as 'ABC', then 'DEF', and so on.

I've tried changing the length used in the loop, but I haven't found a way. I can do this in Python, but I couldn't find any information about it for R, including in R's built-in help.

xp <-'ACTGCT'
for(i in 1:length(xp)){
  if(i == 'ACG'){
    print('T')
  }
}
  • You are looping over a vector of length 1 (length(xp) is 1, not the number of characters) and comparing the loop index with a string. Commented Jul 22, 2019 at 13:44
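The point made in the comment can be checked directly: length() counts the elements of a vector, while nchar() counts the characters in a string. A minimal sketch using the xp from the question (the names n_elements and n_chars are only illustrative):

```r
xp <- 'ACTGCT'

n_elements <- length(xp)  # number of elements in the character vector
n_chars <- nchar(xp)      # number of characters in the single string

print(n_elements)
## [1] 1
print(n_chars)
## [1] 6

# 1:length(xp) is therefore just 1, so the loop body runs once with i equal to 1
print(1:length(xp))
## [1] 1
```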

4 Answers

We can use the vectorized substring, i.e.

substring('ABCDEFGHIJK', seq(1, nchar('ABCDEFGHIJK') - 1, 3), seq(3, nchar('ABCDEFGHIJK'), 3)) == 'ACG'
#[1] FALSE FALSE FALSE FALSE

NOTE: This extracts only complete 3-character chunks. If one or two characters are left over at the end, they are not returned; an empty string appears in their place. For the above example, it outputs:

substring('ABCDEFGHIJK', seq(1, nchar('ABCDEFGHIJK') - 1, 3), seq(3, nchar('ABCDEFGHIJK'), 3))
#[1] "ABC" "DEF" "GHI" ""
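If the trailing partial chunk should be kept, one possible variation is to compute the start positions once and let each chunk end two characters later; substring simply truncates a chunk that runs past the end of the string. This is a sketch of that idea, not part of the original answer, and the names x, starts and chunks are only illustrative:

```r
x <- 'ABCDEFGHIJK'

# Start positions 1, 4, 7, 10; each chunk ends two characters after its start
starts <- seq(1, nchar(x), by = 3)
chunks <- substring(x, starts, starts + 2)

print(chunks)
## [1] "ABC" "DEF" "GHI" "JK"

# Element-wise comparison against a target chunk
print(chunks == 'DEF')
## [1] FALSE  TRUE FALSE FALSE
```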

Another option is to split the string after every 3 characters and then do the comparison:

lapply(strsplit(v1, "(?<=.{3})", perl = TRUE), function(x) x == 'ACG')
#[[1]]
#[1] FALSE FALSE FALSE FALSE

data

v1 <- 'ABCDEFGHIJK'


Here is a stringr solution that outputs a list for whether or not there are matches:

library(stringr)

# Split string into sequences of 3 (or fewer if length is not multiple of 3)
split_strings <- str_extract_all("ABCDEFGHIJK", ".{1,3}", simplify = TRUE)[1,]

# The strings you want to loop through / search for
x <- c("ABC", "DEF", "GHI", "LMN")

# Output is a named list
sapply(x, `%in%`, split_strings, simplify = FALSE)

$ABC
[1] TRUE

$DEF
[1] TRUE

$GHI
[1] TRUE

$LMN
[1] FALSE

Or, if you only want to look for one element:

"ABC" %in% split_strings
[1] TRUE


1) Base R. Iterate over the start positions 1, 4, 7, ... and use substr to extract the 3-character portion of the input string starting at each position. Then perform whatever processing is desired. If the last chunk has fewer than 3 characters, substr returns whatever is available. This approach is particularly good if you want to exit early, since a break can be inserted into the loop. Here xp is the string from the question.

for(i in seq(1, nchar(xp), 3)) {
  s <- substr(xp, i, i+2)
  print(s) # replace with desired processing
}
## [1] "ACT"
## [1] "GCT"
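The early-exit case mentioned above could look like the following sketch. The target chunk 'GCT' and the variable found_at are hypothetical names chosen for illustration; the loop stops at the first chunk equal to the target.

```r
xp <- 'ACTGCT'
target <- 'GCT'          # hypothetical chunk to search for

found_at <- NA_integer_  # stays NA if no chunk matches
for (i in seq(1, nchar(xp), by = 3)) {
  s <- substr(xp, i, i + 2)
  if (s == target) {
    found_at <- i        # remember the start position of the match
    break                # stop scanning the remaining chunks
  }
}

print(found_at)
## [1] 4
```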

1a) lapply/sapply. The loop can be translated to lapply or sapply if no iteration depends on another.

process <- function(i) { 
  s <- substr(xp, i, i+2)
  s  # replace with desired processing
}
sapply(seq(1, nchar(xp), 3), process)
## [1] "ACT" "GCT"
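When the per-chunk processing is a comparison, vapply may be a reasonable alternative to sapply because it checks that each call returns a single logical value. A minimal sketch under that assumption, with 'GCT' as a hypothetical target and starts and is_match as illustrative names:

```r
xp <- 'ACTGCT'
starts <- seq(1, nchar(xp), by = 3)

# One TRUE/FALSE per chunk; logical(1) declares the expected result type
is_match <- vapply(starts, function(i) substr(xp, i, i + 2) == 'GCT', logical(1))

print(is_match)
## [1] FALSE  TRUE
```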

2) rollapply. Another possibility is to break the string into single characters and then iterate over those, passing a 3-element vector of single characters to the indicated function. Here toString is used to process each chunk, but it can be replaced with any other suitable function.

library(zoo)
rollapply(strsplit(xp, "")[[1]], 3, by = 3, toString, align = "left", partial = TRUE)
## [1] "A, C, T" "G, C, T"
