If I have a vector of strings:

dd <- c("sflxgrbfg_sprd_2011","sflxgrbfg_sprd2_2011","sflxgrbfg_sprd_2012")

and want to find the entries with '2011' in the string, I can use

ifiles <- dd[grep("2011",dd)]

How do I search for entries with a combination of strings included, without using a loop?

For example, I would like to find the entries with both '2011' and 'sprd' in the string, which in this case will only return

sflxgrbfg_sprd_2011

How can this be done? I could define a variable

toMatch <- c('2011','sprd')

and then loop through the entries but I was hoping there was a better solution?

Note: To make this useful for different strings, is it also possible to determine which entries contain these strings regardless of the order in which they appear? For example, 'sflxlgrbfg_2011_sprd'.

2
  • Do you have patterns other than those specified in the example? Commented Jun 10, 2015 at 16:08
  • Yes, sorry I should have been more specific. I've added a note. I basically mean any number of strings... Commented Jun 10, 2015 at 16:09

3 Answers

3

If you want to require more than one pattern, index with the logical vectors returned by grepl rather than the integer positions returned by grep. Logical vectors can be combined with & to form an "and" condition, so only the strings that contain both patterns are extracted.

ifiles <- dd[grepl("2011",dd) & grepl("sprd_",dd)]
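The same idea can be extended to any number of patterns, which is what the `toMatch` vector in the question asks for. Here is a minimal sketch using `Reduce` with `&`. The pattern `sprd(_|$)` is my own assumption about how to exclude `sprd2` while still accepting `sprd` at the end of a string, and `dd` here includes the reordered entry from the question's note:

```r
dd <- c("sflxgrbfg_sprd_2011", "sflxgrbfg_sprd2_2011",
        "sflxgrbfg_sprd_2012", "sflxlgrbfg_2011_sprd")

# Assumed patterns: "sprd" must be followed by "_" or the end of the string
toMatch <- c("2011", "sprd(_|$)")

# One logical vector per pattern, then AND them together element-wise
hits <- Reduce(`&`, lapply(toMatch, grepl, x = dd))

ifiles <- dd[hits]
ifiles
#> [1] "sflxgrbfg_sprd_2011"  "sflxlgrbfg_2011_sprd"
```

Because each pattern is tested independently, the order in which the patterns occur inside a string does not matter.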

2

Try alternation, listing both orders explicitly:

  grep('2011_sprd|sprd_2011', dd, value=TRUE)
 #[1] "sflxgrbfg_sprd_2011"  "sflxlgrbfg_2011_sprd"

Or, to allow other text between the two patterns, use lookarounds (this requires perl=TRUE):

 grep('(?<=sprd_).*(?=2011)|(?<=2011_).*(?=sprd)', dd1,
             value=TRUE, perl=TRUE)
 #[1] "sflxgrbfg_sprd_2011"       "sflxlgrbfg_2011_sprd"     
 #[3] "sfxl_2011_14334_sprd"      "sprd_124334xsff_2011_1423"

data

dd <- c("sflxgrbfg_sprd_2011","sflxgrbfg_sprd2_2011","sflxgrbfg_sprd_2012", 
"sflxlgrbfg_2011_sprd")

dd1 <- c(dd,  "sfxl_2011_14334_sprd", "sprd_124334xsff_2011_1423")
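Writing out every ordering by hand gets unwieldy with more than two patterns. One alternative, sketched here under my own assumptions rather than taken from the answer above, is to build a single regex with one lookahead per pattern, all anchored at the start of the string. The `toMatch` vector and the `sprd(_|$)` pattern (used so that `sprd2` is not matched) are illustrative choices:

```r
dd1 <- c("sflxgrbfg_sprd_2011", "sflxgrbfg_sprd2_2011", "sflxgrbfg_sprd_2012",
         "sflxlgrbfg_2011_sprd", "sfxl_2011_14334_sprd",
         "sprd_124334xsff_2011_1423")

# Assumed patterns: "sprd" must be followed by "_" or the end of the string
toMatch <- c("2011", "sprd(_|$)")

# Builds "^(?=.*2011)(?=.*sprd(_|$))": every lookahead must succeed
# from the start of the string, so the order of the patterns is irrelevant
pattern <- paste0("^", paste0("(?=.*", toMatch, ")", collapse = ""))

res <- grep(pattern, dd1, value = TRUE, perl = TRUE)
res
#> [1] "sflxgrbfg_sprd_2011"       "sflxlgrbfg_2011_sprd"
#> [3] "sfxl_2011_14334_sprd"      "sprd_124334xsff_2011_1423"
```

Lookaheads need `perl = TRUE`; R's default regex engine does not support them.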

0

If you want a scalable solution, you can use lapply, Reduce and intersect to:

  1. For each expression in toMatch, find the indices of all matches in dd.
  2. Keep only those indices that are found for all expressions in toMatch.

dd <- c("sflxgrbfg_sprd_2011","sflxgrbfg_sprd2_2011","sflxgrbfg_sprd_2012")
dd <- c(dd, "sflxgrbfh_sprd_2011")
toMatch <- c('bfg', '2011','sprd')

dd[Reduce(intersect, lapply(toMatch, grep, dd))]
#> [1] "sflxgrbfg_sprd_2011"  "sflxgrbfg_sprd2_2011"

Created on 2018-03-07 by the reprex package (v0.2.0).
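If you also want to see which expressions each entry matched, and not just the final intersection, a logical matrix may be easier to inspect. This is only a sketch of a variant, not part of the approach above: `fixed = TRUE` is my assumption that the expressions are literal substrings, and the `vapply` call assumes `dd` has more than one element (with a single element it returns a vector rather than a matrix).

```r
dd <- c("sflxgrbfg_sprd_2011", "sflxgrbfg_sprd2_2011",
        "sflxgrbfg_sprd_2012", "sflxgrbfh_sprd_2011")
toMatch <- c("bfg", "2011", "sprd")

# Rows are entries of dd, columns are the expressions in toMatch
m <- vapply(toMatch, grepl, logical(length(dd)), x = dd, fixed = TRUE)
rownames(m) <- dd

# Keep the entries whose row is TRUE for every expression
keep <- rowSums(m) == length(toMatch)
matched <- dd[keep]
matched
#> [1] "sflxgrbfg_sprd_2011"  "sflxgrbfg_sprd2_2011"
```

Printing `m` shows, for each entry, which of the expressions were found, which can help when an entry you expected is missing from the result.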
