What I want to do should be pretty easy, yet, newcomer that I am, I have spent far too much time trying to achieve it. With this script I am trying to filter out ALL observations (rows) from a data frame that contain ANY of the listed patterns.

The script is:

df1 <- filter_at(df, vars(contains("Pair")), 
                 any_vars(str_detect(., pattern="quinoaquinoa|lupinelupine", negate=TRUE)))

I do not get any error when I run this, but nothing changes: the rows containing these expressions are not removed from the data frame. As I understand these functions, I could also place a ! in front of str_detect instead of using negate=TRUE, but neither works.

Note that the data frame is actually larger (it has columns other than those containing "Pair"), and the patterns to filter out will always be different and are retrieved from another data frame.

The data frame looks like:

str(df)

'data.frame':   653 obs. of  6 variables:
 $ Pair_1: Factor w/ 7 levels "grasscloverleycamelina",..: 3 7 7 3 3 3 7 6 6 6 ...
 $ Pair_2: Factor w/ 20 levels "camelinacamelina",..: 10 6 6 8 8 10 6 8 8 10 ...
 $ Pair_3: Factor w/ 20 levels "camelinacamelina",..: 19 20 20 20 19 19 20 20 20 16 ...
 $ Pair_4: Factor w/ 23 levels "camelinacamelina",..: 9 8 8 8 9 9 4 1 1 5 ...
 $ Pair_5: Factor w/ 20 levels "camelinacamelina",..: 9 12 16 16 13 13 12 12 11 11 ...
 $ Pair_6: Factor w/ 20 levels "camelinacamelina",..: 20 13 9 17 20 20 5 7 8 8 ...

dput dataframe:

structure(list(Pair_1 = structure(c(3L, 7L, 7L, 3L, 3L, 3L), .Label = c("grasscloverleycamelina", 
"grasscloverleyquinoa", "lupinecamelina", "lupinegrasscloverley", 
"lupinelupine", "lupinequinoa", "lupinespringcereal"), class = "factor"), 
    Pair_2 = structure(c(10L, 6L, 6L, 8L, 8L, 10L), .Label = c("camelinacamelina", 
    "camelinagrasscloverley", "camelinalupine", "camelinaquinoa", 
    "camelinaspringcereal", "grasscloverleycamelina", "grasscloverleygrasscloverley", 
    "grasscloverleylupine", "grasscloverleyquinoa", "grasscloverleyspringcereal", 
    "quinoacamelina", "quinoagrasscloverley", "quinoalupine", 
    "quinoaquinoa", "quinoaspringcereal", "springcerealcamelina", 
    "springcerealgrasscloverley", "springcereallupine", "springcerealquinoa", 
    "springcerealspringcereal"), class = "factor"), Pair_3 = structure(c(19L, 
    20L, 20L, 20L, 19L, 19L), .Label = c("camelinacamelina", 
    "camelinagrasscloverley", "camelinalupine", "camelinaquinoa", 
    "camelinaspringcereal", "grasscloverleycamelina", "grasscloverleygrasscloverley", 
    "grasscloverleylupine", "grasscloverleyquinoa", "grasscloverleyspringcereal", 
    "quinoacamelina", "quinoagrasscloverley", "quinoalupine", 
    "quinoaquinoa", "quinoaspringcereal", "springcerealcamelina", 
    "springcerealgrasscloverley", "springcereallupine", "springcerealquinoa", 
    "springcerealspringcereal"), class = "factor"), Pair_4 = structure(c(9L, 
    8L, 8L, 8L, 9L, 9L), .Label = c("camelinacamelina", "camelinagrasscloverley", 
    "camelinalupine", "camelinaquinoa", "camelinaspringcereal", 
    "grasscloverleycamelina", "grasscloverleygrasscloverley", 
    "grasscloverleyquinoa", "grasscloverleyspringcereal", "lupinecamelina", 
    "lupinegrasscloverley", "lupinelupine", "lupinequinoa", "lupinespringcereal", 
    "quinoacamelina", "quinoagrasscloverley", "quinoaquinoa", 
    "quinoaspringcereal", "springcerealcamelina", "springcerealgrasscloverley", 
    "springcereallupine", "springcerealquinoa", "springcerealspringcereal"
    ), class = "factor"), Pair_5 = structure(c(9L, 12L, 16L, 
    16L, 13L, 13L), .Label = c("camelinacamelina", "camelinagrasscloverley", 
    "camelinaquinoa", "camelinaspringcereal", "grasscloverleycamelina", 
    "grasscloverleygrasscloverley", "grasscloverleyquinoa", "grasscloverleyspringcereal", 
    "lupinecamelina", "lupinegrasscloverley", "lupinequinoa", 
    "lupinespringcereal", "quinoacamelina", "quinoagrasscloverley", 
    "quinoaquinoa", "quinoaspringcereal", "springcerealcamelina", 
    "springcerealgrasscloverley", "springcerealquinoa", "springcerealspringcereal"
    ), class = "factor"), Pair_6 = structure(c(20L, 13L, 9L, 
    17L, 20L, 20L), .Label = c("camelinacamelina", "camelinagrasscloverley", 
    "camelinaquinoa", "camelinaspringcereal", "grasscloverleycamelina", 
    "grasscloverleygrasscloverley", "grasscloverleyquinoa", "grasscloverleyspringcereal", 
    "lupinecamelina", "lupinegrasscloverley", "lupinequinoa", 
    "lupinespringcereal", "quinoacamelina", "quinoagrasscloverley", 
    "quinoaquinoa", "quinoaspringcereal", "springcerealcamelina", 
    "springcerealgrasscloverley", "springcerealquinoa", "springcerealspringcereal"
    ), class = "factor")), row.names = c(NA, 6L), class = "data.frame")
    Please also format inline code for readability, I helped you this time. Commented Mar 9, 2019 at 9:43

2 Answers


You can loop over the columns that have "Pair" in their name, check whether the required pattern is present, build a logical matrix, and select the rows that have no occurrence of the pattern.

cols <- grep("Pair", names(df))
df[rowSums(sapply(df[cols],function(x) grepl("quinoaquinoa|lupinelupine", x)))== 0, ]
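Since the question mentions that the patterns come from another data frame, here is a minimal sketch of how the pattern string could be built rather than typed by hand. The toy `df`, the lookup table `exclude` and its column `pattern` are made up for illustration and are not from the original post. It also assumes the values contain no regex metacharacters.

```r
# Toy data standing in for the question's df (values are made up)
df <- data.frame(
  id     = 1:4,
  Pair_1 = c("lupinecamelina", "lupinelupine", "lupinequinoa", "lupinecamelina"),
  Pair_2 = c("quinoalupine", "camelinaquinoa", "quinoaquinoa", "camelinalupine"),
  stringsAsFactors = TRUE  # factors, as in the question
)

# Hypothetical lookup table holding the values to exclude
exclude <- data.frame(pattern = c("quinoaquinoa", "lupinelupine"))

# Collapse the values into a single alternation, "quinoaquinoa|lupinelupine"
pattern <- paste(exclude$pattern, collapse = "|")

# Same approach as above, with the generated pattern
cols   <- grep("Pair", names(df))
hits   <- sapply(df[cols], function(x) grepl(pattern, x))
result <- df[rowSums(hits) == 0, ]
```

Only the rows where no "Pair" column matches any of the excluded values remain in `result`.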

2 Comments

Thanks, it works! Would you mind explaining a bit how this one works?
@BellaLin Using grep, we first find the columns that have "Pair" in their name. Then we loop over those columns with sapply and check which values contain the required pattern. Just run sapply(df[cols], function(x) grepl("quinoaquinoa|lupinelupine", x)) and you'll get a matrix of TRUE/FALSE values indicating whether the pattern is present in each cell. Finally, we take rowSums and keep only the rows that have no occurrence of the pattern at all (== 0).
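The steps described in this comment can be followed one at a time on a small example. This is only a sketch: the three-row data frame and the intermediate names `m`, `n_hits` and `keep` are made up for illustration.

```r
# Toy data: two "Pair" columns plus one unrelated column (values are made up)
df <- data.frame(
  Pair_1 = c("lupinecamelina", "lupinelupine", "lupinequinoa"),
  Pair_2 = c("quinoalupine", "camelinaquinoa", "quinoaquinoa"),
  other  = c("a", "b", "c")
)

# Step 1: positions of the columns whose name contains "Pair"
cols <- grep("Pair", names(df))

# Step 2: logical matrix, one row per observation, one column per "Pair" column
m <- sapply(df[cols], function(x) grepl("quinoaquinoa|lupinelupine", x))

# Step 3: TRUE counts as 1, so rowSums gives the number of matches per row
n_hits <- rowSums(m)

# Step 4: keep only the rows with zero matches
keep <- n_hits == 0
df[keep, ]
```

Here the second row matches in `Pair_1` and the third in `Pair_2`, so only the first row is kept.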

There is no string containing "quinoaquinoa" or "lupinelupine" in your data frame. I think the pattern you're using is incorrect. This works: filter_at(df, vars(contains("Pair")), any_vars(str_detect(., pattern = "quinoa|lupine")))
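Independently of which pattern is used, another possible reason the call in the question appears to do nothing is the combination of any_vars() with a negated test. any_vars(!match) keeps a row as soon as one "Pair" column does not match, which is true for almost every row. To drop rows where any column matches, the negated test has to hold for all columns, so all_vars() would be the helper to try. The sketch below assumes dplyr and stringr are installed and uses made-up toy data. The names `kept_any` and `kept_all` are only for illustration.

```r
library(dplyr)
library(stringr)

# Toy data (values are made up); character columns keep the example simple
df <- data.frame(
  id     = 1:3,
  Pair_1 = c("lupinecamelina", "lupinelupine", "lupinequinoa"),
  Pair_2 = c("quinoalupine", "camelinaquinoa", "quinoaquinoa"),
  stringsAsFactors = FALSE
)
pattern <- "quinoaquinoa|lupinelupine"

# As in the question: a row is kept if ANY "Pair" column fails to match
kept_any <- filter_at(df, vars(contains("Pair")),
                      any_vars(str_detect(., pattern, negate = TRUE)))

# Alternative: a row is kept only if ALL "Pair" columns fail to match
kept_all <- filter_at(df, vars(contains("Pair")),
                      all_vars(!str_detect(., pattern)))

nrow(kept_any)  # every row survives, so nothing seems to change
kept_all$id     # only the row with no match in any "Pair" column
```

With this toy data, `kept_any` still has all three rows, while `kept_all` contains only the first one.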
