I'm trying to verify that the emails for a list are correct. I was thinking I could do a partial string match between Email and Name columns, and return a logical vector (TRUE/FALSE) in a new column.

In the example below, only rows 3 and 5 have correct emails, and the output would be 'TRUE' for these rows. I tried the following, and it hasn't worked:

for (i in Test$LastName) {
  Test$Match <- agrepl(i, Test$Email, ignore.case = TRUE)
}

Test$Email %in% Test$LastName

Any other suggestions are welcome too. Thanks!

[Image: example data frame with name and email columns]

  • I think the grepl function could be helpful. Commented Mar 25, 2020 at 20:48
  • There are some nice answers, but I'll just add that your code doesn't give you the expected results because, when you pass Test$Email to the agrepl function, you're passing ALL the email addresses from your data frame. Commented Mar 25, 2020 at 22:12
  • You're the best, Ryan! Yes, the answers made me realize that. Thanks so much for helping me understand it better. Commented Mar 25, 2020 at 22:29
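
To make the point in the comments concrete, here is a minimal sketch. The data frame is made up for illustration (the example.com addresses are not the asker's data). It suggests why the loop fails: each iteration compares one last name with all the emails and overwrites the whole Match column, so only the last iteration survives. It also shows that %in% tests whole-string equality rather than partial matches.

```r
# Hypothetical data for illustration only (not the asker's data)
Test <- data.frame(
  LastName = c("Low", "Lock", "Sims"),
  Email    = c("a.smith@example.com", "s.lock@example.com", "k.sims@example.com"),
  stringsAsFactors = FALSE
)

# The loop from the question: every pass compares ONE last name against
# ALL emails and replaces the entire Match column.
for (i in Test$LastName) {
  Test$Match <- agrepl(i, Test$Email, ignore.case = TRUE)
}

# After the loop, Match only reflects the last surname ("Sims").
last_only <- agrepl("Sims", Test$Email, ignore.case = TRUE)
same_as_last <- identical(Test$Match, last_only)
print(same_as_last)

# %in% checks whether a whole email equals a whole last name,
# so it cannot detect a surname inside an address.
exact <- Test$Email %in% Test$LastName
print(exact)  # all FALSE
```

Comparing row by row, as the answers below do, avoids both problems.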

3 Answers

Try this:

DF <- data.frame(FirstName = c("Audrey","Tammy","Stacey","Judson","Kellie"),
                 LastName = c("Low","Rose","Lock","Porter","Sims"),
                 Email = c("[email protected]","[email protected]","[email protected]","[email protected]","[email protected]"))
library(dplyr)

DF %>% 
  rowwise() %>%
  mutate(isMatch = grepl(LastName, Email, ignore.case = TRUE))

Output:

  FirstName LastName Email                    isMatch    
  <fct>     <fct>    <fct>                    <lgl>
1 Audrey    Low      [email protected]         FALSE
2 Tammy     Rose     [email protected]          FALSE
3 Stacey    Lock     [email protected]     TRUE 
4 Judson    Porter   [email protected] FALSE
5 Kellie    Sims     [email protected]         TRUE 

1 Comment

Love the simple solution. Thank you!

A base R option is to use grepl + mapply:

Test <- within(Test, Match <- mapply(grepl, paste(FirstName, LastName, sep = "|"), Email, ignore.case = TRUE))

such that

> Test
  FirstName LastName                    Email Match
1    Audrey      Low         [email protected] FALSE
2     Tammy     Rose          [email protected] FALSE
3    Stacey     Lock     [email protected]  TRUE
4    Judson   Porter [email protected] FALSE
5    Kellie     Sims         [email protected]  TRUE

DATA

Test <- data.frame(FirstName = c("Audrey","Tammy","Stacey","Judson","Kellie"),
                 LastName = c("Low","Rose","Lock","Porter","Sims"),
                 Email = c("[email protected]","[email protected]","[email protected]","[email protected]","[email protected]"))
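
One possible variation on the mapply approach, sketched under the assumption that names should be matched literally rather than as regular expressions (a "." in a name would otherwise match any character). The helper name_in_email and the data below are hypothetical and only for illustration. Because grepl ignores ignore.case when fixed = TRUE, both sides are lower-cased instead.

```r
# Hypothetical helper (not part of the answer above):
# literal, case-insensitive substring test for one name/email pair
name_in_email <- function(name, email) {
  grepl(tolower(name), tolower(email), fixed = TRUE)
}

# Made-up data for illustration only
people <- data.frame(
  LastName = c("Lock", "St.John", "Rose"),
  Email    = c("stacey.lock@example.com", "m.stxjohn@example.com", "t.smith@example.com"),
  stringsAsFactors = FALSE
)

# mapply pairs the i-th name with the i-th email;
# USE.NAMES = FALSE returns a plain, unnamed logical vector
people$Match <- mapply(name_in_email, people$LastName, people$Email,
                       USE.NAMES = FALSE)
print(people$Match)  # TRUE FALSE FALSE
```

With a regular-expression match, "St.John" would also match "stxjohn", since "." matches any character; the literal match does not.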

Try something like this. You are almost there; you just need to store the TRUE/FALSE results in a vector. I used sapply to iterate over the row indices and compare the corresponding columns. sapply collects the results in a vector, so you can use it as a logical index:

test = data.frame(FirstName=c("Audrey","Tammy","Stacey","Judson","Kellie"),
LastName=c("Low","Rose","Lock","Porter","Sims"),
Email=c("[email protected]","[email protected]","[email protected]","[email protected]","[email protected]"))

matches = sapply(seq_len(nrow(test)), function(i) agrepl(test$LastName[i], test$Email[i]))

test[matches,]

  FirstName LastName                Email
3    Stacey     Lock [email protected]
5    Kellie     Sims     [email protected]
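
A variation on the same idea, as a sketch only: it stores the result as a new column, which is what the question asked for, and uses vapply so the return type is declared up front. The data is made up for illustration, and grepl with ignore.case = TRUE is swapped in for agrepl here to keep the matching exact rather than approximate.

```r
# Made-up data for illustration only (not the asker's data)
test <- data.frame(
  LastName = c("Low", "Lock", "Sims"),
  Email    = c("a.smith@example.com", "s.lock@example.com", "k.sims@example.com"),
  stringsAsFactors = FALSE
)

# vapply states that each call must return exactly one logical value,
# so an unexpected result raises an error instead of changing the type
test$Match <- vapply(
  seq_len(nrow(test)),
  function(i) grepl(test$LastName[i], test$Email[i], ignore.case = TRUE),
  logical(1)
)

# Keep only the rows whose email contains the last name
matched_rows <- test[test$Match, ]
print(matched_rows)
```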

1 Comment

Thanks for helping me understand this! This reminded me that I need to store each output in its own home using [i] :)
