I have a logic app that is triggered by emails arriving in an inbox. It all works, except that some strings are getting through that I don't want. Specifically, an email signature with an image description of [email protected] is being matched. I think my regex might be allowing it, but I am not very good with regex.

Here is the code I have so far:

var reg = /([a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+\.[a-zA-Z0-9_-]+)/gi;
var emailData = " \n\n'[email protected]'\n\n \n\n  DevOps\n[cid:[email protected]]\n\n ";
    
//Matching email signatures
var matches = emailData.match(reg);

console.log(matches);

I need the regex to return a list of all email addresses, but only fully formed ones, unlike the one mentioned above, which is missing the .com (or .org, etc.).

  • Your regex returns ["[email protected]", "[email protected]"], so what is the problem? Commented Jun 22, 2021 at 13:28
  • On what condition do you want to say that [email protected] is not a valid email? Maybe no digits after the last ., or a limit on the number of characters after the last .? Commented Jun 22, 2021 at 13:31
  • I don't want that second string, as it's not an email address. I only want what looks like a valid email address. Commented Jun 22, 2021 at 15:25

2 Answers

Your regex (it allows anything that has an @ and a .):

const regex = /([a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+\.[a-zA-Z0-9_-]+)/gm;
const str = `[email protected]
[email protected]
[email protected]
[email protected]`;
let m;

while ((m = regex.exec(str)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m.index === regex.lastIndex) {
        regex.lastIndex++;
    }
    
    console.log(m[0]);
}

#1 No digits allowed after the last .

const regex = /([a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+\.[a-zA-Z_-]+)/gm;
const str = `[email protected]
[email protected]
[email protected]
[email protected]`;
let m;

while ((m = regex.exec(str)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m.index === regex.lastIndex) {
        regex.lastIndex++;
    }
    
    console.log(m[0]);
}

#2 Restrict the part after the last . to between 2 and 7 characters: {2,7}$

const regex = /([a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+\.[a-zA-Z0-9_-]{2,7}$)/gm;
const str = `[email protected]
[email protected]
[email protected]
[email protected]`;
let m;

while ((m = regex.exec(str)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m.index === regex.lastIndex) {
        regex.lastIndex++;
    }
    
    console.log(m[0]);
}

#3 Define a list of possible top-level domain names like (com|org|info)

const regex = /([a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+\.(com|org|info)$)/gm;
const str = `[email protected]
[email protected]
[email protected]
[email protected]`;
let m;

while ((m = regex.exec(str)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m.index === regex.lastIndex) {
        regex.lastIndex++;
    }
    
    console.log(m[0]);
}


3 Comments

Thank you for the solutions! I have one problem with them, and I think it's the input. I am passing in a string from a logic app step that strips the HTML out of the email, and it looks like this: var email = "[email protected]>\nSubject: FW:\n\n \n\n \n\n'[email protected]'\n\n \n\n Email Signature | Dev Team\n[cid:[email protected]]\n\n"; When I run that through my script, it comes out with no matches.
I got it. I just used replace to take out any characters I want to ignore.
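@evolmonster the problem was most probably the $ at the end. The $ only lets a match finish at the end of a line (with the m flag), and in your input each address is followed by another character such as ' or >. Use this regex tool to practice and get explanations: regex101.com/r/HCM4y7/1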
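To illustrate the point about the $, here is a minimal sketch of option #2 without the end-of-line anchor. It assumes the TLD should be letters only, and it swaps the $ for a negative lookahead so that an address can sit anywhere in the text. The input string, both addresses and the cid value are made up for illustration, since the real ones are not shown in the thread.

```javascript
// Hypothetical input: the addresses and the cid value are invented for this sketch.
const body = "sender@example.org>\nSubject: FW:\n\n \n\n'user@example.com'\n\n \n\n Email Signature | Dev Team\n[cid:image001.png@01D76789.ABCDEF00]\n\n";

// Like option #2, but the part after the last "." must be 2 to 7 letters,
// and instead of "$" a lookahead rejects matches that would stop in the
// middle of a longer run of letters, digits, "_" or "-".
const embeddedEmail = /[a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+\.[a-zA-Z]{2,7}(?![a-zA-Z0-9_-])/g;

// match() returns null when nothing is found, so fall back to an empty array.
const found = body.match(embeddedEmail) || [];

console.log(found); // [ 'sender@example.org', 'user@example.com' ]
```

The cid string is skipped because its last segment is longer than 7 characters and ends in digits. This is still only a heuristic: a TLD longer than 7 letters would be rejected as well.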

Have a look at https://ihateregex.io/expr/email/

Hate to break it to you, but matching email addresses with a regex is hard, if not impossible.

One way would be to match only strings that end in a known top-level domain (TLD): either create a group of all the TLDs you accept and match against it, or keep the looser regex, limit its final part, and filter the results from there.
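The match-then-filter idea could be sketched roughly like this. The allow-list, the function name extractEmails and the sample text are all hypothetical; the loose pattern is the one from the question.

```javascript
// Hypothetical allow-list; extend it with the TLDs you expect to receive.
const allowedTlds = new Set(["com", "org", "info"]);

// Loose pattern from the question: anything shaped like local@domain.suffix.
const candidatePattern = /[a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+\.[a-zA-Z0-9_-]+/g;

function extractEmails(text) {
    // match() returns null when nothing is found, so fall back to an empty array.
    const candidates = text.match(candidatePattern) || [];
    // Keep only candidates whose final dot-separated label is an allowed TLD.
    return candidates.filter((candidate) => {
        const tld = candidate.slice(candidate.lastIndexOf(".") + 1).toLowerCase();
        return allowedTlds.has(tld);
    });
}

// Made-up sample text, not the asker's real data.
const sample = "'user@example.com'\n\n DevOps\n[cid:image001.png@01D76789.ABCDEF00]\n";

console.log(extractEmails(sample)); // [ 'user@example.com' ]
```

Keeping the TLD list outside the regex makes it easier to maintain, at the cost of rejecting any address whose TLD is not on the list.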

1 Comment

This particular regex is part of an Azure logic app. I read the address from the body of emails sent to a certain inbox. I figured regex/JavaScript would be the way to go. But if you are saying a regex won't catch everything, maybe I need another solution?
