I have an input array which contains various domains:

var sites = ["site1.com", "site2.com", "site3.com"];

I need to check whether a given string domainName matches one of these sites. I used indexOf, which worked fine, but a problem occurred when domainName came with a subpage prefix, e.g. subpage.site1.com. I tried using the some method with RegExp testing instead:

if (sites.some(function(rx) { return rx.test(domainName); }))

However, the first problem was that I needed to change the quotation marks around every element to slashes (/ /) to make it work with RegExp:

var sites = [/site1.com/, /site2.com/, /site3.com/];

while I want to keep the array with quotation marks for the end user.

The second problem was that it returns true in cases where domainName is not in the array but its name partially contains an element of the array, for example anothersite1.com with site1.com. It's a rare case, but it happens.

I could modify my input array so that each RegExp starts and ends with the ^ and $ anchors, but that would complicate it even more, especially since I would also need to add ([a-z0-9]+[.]) to match subpages.

I tried to use replace to change "foo" to /foo/, but I was unable to, since the quotation marks define the array elements. I also tried to use replace with concat to merge the string with escape characters, turning site1.com into a RegExp pattern such as ^([a-z0-9]+[.])*site1\.com$, but I ran into issues with escaping characters.

Is there a simpler way to achieve this goal?

1 Comment

You can specify a word boundary at the beginning of the domain name to account for subdomains, such as \bsite1\.com$. Commented Dec 3, 2022 at 22:19

3 Answers

1

I don't think regex is needed here. You could temporarily prefix both the domainName and the site with a dot, and then call endsWith:

const sites = ["site1.com", "site2.com", "site3.com"];

const isValid = domainName => 
    sites.some(site => ("." + domainName).endsWith("." + site));

// demo
console.log(isValid("site1.com")); // true
console.log(isValid("subpage.site1.com")); // true
console.log(isValid("othersite1.com")); // false


2 Comments

It's such an easy workaround that I didn't even think it was possible. Thanks. I guess I could use an analogous rule with startsWith to also catch subdirectories. Am I right?
Yes, depending on what you mean by subdirectories (subdomains?).
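
If "subdirectories" means a path after the domain (for example site1.com/some/page), one possible approach is to extract the hostname first and then apply the same endsWith check. The following is only a minimal sketch under that assumption: hostnameOf and isValidUrl are hypothetical names that do not appear in the answer, and prepending "https://" to inputs without a scheme is an assumption about what the input looks like.

```javascript
const sites = ["site1.com", "site2.com", "site3.com"];

// Hypothetical helper: returns the hostname of a bare domain or a full URL,
// so that any path, query string or port is ignored.
function hostnameOf(input) {
  // The URL constructor needs a scheme; add one if the input has none (assumption).
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(input);
  return new URL(hasScheme ? input : "https://" + input).hostname;
}

// Same dot-prefix trick as in the answer, applied to the hostname only.
const isValidUrl = input =>
  sites.some(site => ("." + hostnameOf(input)).endsWith("." + site));

console.log(isValidUrl("site1.com/some/page"));               // true
console.log(isValidUrl("https://subpage.site1.com/a/b?x=1")); // true
console.log(isValidUrl("othersite1.com/site1.com"));          // false
```

Comparing only the hostname avoids a false positive when a listed site merely appears somewhere in the path.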
1

You can build a regex from the sites array and do a single regex test:

const sites = ["site1.com", "site2.com", "site3.com"];
const sitesRegex = new RegExp(
  '\\b(' +
  sites.map(site => site.replace(/([\\\.(){}])/g, '\\$1')).join('|') +
  ')$'
);
console.log('sitesRegex: ' + sitesRegex);
// tests:
['foo.com', 'notsite1.com', 'site1.com', 'www.site1.com'].forEach(site => {
  console.log(site + ' => ' + sitesRegex.test(site));
});

Output:

sitesRegex: /\b(site1\.com|site2\.com|site3\.com)$/
foo.com => false
notsite1.com => false
site1.com => true
www.site1.com => true

Explanation of .replace() regex:

  • ( -- capture group 1 start
  • [\\\.(){}] -- character class with regex chars that need to be escaped
  • ) -- capture group 1 end
  • g flag -- global (match multiple times)

Explanation of resulting /\b(site1\.com|site2\.com|site3\.com)$/ regex:

  • \b -- word boundary
  • ( -- group start
  • site1\.com|site2\.com|site3\.com -- logically ORed and escaped sites
  • ) -- group end
  • $ -- anchor at end of string
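
One caveat worth checking: \b also matches between a hyphen and a letter, so a hyphenated domain such as my-site1.com passes the test above. A minimal sketch of a stricter variant follows, assuming the site must either start the string or directly follow a dot. The names escapeRegExp, boundaryRegex and strictRegex are hypothetical and not part of the answer.

```javascript
const sites = ["site1.com", "site2.com", "site3.com"];

// Escapes all regex metacharacters, not only the ones likely to occur in a domain.
const escapeRegExp = s => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

const alternatives = sites.map(escapeRegExp).join("|");

// The pattern from the answer: word boundary before the site.
const boundaryRegex = new RegExp("\\b(" + alternatives + ")$");
// Stricter variant: the site must start the string or follow a dot.
const strictRegex = new RegExp("(^|\\.)(" + alternatives + ")$");

["site1.com", "www.site1.com", "notsite1.com", "my-site1.com"].forEach(name => {
  console.log(name + " => \\b: " + boundaryRegex.test(name) +
    ", (^|\\.): " + strictRegex.test(name));
});
// site1.com => \b: true, (^|\.): true
// www.site1.com => \b: true, (^|\.): true
// notsite1.com => \b: false, (^|\.): false
// my-site1.com => \b: true, (^|\.): false
```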

2 Comments

Interesting matching. I have one question regarding '\\$1'. I understand "$" itself, but I don't fully understand "$1": why does using replace result in "$" appearing only at the end of the whole regex, and not at the end of each element (site) separated by OR? If I understand correctly, isn't "$1" used to get the first capture group and insert the end-of-line anchor $? And does "site" contain only one group because it is a standalone element of sites? Could you briefly explain it?
The first parameter of .replace() has a regex with a capture group. You reference the captured char in the second parameter as $1. The \\ prefix will escape the captured character. For example the . char will be escaped to \. (notice that in a string literal you write \\ to get \ )
1

You can achieve this by building a RegExp for each site and testing it inside the Array.filter() method.

Live demo:

var sites = ["site1.com", "site2.com", "site3.com"];

const str = "subpage.site3.com";

const isDomainAvailable = sites.filter(domain => {
  const regEx = new RegExp(`^([a-z0-9]+[.])*${domain}$`);
  return regEx.test(str);
});

console.log(isDomainAvailable.length ? 'matched' : 'not matched');

RegEx explanation:

^ - Matches the beginning of the string (no m flag is used)
( - Opening parenthesis for grouping
[a-z0-9]+ - Matches one or more (+) lowercase alphanumeric characters
[.] - Matches a literal dot (.)
) - Closing parenthesis for grouping
* - Matches zero or more repetitions of the preceding group
${domain} - The domain name taken dynamically from the sites array (inserted as-is, so its dots are not escaped)
$ - Matches the end of the string
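
Because the domain is interpolated unescaped, the dot in site3.com matches any character, so a string like site3xcom would also pass. Hyphens in subdomain labels are rejected by [a-z0-9]+ as well. The following is a hedged sketch of one way to tighten this. The helper escapeRegExp and the function matchesSite are hypothetical names, and allowing hyphens and ignoring case are assumptions about which domain names should be accepted.

```javascript
const sites = ["site1.com", "site2.com", "site3.com"];

// Escapes regex metacharacters so that "." in a domain matches only a literal dot.
const escapeRegExp = s => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// some() stops at the first match, so no intermediate array is built.
const matchesSite = name =>
  sites.some(domain =>
    new RegExp(`^([a-z0-9-]+[.])*${escapeRegExp(domain)}$`, "i").test(name));

console.log(matchesSite("subpage.site3.com")); // true
console.log(matchesSite("my-sub.site1.com"));  // true
console.log(matchesSite("www.mysite1.com"));   // false
console.log(matchesSite("site3xcom"));         // false

// For comparison, the unescaped pattern accepts the last input:
console.log(new RegExp("^([a-z0-9]+[.])*site3.com$").test("site3xcom")); // true
```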

4 Comments

You need to test for word boundary, this gives a false positive for www.mysite1.com
@PeterThoeny It is working fine for www.mysite1.com as well. Here is the demo jsfiddle link. jsfiddle.net/fq10e368
Yes, that is the point, read the OP's "Second problem was" paragraph
@PeterThoeny - Thanks for pointing that out. I updated my answer. I hope it will work as per the expectation now.
