Modify array to make it usable with RegExp

Question

I have an input array which contains various domains:

var sites = ["site1.com", "site2.com", "site3.com"];

I need to check whether a given string domainName matches one of these sites. I used indexOf, which worked fine, but a problem occurred when domainName included a subpage, e.g. subpage.site1.com. I tried to use the some method with RegExp testing instead:

if (sites.some(function(rx) { return rx.test(domainName); }))

however, the first problem was that I needed to change the "" around every element to // to make it work with RegExp:

var sites = [/site1.com/, /site2.com/, /site3.com/];

while I want to keep the array with quotation marks for the end user.

The second problem was that it returns true in cases where the compared domainName is not in the array but contains an array element as part of its name, for example anothersite1.com with site1.com. It's a rare case, but it happens.
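
A minimal sketch reproducing this second problem, assuming the patterns are written as unanchored regex literals as in the array above (the tested names and the helper matchesAny are made up for illustration):

```javascript
// Unanchored patterns, as in the regex-literal array above (dots left unescaped)
const sitePatterns = [/site1.com/, /site2.com/, /site3.com/];

// True if any pattern matches anywhere inside domainName
const matchesAny = domainName => sitePatterns.some(rx => rx.test(domainName));

console.log(matchesAny("subpage.site1.com"));  // true (wanted)
console.log(matchesAny("anothersite1.com"));   // true (unwanted: substring match)
console.log(matchesAny("site4.com"));          // false
```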

I could modify my input array so that each RegExp starts and ends with the ^ and $ anchors, but that would complicate it even more, especially since I would also need to add ([a-z0-9]+[.]) to match subpages.

I tried to use replace to change "foo" to /foo/, but I was unable to, since the quotation marks define the array elements. I also tried to use replace with concat to wrap each string in the needed characters, turning site1.com into a RegExp pattern like ^([a-z0-9]+[.])*site1\.com$, but I had issues with escaping characters.

Is there a simpler way to achieve this goal?

You can specify a word boundary at the beginning of the domain name to account for subdomains, such as \bsite1\.com$
– Peter Thoeny, Commented Dec 3, 2022 at 22:19

trincot · Accepted Answer · 2022-12-03 22:29:50Z

1

I don't think regex is needed here. You could temporarily prefix both the domainName and the site with a dot, and then call endsWith:

const sites = ["site1.com", "site2.com", "site3.com"];

const isValid = domainName => 
    sites.some(site => ("." + domainName).endsWith("." + site));

// demo
console.log(isValid("site1.com")); // true
console.log(isValid("subpage.site1.com")); // true
console.log(isValid("othersite1.com")); // false
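
To illustrate why the dot prefix matters, here is a small sketch comparing a plain endsWith check with the dotted version (the names allowedSites, naiveMatch and dottedMatch are hypothetical, used only for this comparison):

```javascript
const allowedSites = ["site1.com", "site2.com", "site3.com"];

// Plain suffix check: also accepts names that merely end with the same characters
const naiveMatch = name => allowedSites.some(site => name.endsWith(site));

// Dotted check: the suffix must begin at a label boundary (or be the whole name)
const dottedMatch = name =>
  allowedSites.some(site => ("." + name).endsWith("." + site));

for (const name of ["site1.com", "subpage.site1.com", "othersite1.com"]) {
  console.log(name, naiveMatch(name), dottedMatch(name));
}
// site1.com true true
// subpage.site1.com true true
// othersite1.com true false
```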

answered Dec 3, 2022 at 22:29

2 Comments

Atex Over a year ago

It's such an easy workaround that I didn't even think it was possible. Thanks. I guess I could use an analogous rule with startsWith to also catch subdirectories. Am I right?

trincot Over a year ago

Yes, depending on what you mean by subdirectories (subdomains?).

Peter Thoeny · Accepted Answer · 2022-12-03 22:34:13Z

1

You can build a regex from the sites array and do a single regex test:

const sites = ["site1.com", "site2.com", "site3.com"];
const sitesRegex = new RegExp(
  '\\b(' +
  sites.map(site => site.replace(/([\\\.(){}])/g, '\\$1')).join('|') +
  ')$'
);
console.log('sitesRegex: ' + sitesRegex);
// tests:
['foo.com', 'notsite1.com', 'site1.com', 'www.site1.com'].forEach(site => {
  console.log(site + ' => ' + sitesRegex.test(site));
});

Output:

sitesRegex: /\b(site1\.com|site2\.com|site3\.com)$/
foo.com => false
notsite1.com => false
site1.com => true
www.site1.com => true

Explanation of .replace() regex:

( -- capture group 1 start
[\\\.(){}] -- character class with regex chars that need to be escaped
) -- capture group 1 end
g flag -- global (match multiple times)
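
As a quick check of what this replace call produces for a single entry, here is a small sketch (the helper name escapeSite and the second sample string are made up for illustration):

```javascript
// Same character class as above: backslash, dot, parentheses and braces
const escapeSite = site => site.replace(/([\\\.(){}])/g, '\\$1');

console.log(escapeSite("site1.com"));    // site1\.com
console.log(escapeSite("a.b(c).com"));   // a\.b\(c\)\.com
```

The class does not cover every regex metacharacter (for example * or +), which should be acceptable as long as the entries are well-formed domain names.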

Explanation of resulting /\b(site1\.com|site2\.com|site3\.com)$/ regex:

\b -- word boundary
( -- group start
site1\.com|site2\.com|site3\.com -- logically ORed and escaped sites
) -- group end
$ -- anchor at end of string
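
One caveat: \b also matches next to a hyphen, so a name such as my-site1.com would pass. A possible variant, offered only as a sketch and not as part of the original answer, replaces \b with (^|\.) so that the match must begin at the start of the string or right after a dot. The names siteList, escaped, boundaryRegex and strictSitesRegex are hypothetical, and the broader escaping character class is an assumption beyond what the answer uses:

```javascript
const siteList = ["site1.com", "site2.com", "site3.com"];

// Escape regex metacharacters in each site before joining the alternatives
const escaped = siteList.map(site => site.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));

// Pattern from the answer (word boundary) versus the stricter variant
const boundaryRegex = new RegExp('\\b(' + escaped.join('|') + ')$');
const strictSitesRegex = new RegExp('(^|\\.)(' + escaped.join('|') + ')$');

for (const name of ["site1.com", "www.site1.com", "notsite1.com", "my-site1.com"]) {
  console.log(name, boundaryRegex.test(name), strictSitesRegex.test(name));
}
// site1.com true true
// www.site1.com true true
// notsite1.com false false
// my-site1.com true false
```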

edited Dec 3, 2022 at 22:34

answered Dec 3, 2022 at 22:28

2 Comments

Atex Over a year ago

Interesting matching. I have one question regarding '\\$1'. I understand "$" itself, but I don't fully understand "$1": why does using replace make "$" appear only at the end of the whole regex, and not at the end of each element (site) separated by OR? If I understand correctly, isn't "$1" used to get the first capture group and insert the end-of-line $? And does "site" contain only one group because it is a standalone element of sites? Could you briefly explain it?

Peter Thoeny Over a year ago

The first parameter of .replace() has a regex with a capture group. You reference the captured char in the second parameter as $1. The \\ prefix will escape the captured character. For example the . char will be escaped to \. (notice that in a string literal you write \\ to get \ )

Rohìt Jíndal · Accepted Answer · 2022-12-07 07:17:07Z

1

You can achieve this by building a RegExp for each site and using RegExp.test() along with the Array.filter() method.

Live Demo :

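var sites = ["site1.com", "site2.com", "site3.com"];

const str = "subpage.site3.com";

// Keep only the sites that str equals or is a subdomain of
const isDomainAvailable = sites.filter(site => {
  // Escape the dots so they match a literal "." rather than any character
  const domain = site.replace(/[.]/g, '\\.');
  const regEx = new RegExp(`^([a-z0-9]+[.])*${domain}$`);
  return regEx.test(str);
});

console.log(isDomainAvailable.length ? 'matched' : 'not matched');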

RegEx explanation:

^ - Matches the beginning of the string
( - Opening parenthesis for grouping
[a-z0-9]+ - Matches one or more (+) lowercase alphanumeric characters
[.] - Matches a literal dot (.)
) - Closing parenthesis for grouping
* - Matches zero or more repetitions of the preceding group
${domain} - The domain name, taken dynamically from the sites array
$ - Matches the end of the string
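
A short sketch exercising this pattern against a few inputs, including the www.mysite1.com case raised in the comments. It uses Array.some() instead of Array.filter(), since only a yes/no result is needed; the names knownSites and isAllowed are hypothetical, and the dots in each site are escaped as an extra precaution:

```javascript
const knownSites = ["site1.com", "site2.com", "site3.com"];

// True if host equals a known site or is a subdomain of one
const isAllowed = host =>
  knownSites.some(site => {
    const domain = site.replace(/[.]/g, '\\.'); // match dots literally
    return new RegExp(`^([a-z0-9]+[.])*${domain}$`).test(host);
  });

console.log(isAllowed("site3.com"));          // true
console.log(isAllowed("subpage.site3.com"));  // true
console.log(isAllowed("a.b.site3.com"));      // true
console.log(isAllowed("www.mysite1.com"));    // false
console.log(isAllowed("my-app.site1.com"));   // false: [a-z0-9] excludes hyphens
```

Hyphens are valid in host names, so the character class may need widening to [a-z0-9-] if such subdomains should be accepted.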

edited Dec 7, 2022 at 7:17

answered Dec 5, 2022 at 8:36

4 Comments

Peter Thoeny Over a year ago

You need to test for a word boundary; this gives a false positive for www.mysite1.com

Rohìt Jíndal Over a year ago

@PeterThoeny It is working fine for www.mysite1.com as well. Here is the demo jsfiddle link. jsfiddle.net/fq10e368

Peter Thoeny Over a year ago

Yes, that is the point, read the OP's "Second problem was" paragraph

Rohìt Jíndal Over a year ago

@PeterThoeny - Thanks for pointing that out. I updated my answer. I hope it works as expected now.
