I've got a MongoDB collection, which looks like this:
// sites
// note that these urls all have paths, this is important.
// The path can only be longer, e.g. amazon.com/Coffee-Mug
[
  {
    name: "MySite",
    urls: ['google.com/search', 'amazon.com/Coffee', 'amazon.com/Mug']
  },
  {
    name: "OtherSite",
    urls: ['google.com/search', 'microsoft.com/en-us']
  }
]
What I'm trying to do is the following:
class Service {
  /**
   * @param findUrl A full url, like "https://www.google.com/search?q=stackoverflow"
   * or "https://www.amazon.com/Coffee-Program-Ceramic-Makes-Programmers/dp/B07D2XJLLG/"
   */
  public async lookup(findUrl: string) {
    const trimmed = trim(findUrl); // remove query variables and https, but NOT the path!
    // return the "Site" in which the base url is matched with the full url
    // see description below
  }
}
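To make the trimming step concrete, here is a minimal sketch of what trim could look like. It is only an assumption about the intended behaviour: it drops the scheme, a leading "www." (the stored urls above have none), the query string or fragment, and any trailing slash, and it leaves the path alone.

```typescript
// Hypothetical implementation of the trim() helper used in lookup().
// Assumption: stored urls look like "host/path" without scheme or "www.".
function trim(findUrl: string): string {
  return findUrl
    .replace(/^[a-z][a-z0-9+.-]*:\/\//i, "") // drop the scheme, e.g. "https://"
    .replace(/^www\./i, "")                  // drop a leading "www."
    .replace(/[?#].*$/, "")                  // drop query string and fragment
    .replace(/\/+$/, "");                    // drop trailing slashes
}
```

With this sketch, "https://www.google.com/search?q=stackoverflow" becomes "google.com/search", and the path of the amazon url is kept in full.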
For example, given these cases:
Case 1:
- url = 'https://www.amazon.com/Coffee-Program-Ceramic-Makes-Programmers/dp/B07D2XJLLG/'
- returned site(s): [MySite]
Case 2:
- url = 'https://www.google.com/search?q=stackoverflow'
- returned site(s): [MySite, OtherSite]
Case 3 (same as case 1 but with other value):
- url = 'https://www.microsoft.com/en-us/surface'
- returned site(s): [OtherSite]
Case 4 (when not to match):
- url = 'https://microsoft.com/nl-nl' OR url = 'https://microsoft.com'
- returned site(s): []
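To pin down the matching rule these cases imply, here is a small in-memory sketch. It is not a MongoDB query; the Site interface and the lookupInMemory function are hypothetical names used only for illustration, and the inputs are assumed to be already trimmed to "host/path" form. A site matches when one of its stored urls is found at the very start of the trimmed url.

```typescript
// Hypothetical shape of a document in the sites collection.
interface Site {
  name: string;
  urls: string[];
}

const sites: Site[] = [
  { name: "MySite", urls: ["google.com/search", "amazon.com/Coffee", "amazon.com/Mug"] },
  { name: "OtherSite", urls: ["google.com/search", "microsoft.com/en-us"] },
];

// Returns the names of all sites that have at least one stored url
// which is a prefix of the (already trimmed) url.
function lookupInMemory(trimmed: string, all: Site[]): string[] {
  return all
    .filter((site) => site.urls.some((base) => trimmed.indexOf(base) === 0))
    .map((site) => site.name);
}
```

Because the check is a plain prefix test, "microsoft.com" and "microsoft.com/nl-nl" match nothing: the stored "microsoft.com/en-us" is not a prefix of either.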
I've tried to do something like this:
Site.find({ urls: { $in: [trimmed] } })
This kind of works, but it only finds exact matches. I want a site to match when one of the urls stored in MongoDB is the start of the url passed to the function. How can I do this?
It was suggested that I use "check if field is substring of a string" or "text search on MongoDB", but both are too inaccurate. With those I can enter just the base domain without a path and it will still find a site, which is definitely not supposed to happen.
The $regexMatch aggregation only works on strings, not string arrays. The $match does not return anything at all. Using:

db.getCollection("sites").aggregate(
  [
    { $match: { $or: [ { urls: { $in: ["microsoft.com/en-us/dadadadada/blablabla"] } } ] } }
  ],
  { collation: { locale: "en", strength: 1 } }
);

Nonetheless, regexMatchAll seems to be a thing for arrays, but that's something I'm figuring out currently.
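One possible direction, sketched under assumptions rather than tested against a real collection: instead of a regex, check each array element individually with $map and $indexOfCP inside $expr, and keep the document when any stored url occurs at position 0 of the trimmed url. The function name buildPrefixFilter is hypothetical; the operators used ($expr, $anyElementTrue, $map, $ifNull, $indexOfCP, $literal) are standard MongoDB aggregation operators, and the sketch assumes every element of urls is a string.

```typescript
// Hypothetical helper: builds a filter that matches documents where at
// least one element of "urls" is a prefix of the trimmed url.
function buildPrefixFilter(trimmed: string) {
  return {
    $expr: {
      $anyElementTrue: [
        {
          $map: {
            // Treat a missing "urls" field as an empty array.
            input: { $ifNull: ["$urls", []] },
            as: "base",
            // $indexOfCP returns 0 when "$$base" starts at the first character.
            // $literal keeps a trimmed url beginning with "$" from being read as a field path.
            in: { $eq: [{ $indexOfCP: [{ $literal: trimmed }, "$$base"] }, 0] },
          },
        },
      ],
    },
  };
}
```

Assuming Site is a model or collection whose find accepts a regular query filter, this would be used as Site.find(buildPrefixFilter(trimmed)). A filter like this most likely cannot use an index on urls, so it is worth measuring on a large collection.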