How do I match a specific pattern and exclude specific sub-strings with a case-insensitive regex?

I am trying to write a regex for New Zealand address validation. The valid character set, which I want to match case-insensitively, is the letters A to Z, digits, hyphen "-" and forward slash "/", as well as the Maori accented vowels ā, ē, ī, ō, ū. The following pattern captures that set and works:

....

var regex = /^[\/A-zĀ-ū0-9\s\,\''\-]*$/;

....

To be valid, the address must also exclude the following sub-strings, case-insensitively and with or without spaces:

PO Box

Private Bag

(This is tricky, as both of those sub-strings may or may not include spaces and may be upper or lower case, depending on how the user types them.)

The string must also start with a number to be valid.

e.g.

This is invalid:

Flat 1 311 Point Chevalier Road, Point Chevalier, Auckland 1022, New Zealand

This is valid:

1/311 Point Chevalier Road, Point Chevalier, Auckland 1022, New Zealand

311/1 or 311-1 or 1-311 are all considered valid by NZ Postal Service.

For example, given the regex above, if the following if statement is true then the address string is invalid:

// Allowed character set
var regex = /^[\/A-zĀ-ū0-9\s\,\''\-]*$/;

// Get the address string (getValue() is pseudocode) and convert it to lowercase for evaluation
var str = getValue().toLowerCase();

// Strip spaces 
str = str.replace(/\s/g, '');

// The address is invalid if it contains "pobox" or "privatebag", doesn't start with a number, or doesn't match the allowed character set
if(str.includes("pobox") || str.includes("privatebag") || (str.match(/^\d/) == null) || (!regex.test(str))){

....

Thanks, I really appreciate the community's input, and I know there are regex gurus out there. I am trying to simplify this so I can use HTML5 form validation rather than a clunky JavaScript evaluation.

  • To require a digit as the first character: ^\d ... To disallow some condition at certain points in your pattern, a negative lookahead can be used, for example. Together, maybe something like this demo. Commented Aug 25, 2022 at 10:30
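The negative-lookahead idea in the comment above could be sketched as a single pattern, as follows. This is one possible reading of the suggestion, not the commenter's linked demo; the names addressPattern and isValidAddressLookahead are hypothetical, and the character set is taken from the question.

```javascript
// Assumed single-pattern form: the lookahead rejects any string containing
// "po box" or "private bag" (with or without spaces), then the rest requires
// a leading digit followed by allowed characters only.
// Note: the range Ā-ū spans U+0100 to U+016B, so it admits more than the five Maori vowels.
const addressPattern = /^(?!.*(?:po\s*box|private\s*bag))\d[\/a-zĀ-ū0-9\s,'-]*$/i;

// Hypothetical helper: returns true when the address passes all three rules.
function isValidAddressLookahead(address) {
  return addressPattern.test(address);
}

console.log(isValidAddressLookahead("1/311 Point Chevalier Road, Point Chevalier, Auckland 1022, New Zealand")); // true
console.log(isValidAddressLookahead("Flat 1 311 Point Chevalier Road, Point Chevalier, Auckland 1022, New Zealand")); // false
console.log(isValidAddressLookahead("12 PrivateBag Lane")); // false
```

Because everything is expressed in one pattern, this shape is closer to what an HTML5 pattern attribute needs. Note, however, that the pattern attribute offers no way to set the i flag, so upper and lower case would have to be spelled out in the pattern itself.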

1 Answer

One option is to first match the expressions you want to exclude, and then match the allowed character set as an alternative.

/^.*(po\s*box|private\s*bag).*$|^\d[\/a-zĀ-ū0-9\s\,\'\-]*$/i

By capturing the excluded patterns, you can check whether group 1 has a value. If it does, you know that the string should be rejected.

var regex = /^.*(po\s*box|private\s*bag).*$|^\d[\/a-zĀ-ū0-9\s\,\'\-]*$/i;
var match = str.match(regex);

if (match && !match[1]) {
   // valid address
} else {
   // invalid address
}

See https://regex101.com/r/AiL6Dy/2
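To try the approach against the sample addresses from the question, the check can be wrapped in a small function. This is only a sketch: the function name isValidAddress and the constant name excludeOrAllow are hypothetical, while the regex is the one given above.

```javascript
// Regex from the answer: the first alternative captures an excluded sub-string
// in group 1, the second matches a digit followed by allowed characters.
const excludeOrAllow = /^.*(po\s*box|private\s*bag).*$|^\d[\/a-zĀ-ū0-9\s\,\'\-]*$/i;

// Hypothetical wrapper: valid only if the regex matches and group 1 did not participate.
function isValidAddress(address) {
  const match = address.match(excludeOrAllow);
  return match !== null && match[1] === undefined;
}

console.log(isValidAddress("1/311 Point Chevalier Road, Point Chevalier, Auckland 1022, New Zealand")); // true
console.log(isValidAddress("Flat 1 311 Point Chevalier Road, Point Chevalier, Auckland 1022, New Zealand")); // false
console.log(isValidAddress("12 PO Box Road")); // false
```

The last case matches the regex, but through the first alternative, so group 1 holds "PO Box" and the address is rejected.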

4 Comments

That is just awesome and works for every scenario except one requirement, which is that the string must start with a number. If I can tweak it to include that requirement, it is perfect, e.g. jsfiddle.net/jeremy_tactical/ad71ey6k/4
Hi Arnold, I am still a real noob when it comes to regex. How do I also check that the string starts with a number, combined with the beautiful regex you wrote? I attempted var regex = /^.*(po\s*box|private\s*bag).*$|^\d|^[\/a-zĀ-ū0-9\s\,\'\-]+$/i; which is obviously wrong, though.
@JeremyLeys I've updated the regex. Now it says: start with a number, followed by zero or more valid chars. If you want the number followed by one or more valid chars, use + instead of *. Basically, that part is just a regular regexp. Use regex101 to play around with it. It also shows an explanation.
Arnold, thank you, this is a brilliant solution. You are awesome.
