I want to search a string for an identifier. The identifier can have 4 variations:
REF964758362562
REF964-758362-562
964758362562
964-758362-562
The identifier can be located anywhere in a string or on its own. Examples:
Lorem ipsum REF964-758362-562
Lorem ipsum ABCD964-758362-562 lorem ipsum
Lorem ipsum REF964-758362-562 lorem ipsum
REF964-758362-562 Lorem ipsum 1234-123456-22
Lorem ipsum 964-758362-562 lorem ipsum
REF964758362562
REF964-758362-562
964758362562
964-758362-562
When hyphens are used in the identifier, they always appear after the third and ninth digits, as shown in the examples.
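The format rule above can be written as one short pattern. The following is only a sketch under assumptions taken from the examples: the digits are grouped 3-6-3, the hyphens are either both present or both absent, and any letter prefix such as REF can be ignored because only the digits are kept. The class and method names are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative only.
class IdentifierFinder {
    // Assumed layout: 3 digits, 6 digits, 3 digits (e.g. 964-758362-562).
    // (-?) captures an optional hyphen and \2 forces the second separator
    // to be identical, so the hyphens are both present or both absent.
    // The lookarounds keep the match from starting or ending inside a
    // longer run of digits.
    static final Pattern ID =
            Pattern.compile("(?<!\\d)(\\d{3})(-?)(\\d{6})\\2(\\d{3})(?!\\d)");

    // Returns the first identifier found anywhere in the text, if any.
    static Optional<String> find(String text) {
        Matcher m = ID.matcher(text);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }
}
```

Because the search uses find(), the identifier may sit at the start, middle, or end of the input. Under these assumptions 1234-123456-22 is not reported, since it does not follow the 3-6-3 layout.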
Here is what I have come up with, but I suspect that the regular expression is getting too long and can probably be shortened. It also does not work well when the identifier is not at the beginning of the string. Any tips/ideas?
^[A-Z]*REF[A-Z]*([12]\d{3})(\d{6})(\d{2})$|^([12]\d{3})(\d{6})(\d{2})[A-Z]*REF[A-Z]*|^([12]\d{3})(\d{6})(\d{2})$
I have put them in groups because, once I have extracted the identifiers, I want to add the hyphens if the identifier does not have them. For example, if the identifier extracted is 964758362562, I want to save it as 964-758362-562.
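The hyphen-adding step described above could be done by rebuilding the identifier from its capture groups. This is a minimal sketch, again assuming the 3-6-3 digit layout from the examples; for brevity each hyphen is treated as independently optional here. The class name and the choice to return null when nothing is found are assumptions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative only.
class IdentifierNormalizer {
    // Three capture groups hold the digit blocks; the hyphens are optional
    // and deliberately left outside the groups.
    static final Pattern ID =
            Pattern.compile("(?<!\\d)(\\d{3})-?(\\d{6})-?(\\d{3})(?!\\d)");

    // Returns the first identifier in hyphenated form, or null if none is found.
    static String normalize(String text) {
        Matcher m = ID.matcher(text);
        if (!m.find()) {
            return null;
        }
        return m.group(1) + "-" + m.group(2) + "-" + m.group(3);
    }
}
```

With this approach an input that is already hyphenated comes back unchanged, so the same code path handles all four variations.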
Here are some tests I have run; as you can see, not many of them match:
String testRegex = "^[A-Z]*REF[A-Z]*([12]\\d{3})(\\d{6})(\\d{2})$|^([12]\\d{3})(\\d{6})(\\d{2})[A-Z]*REF[A-Z]*|^([12]\\d{3})(\\d{6})(\\d{2})$";
Pattern PATTERN = Pattern.compile(testRegex, Pattern.CASE_INSENSITIVE);
Matcher m = PATTERN.matcher("Lorem ipsum REF964-758362-562");
if(m.matches()) {
System.out.println("Match = " + m.group());
}else{
System.out.println("No match");
}
m = PATTERN.matcher("REF964-758362-562 Lorem ipsum 1234-123456-22");
if(m.matches()) {
System.out.println("Match = " + m.group());
}else{
System.out.println("No match");
}
m = PATTERN.matcher("Lorem ipsum 964-758362-562 lorem ipsum");
if(m.matches()) {
System.out.println("Match = " + m.group());
}else{
System.out.println("No match");
}
m = PATTERN.matcher("Lorem ipsum ABCD964-758362-562 lorem ipsum");
if(m.matches()) {
System.out.println("Match = " + m.group());
}else{
System.out.println("No match");
}
m = PATTERN.matcher("REF964758362562");
if(m.matches()) {
System.out.println("Match = " + m.group());
}else{
System.out.println("No match");
}
m = PATTERN.matcher("REF964-758362-562");
if(m.matches()) {
System.out.println("Match = " + m.group());
}else{
System.out.println("No match");
}
m = PATTERN.matcher("964758362562");
if(m.matches()) {
System.out.println("Match = " + m.group());
}else{
System.out.println("No match");
}
m = PATTERN.matcher("964-758362-562");
if(m.matches()) {
System.out.println("Match = " + m.group());
}else{
System.out.println("No match");
}
Output
No match
Match = Not known
No match
No match
No match
No match
No match
No match
No match
No match
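One likely contributor to the results above is that Matcher.matches() only succeeds when the entire input matches the pattern, whereas Matcher.find() looks for the pattern anywhere in the input. In addition, the pattern in the tests is anchored with ^ and $, contains no hyphen, and expects a first digit of 1 or 2 in a 4-6-2 layout, while the sample identifiers start with 9 and use a 3-6-3 layout. The sketch below isolates just the matches() versus find() difference, using a deliberately simplified pattern that is an assumption for illustration and not the pattern from the tests.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the simplified pattern is for illustration only.
class MatchesVersusFind {
    static final Pattern ID = Pattern.compile("\\d{3}-\\d{6}-\\d{3}");

    // True only if the whole input is an identifier.
    static boolean wholeInputMatches(String s) {
        return ID.matcher(s).matches();
    }

    // True if an identifier occurs anywhere in the input.
    static boolean containsMatch(String s) {
        return ID.matcher(s).find();
    }

    public static void main(String[] args) {
        String input = "Lorem ipsum REF964-758362-562";
        System.out.println(wholeInputMatches(input));            // false
        System.out.println(containsMatch(input));                // true
        System.out.println(wholeInputMatches("964-758362-562")); // true
    }
}
```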