I have this string, which I'd like to delimit using Java's Pattern class. There is also a carriage return character after the first line. The delimiter character is |.

MSH|^~\&|Unicare^HL7CISINV10.00.16^L||IBA||||ADT^A03|3203343722|P|2.3.1|||||
EVN|A03

I used the following code.

Pattern pattern = Pattern.compile("([^|]++)*");
Matcher matcher = pattern.matcher(str);
while (matcher.find()) {
   System.out.println("Result: \"" + matcher.group() + "\"");
}

This prints an empty string for each delimiter character. I would like find() to skip these. Can the regex be modified so that the empty matches are ignored?

Thanks in advance.

  • I don't understand why you didn't try it when you've already written the code for it? Commented May 19, 2013 at 23:59
  • I checked, but it didn't work. I forgot to mention that above. Commented May 20, 2013 at 0:02
  • Ah, sorry then. I suggest this article, which you may like: vogella.com/articles/JavaRegularExpressions/article.html Commented May 20, 2013 at 0:03
  • What is "I'd like to delimit"? Commented May 20, 2013 at 0:06
  • What should your expected output look like? Does it help if you change your regex from ([^|]++)* to [^|]++? Commented May 20, 2013 at 0:22

1 Answer

I believe String#split() is simpler for your needs:

String src = "MSH|^~\\&|Unicare^HL7CISINV10.00.16^L||IBA||||ADT^A03|3203343722|P|2.3.1|||||\r\nEVN|A03\r";
String[] ss = src.split("\\|+");
for (String s : ss) {
    System.out.println(s);
}

Output:

MSH
^~\&
Unicare^HL7CISINV10.00.16^L
IBA
ADT^A03
3203343722
P
2.3.1
                                 <--- there is a \r\n in the string at this point
EVN
A03

If you would rather use Pattern, the regex [^|]+ works:

String str = "MSH|^~\\&|Unicare^HL7CISINV10.00.16^L||IBA||||ADT^A03|3203343722|P|2.3.1|||||\r\nEVN|A03\r";
String[] ss = str.split("\\|+");
for (String s : ss) {
    System.out.println("Split..: \"" + s + "\"");
}
Pattern pattern = Pattern.compile("[^|]+");
Matcher matcher = pattern.matcher(str);
while (matcher.find()) {
   System.out.println("Pattern: \"" + matcher.group() + "\"");
}

Output (exactly the same for both):

Split..: "MSH"
Split..: "^~\&"
Split..: "Unicare^HL7CISINV10.00.16^L"
Split..: "IBA"
Split..: "ADT^A03"
Split..: "3203343722"
Split..: "P"
Split..: "2.3.1"
Split..: "
EVN"
Split..: "A03
"
Pattern: "MSH"
Pattern: "^~\&"
Pattern: "Unicare^HL7CISINV10.00.16^L"
Pattern: "IBA"
Pattern: "ADT^A03"
Pattern: "3203343722"
Pattern: "P"
Pattern: "2.3.1"
Pattern: "
EVN"
Pattern: "A03
"
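
As for why the original pattern ([^|]++)* printed empty results: the outer * allows zero repetitions, so the pattern can match the empty string at any position where a | blocks it. The sketch below illustrates this on a short sample string. The class name EmptyMatchDemo, the helper findAll, and the sample input "A||B" are made up for this illustration and are not part of the question's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmptyMatchDemo {
    // Hypothetical helper: collects every match of regex in input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String sample = "A||B"; // made-up input with one empty field

        // The outer * permits zero repetitions, so empty matches show up.
        System.out.println(findAll("([^|]++)*", sample));

        // Requiring at least one non-| character rules out empty matches.
        System.out.println(findAll("[^|]+", sample));
    }
}
```

With the + quantifier and no outer *, every match must contain at least one character other than |, so the delimiters are skipped instead of being reported as empty strings.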

2 Comments

Excellent. I was thinking of using patterns, but I could use split as well.
Is it really wise to have the split pattern match multiple | characters? Couldn't || indicate a null field?!?
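
If empty fields do carry meaning, as this comment suggests they might, one option is to split on a single | and pass a negative limit so that String#split keeps empty and trailing fields. The sketch below also breaks the input into lines first, so the carriage return does not end up inside a field. The class name FieldSplitDemo and the method parse are hypothetical, and treating \r, \n, and \r\n all as line breaks is an assumption about the input.

```java
import java.util.ArrayList;
import java.util.List;

class FieldSplitDemo {
    // Hypothetical helper: one String[] of fields per non-empty line.
    static List<String[]> parse(String text) {
        List<String[]> segments = new ArrayList<>();
        // Assumption: lines end with \r\n, \r, or \n.
        for (String line : text.split("\\r\\n|\\r|\\n")) {
            if (line.isEmpty()) {
                continue; // skip blank lines between terminators
            }
            // Limit -1 keeps empty fields, including trailing ones.
            segments.add(line.split("\\|", -1));
        }
        return segments;
    }

    public static void main(String[] args) {
        String src = "MSH|^~\\&|Unicare^HL7CISINV10.00.16^L||IBA||||ADT^A03|3203343722|P|2.3.1|||||\r\nEVN|A03\r";
        for (String[] fields : parse(src)) {
            for (int i = 0; i < fields.length; i++) {
                System.out.println(i + ": \"" + fields[i] + "\"");
            }
        }
    }
}
```

With this approach each field keeps its position, so IBA stays at index 4 of the first line instead of shifting to index 3 as it does with split("\\|+").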
