I'm working on an app that receives customer feedback about a particular product via email. Currently I'm using Java's Matcher and Pattern classes to parse certain snippets and keywords with regexes.

GIVEN: Customers email us from any email client all over the world (except APAC).

ASK: Do I need to prefix all my regexes with \\p{L} and/or \\p{M}, or can I just wrap each regex in \\Q and \\E, as in \\Q<my regex>\\E?

Wait, what does this have to do with Unicode support? \Q and \E are there to quote a string literal within a regex (like Pattern.quote(), as the answer says). Commented Apr 3, 2014 at 19:41

1 Answer

You could try:

Pattern.quote(yourString);

It's the equivalent of wrapping the string in \Q ... \E.

\Q ... \E is used for literal matching, meaning you would need to know the string to match in advance. For example, \Qпривет мир\E would match:

привет мир

However, \Q.*\E would not act as a wildcard; it would match only the literal text:

.*
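
To make the literal-matching behaviour concrete, here is a minimal sketch. The class name LiteralQuoting and the helper fullMatch are made up for illustration; only Pattern.quote, Pattern.compile and Matcher.matches come from java.util.regex.

```java
import java.util.regex.Pattern;

public class LiteralQuoting {
    // Hypothetical helper: true if the whole input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Pattern.quote wraps the text in \Q ... \E for us.
        String quoted = Pattern.quote(".*");
        System.out.println(quoted);                         // \Q.*\E

        // Inside the quoted region, "." and "*" are plain characters.
        System.out.println(fullMatch(quoted, ".*"));        // true
        System.out.println(fullMatch(quoted, "anything"));  // false

        // Unquoted, the same two characters behave as a wildcard.
        System.out.println(fullMatch(".*", "anything"));    // true
    }
}
```

So quoting does nothing for Unicode by itself; it only switches metacharacters off for the text in between.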

So if you're looking to match strings like привет мир or merhaba dünya, you would want something such as (?:\p{L}\p{M}*)+ (a letter followed by any combining marks, repeated), which would capture each word (привет, мир, etc.), or perhaps \X+, which would capture the whole string привет мир. Note that java.util.regex only supports \X from Java 9 onward.
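
As a rough sketch of the letter-and-mark approach in Java: the class UnicodeWords and the method words are hypothetical names, and the pattern used here puts \p{M}* after \p{L} because combining marks follow their base character. Treat it as one possible variant, not the only way to write it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodeWords {
    // One "word": a letter followed by any combining marks, repeated.
    private static final Pattern WORD = Pattern.compile("(?:\\p{L}\\p{M}*)+");

    // Hypothetical helper: collects every word-like run in the input.
    static List<String> words(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = WORD.matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // Cyrillic letters are all \p{L}, so each word is one match.
        System.out.println(words("привет мир"));

        // Decomposed form: "u" followed by U+0308 COMBINING DIAERESIS.
        // The \p{M}* part keeps the mark attached to its word.
        System.out.println(words("merhaba du\u0308nya"));
    }
}
```

Both calls return two words. Without the \p{M}* part, the decomposed "dünya" would be split at the combining mark.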

3 Comments

Thanks. However, that seems to work for handling Unicode code points and marks, but it breaks other test cases where metacharacters are used. Also, "." (period) is treated as a literal instead of its meta equivalent (especially for multiline and dotall).
Yeah, the issue is that sometimes I don't know a specific string. Sometimes I want to match a bunch of stuff like ^startjunk.*endjunk:$. Also, there are other Unicode surrogates and marks (\uXXXX \uYYY or \uZZZZ) that are getting inserted in between and are breaking some of my regexes. The point is that I don't know exactly what I can search for to escape as a literal. And like I said, a lot of these regexes use dotall. I can play around with your edited suggestion.
Yes, I understand. Take a look at this example: regex101.com/r/gK2kQ4. It might shed some light on how to combine the two.
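
For what it's worth, one way to combine the two in Java is to quote only the pieces you know are literal and leave the metacharacters outside the quoted regions. This is a sketch under that assumption, not necessarily what the linked example shows; MixedPattern and between are hypothetical names.

```java
import java.util.regex.Pattern;

public class MixedPattern {
    // Hypothetical helper: quotes the known literal ends and keeps
    // ".*" as a live metacharacter between them.
    static Pattern between(String start, String end) {
        String regex = "^" + Pattern.quote(start) + ".*" + Pattern.quote(end) + "$";
        // DOTALL lets "." match line terminators as well.
        return Pattern.compile(regex, Pattern.DOTALL);
    }

    public static void main(String[] args) {
        Pattern p = between("startjunk", "endjunk:");

        // The middle may contain any text, including non-ASCII and newlines.
        System.out.println(p.matcher("startjunk привет\nмир endjunk:").matches()); // true
        System.out.println(p.matcher("startjunk only").matches());                 // false

        // Metacharacters inside the quoted pieces stay literal.
        Pattern q = between("a.b", "(end)");
        System.out.println(q.matcher("a.b xyz (end)").matches());                  // true
        System.out.println(q.matcher("axb xyz (end)").matches());                  // false
    }
}
```

This keeps "." working as a wildcard under dotall while the literal parts are still protected.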
