
I have a rather large project that uses the ICU regex classes. It can run in either single-threaded or multi-threaded mode. In the latter case, every thread initializes its own internal data (including the regexes it uses).

Originally the project stored each regular expression as a shared_ptr to a RegexPattern. I identified the RegexPattern::matcher() call as a bottleneck, because every call allocates a new RegexMatcher object. So I switched to storing a shared_ptr to a RegexMatcher instead, and now just call reset(str) before matching.

I want to stress again: regexes are not shared between threads.

This all went fine in single-threaded mode, and the app ran slightly faster, as I expected. However, when I tried to run about 10 processing threads at once, the ICU library started to give weird results: in the debug build some data was only partially initialized, and invalid values popped up here and there.

I looked at the ICU code and did not see any static state that might cause such behavior.

So my questions are (mostly caused by the lack of appropriate documentation): 1) Is it a valid scenario to store a RegexMatcher instead of a RegexPattern (RegexMatcher has a member pointing to the pattern being used)? 2) Are there any limitations on multi-threaded use of ICU regexes that are not listed in the documentation?

Just to note: my dev platform is Visual C++ 2010, compiling for Win32.

Note: I was not able to reproduce this behavior in an isolated test application that only does regex matching in 10 threads simultaneously, which is why the questions are rather open-ended.

  • Probably worth filing a bug for, especially if you can include a small test case. Commented Mar 19, 2012 at 16:29
  • Creating a small test case is problematic. I'll probably spend some more time debugging the issue and then post the results. Commented Mar 20, 2012 at 17:13

1 Answer


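Actually, I was wrong: there is a case where a single regex is used from different threads. Obviously, that causes issues when a RegexMatcher is stored instead of a RegexPattern, since a RegexMatcher carries mutable per-match state (the current input and match position) that concurrent reset() and match calls overwrite.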
