I have various data that I need to parse in order to extract a weight from it.
I'm using:
- C++11
- std::regex
- Debian 9.9
- gcc 6.3.0
The problem is that a segmentation fault sometimes occurs, although very rarely.
The inputs that trigger it consist mostly of space and newline characters.
Here is the regex:
(?:\b(?:(kilogram\.*s*\.*|kg\.*s*\.*)(?:[^[:alnum:]])*)(?:\s*weight\s*)*(?:\s*is\s*|\s*are\s*)*)\W*([\d\.,]*\d+\b)|(?:(?:[\s\.]?|^)([\d\.,]*\d+)\W*(kilogram\.*s*\.*|kg\.*s*\.*)\b)
Example regex that works on regex101.com but causes a segmentation fault in C++ on my Debian server: regex101
Here are some more regex101 examples of input, to quickly give an idea of what the regex is searching for.
Here is an example of C++ code that fails.
And here is the same C++ code that works, but using another online compiler (cpp.sh).
Can someone please help me solve this segmentation fault problem?
Thank you.
(?:\s*weight\s*)* is a killer pattern that causes too much backtracking.
Use this instead:
\b(k(?:ilogram|g)\.*s*\.*)\W*(?:\s+weight)?(?:\s+(?:is|are))?\W*(\d[0-9.,]*\b)|[\s.]?(\d[0-9.,]*)\W*(k(?:ilogram|g)\.*s*\.*)\b
It is ECMAScript compatible. See demo.
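For illustration, here is a minimal C++11 sketch of how the rewritten pattern might be used with std::regex. The names Weight and extract_weight are hypothetical and do not come from the original code. Matching is case-sensitive here, which is an assumption; pass std::regex::icase as well if the original code did. This is a sketch under those assumptions, not a guarantee that every input is safe with std::regex.

```cpp
#include <regex>
#include <string>

// Hypothetical result type for this sketch: the matched number and unit as text.
struct Weight {
    bool found;
    std::string value;  // e.g. "12.5"
    std::string unit;   // e.g. "kg"
};

// Hypothetical helper (not from the original post): searches `text` with the
// rewritten pattern and reports the first weight it finds.
Weight extract_weight(const std::string& text) {
    // Compiled once; a raw string literal avoids double-escaping backslashes.
    static const std::regex re(
        R"(\b(k(?:ilogram|g)\.*s*\.*)\W*(?:\s+weight)?(?:\s+(?:is|are))?\W*(\d[0-9.,]*\b)|[\s.]?(\d[0-9.,]*)\W*(k(?:ilogram|g)\.*s*\.*)\b)",
        std::regex::ECMAScript);

    std::smatch m;
    if (!std::regex_search(text, m, re)) {
        return {false, "", ""};
    }
    // First alternative: unit comes first (groups 1 and 2).
    if (m[1].matched) {
        return {true, m[2].str(), m[1].str()};
    }
    // Second alternative: number comes first (groups 3 and 4).
    return {true, m[3].str(), m[4].str()};
}
```

Calling extract_weight("12.5 kg") should report the value 12.5 with the unit kg, and an input made up only of spaces and newlines should come back with found set to false.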