
I have sample C++ code (http://pastebin.com/6q7zs7tc) from which I have to extract function names as well as the number of parameters each function takes. So far I have written this regex, but it's not working perfectly for me.

(?![a-z])[^\:,>,\.]([a-z,A-Z]+[_]*[a-z,A-Z]*)+[(]
  • 1
    First, you should say what you expect it to do and what it actually does that is not perfect. Second, you probably need some sort of parser a bit more powerful than regex. C++ is not a regular language (and people will argue all day about whether it is context free - please don't go there). Commented Mar 3, 2015 at 13:55
  • 1
    No regex can do this job perfectly. Doing this truly correctly (or even close to it) is a seriously non-trivial task, but if you really need to do it, you can use something like Clang (though even just making use of Clang isn't trivial). Commented Mar 3, 2015 at 13:57
  • You can't do this with a regex. Commented Mar 3, 2015 at 17:12

1 Answer


You can't parse C++ reliably with regex.

In fact, you can't parse it with weak parsing technology (see Why can't C++ be parsed with a LR(1) parser?). If you expect to extract this information reliably from source files, you will need a time-tested C++ parser; see https://stackoverflow.com/a/28825789/120163

If you don't care that your extraction process is flaky, then you can use a regex and maybe some additional hackery. Your key problem for heuristic extraction is matching various kinds of brackets, e.g., [...], < ... > (which won't quite work for shift operators) and { ... }. Bracket matching requires you to keep a stack of seen brackets. And bracket matching may fail in the presence of macros and preprocessor conditionals.
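To make the bracket-stack idea concrete, here is a minimal sketch of the heuristic, not a reliable parser. It assumes you have already isolated the text between a function's outer parentheses (for example with your regex), and it counts commas that are not nested inside any bracket. The name `count_params` is made up for illustration. Because it treats `<` and `>` as brackets, it gives up (returns -1) on things like a default argument containing a shift operator, which is exactly the flakiness described above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Heuristic only: counts the parameters in the text found between a
// function's outer parentheses by counting top-level commas.
// '<' and '>' are treated as brackets, so comparison or shift operators
// in default arguments are misread. Macros and preprocessor conditionals
// are not handled at all.
// Returns -1 if the brackets do not balance.
int count_params(const std::string& params) {
    // Trim surrounding whitespace; "" and "void" both mean no parameters.
    const char* ws = " \t\r\n";
    std::size_t first = params.find_first_not_of(ws);
    if (first == std::string::npos) return 0;
    std::size_t last = params.find_last_not_of(ws);
    std::string text = params.substr(first, last - first + 1);
    if (text == "void") return 0;

    std::vector<char> open;  // stack of currently open brackets
    int commas = 0;          // commas seen outside all brackets

    for (char c : text) {
        switch (c) {
        case '(': case '[': case '{': case '<':
            open.push_back(c);
            break;
        case ')': case ']': case '}': case '>': {
            char expected = c == ')' ? '('
                          : c == ']' ? '['
                          : c == '}' ? '{'
                          : '<';
            if (open.empty() || open.back() != expected) return -1;
            open.pop_back();
            break;
        }
        case ',':
            if (open.empty()) ++commas;  // only top-level commas separate parameters
            break;
        default:
            break;
        }
    }
    if (!open.empty()) return -1;
    return commas + 1;
}
```

For instance, `count_params("std::map<int, std::string> m, int (*cb)(int, int)")` yields 2 because the commas inside `<...>` and `(...)` are skipped, while `count_params("int x = 1 << 2")` yields -1 because the shift operator is mistaken for two unclosed angle brackets.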
