0

I have a problem with my regular expression. I need to find all functions in a text. I have this regular expression: \w*\([^(]*\). It works fine as long as the text does not contain parentheses without a function name. For example, for the string 'hello world () testFunction()' it returns () and testFunction(), but I need only testFunction(). I want to use it in my C# application to parse a string passed to my method. Can anybody help me? Thanks!

2 Comments
  • I am not certain this can be done with a regular expression; at the very least, regex is not the best tool for it. All programming languages I am familiar with are syntactically context-free languages, not regular languages. Commented Apr 24, 2012 at 13:52
  • [^(]* doesn't make sense here; at least use [^)]* so that the match stops at the first closing parenthesis. Commented Apr 24, 2012 at 14:00

5 Answers

2

Programming languages have a hierarchical (nested) structure, which means that they cannot be parsed by simple regular expressions in the general case. If you want correct code that always works, you need a real parser, for example an LR parser. If you simply want a hack that will pick up most functions, use something like:

\w+\([^)]*\)

But keep in mind that this will fail in some cases. E.g. it cannot differentiate between a function definition (signature) and a function call, because it does not look at the context.
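A regular expression alone cannot track nesting depth, but a small hand-written scanner can. The following is a minimal sketch in Python (used here only for illustration; the question targets C#). `find_calls` is a hypothetical helper, not part of any library: it uses a regex to locate `name(` and then counts parentheses by hand. It assumes the text contains no string literals or comments with stray parentheses.

```python
import re


def find_calls(text):
    """Return 'name(...)' substrings whose parentheses are balanced.

    Hypothetical helper for illustration; it ignores string literals
    and comments, so it is still a heuristic rather than a real parser.
    """
    calls = []
    for match in re.finditer(r"\w+\(", text):
        depth = 0
        # Start scanning at the opening parenthesis that ends the match.
        for i in range(match.end() - 1, len(text)):
            if text[i] == "(":
                depth += 1
            elif text[i] == ")":
                depth -= 1
                if depth == 0:
                    # Found the parenthesis that closes this call.
                    calls.append(text[match.start():i + 1])
                    break
    return calls


print(find_calls("hello world () testFunction()"))
print(find_calls("outer(inner(x)) other(y)"))
```

On the question's sample string this sketch reports only `testFunction()`, and for nested input it reports the outer and the inner call separately. A call whose parentheses never close is skipped.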

1

Try \w+\([^(]*\)

Here I have changed \w* to \w+. This means that the match must contain at least one word character before the opening parenthesis.
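The difference between the two quantifiers can be checked on the string from the question. This is a small sketch in Python, chosen only for illustration; these simple patterns should behave the same way with the .NET Regex class. The variable names are made up for the example.

```python
import re

text = "hello world () testFunction()"

# \w* allows an empty name, so the bare "()" is matched as well.
zero_or_more = re.findall(r"\w*\([^(]*\)", text)

# \w+ requires at least one word character directly before "(".
one_or_more = re.findall(r"\w+\([^(]*\)", text)

print(zero_or_more)  # ['()', 'testFunction()']
print(one_or_more)   # ['testFunction()']
```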

Hope that helps

1 Comment

Thanks for your answer, but it returns hello world and testFunction :(

1

Change the * to + (if it exists in your regex implementation, otherwise do \w\w*). This will ensure that \w is matched one or more times (rather than the zero or more that you currently have).

1 Comment

Thanks for your answer, but it returns hello world and testFunction :(

1

It largely depends on the definition of "function name". For example, based on your description, you only want to filter out the "empty" names, not to find all valid names.

If your current solution is otherwise good enough and your only problem is these empty names, then try changing the * to a +, which requires at least one word character right before the opening parenthesis.

\w+\([^(]*\)

OR

\w\w*\([^(]*\)

Depending on your regexp application's syntax.
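Note that the parentheses have to be escaped as \( and \). Unescaped, they form a capture group instead of matching literal parentheses, which would explain a result such as "hello world" and "testFunction". A short sketch in Python (for illustration only; the variable names are made up) shows both variants on the string from the question:

```python
import re

text = "hello world () testFunction()"

# Unescaped: ( ... ) is a capture group, so no literal parenthesis is required.
unescaped = [m.group(0) for m in re.finditer(r"\w+([^(]*)", text)]

# Escaped: \( and \) match literal parentheses.
escaped = [m.group(0) for m in re.finditer(r"\w+\([^(]*\)", text)]

print(unescaped)  # ['hello world ', 'testFunction']
print(escaped)    # ['testFunction()']
```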

1 Comment

Thanks for your answer, but it returns hello world and testFunction :(

1

(\w+)\(

The capture group holds the function names without any parentheses; you can add them back later if you want. I assumed you don't need the parameters.

If you do need the parameters then use:

\w+\(.*\)

for a greedy regex (it matches nested function calls, but it also runs on to the last closing parenthesis on the line), or:

\w+\([^)]*\)

for a regex that stops at the first closing parenthesis (it does not handle nested function calls: for outer(inner(x)) the match is cut short at outer(inner(x) )
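To compare how these patterns treat nested calls, here is a small sketch in Python (for illustration only). The input string and variable names are made up, and the last pattern, \w+\([^()]*\), is an assumption added here rather than part of the answer: it excludes both kinds of parenthesis and so matches only innermost calls.

```python
import re

text = "outer(inner(x)) other(y)"

# Capture group only: one name per opening parenthesis, no parameters.
names = re.findall(r"(\w+)\(", text)

# Greedy .* runs to the last ")" on the line and swallows both calls.
greedy = re.findall(r"\w+\(.*\)", text)

# [^)]* stops at the first ")", so the nested call is cut short.
first_close = re.findall(r"\w+\([^)]*\)", text)

# Assumed variant: excluding "(" as well leaves only innermost calls.
innermost = re.findall(r"\w+\([^()]*\)", text)

print(names)        # ['outer', 'inner', 'other']
print(greedy)       # ['outer(inner(x)) other(y)']
print(first_close)  # ['outer(inner(x)', 'other(y)']
print(innermost)    # ['inner(x)', 'other(y)']
```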
