I am writing a simple Java source file parser in Python. The main objective is to extract a list of method declarations. A method declaration starts with public|private|protected (I assume there are no package-private, i.e. "friendly", methods without an access modifier, which is acceptable in my code base), ends with a {, and must not contain a ; anywhere in between. It may span multiple lines.
So my current regex pattern looks like:
((public|private|protected).*\n*.*?({|;))
I am not sure how to say that the entire match must not contain a ; so instead I tried to say "give me something that ends with either { or ;, whichever comes first", non-greedily. However, that doesn't work. Here is a chunk where it fails:
private static final AtomicInteger refCount = new AtomicInteger(0);
protected int getSomeVar() {
You can see that there is a variable declaration before the method declaration that starts with private but it does not have a {. So this is returned as one match and I wanted to have it as two matches, then I would be discarding the variable declaration in separate non-regex logic. But if you know how to exclude a ; before {, that would work too.
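The failure described above can be reproduced with a short script. This is only a minimal sketch: the variable names (source, pattern, matches) are made up for illustration, and the two input lines are the ones quoted above. It appears that the first .* is greedy and runs to the end of the first line, past the ;, so the lazy .*? only starts working on the second line.

```python
import re

# The two lines from the failing chunk (names below are illustrative only).
source = (
    "private static final AtomicInteger refCount = new AtomicInteger(0);\n"
    "protected int getSomeVar() {\n"
)

# The pattern from the question, unchanged.
pattern = re.compile(r"((public|private|protected).*\n*.*?({|;))")

matches = [m.group(1) for m in pattern.finditer(source)]

# Both lines come back glued together as a single match.
print(len(matches))
print(repr(matches[0]))
```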
Essentially, how do I say in a Python regular expression that a certain character (or a sub-pattern) must not occur within the main pattern?
Is (?s)(public|private|protected).*?[{;] enough, or am I missing something? (?s) sets the s modifier, which makes . (dot) match newlines as well. So .*? will match across newlines if the s modifier is set.
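As a rough sketch of how this could be used, the snippet below tries two variants on a small sample. The sample text, the setSomeVar method, and the variable names are assumptions added for illustration. The first variant is the lazy (?s) pattern, with the group made non-capturing so that the whole match is returned; declarations ending in ; are then discarded in plain Python. The second variant answers the "must not contain" part directly with a negated character class, [^;{]*, which also matches newlines without needing the s modifier. Neither variant understands comments, string literals, or identifiers that merely begin with one of the keywords, so treat it as a starting point only.

```python
import re

# Hypothetical sample: one field, one single-line method, one multi-line method.
source = (
    "private static final AtomicInteger refCount = new AtomicInteger(0);\n"
    "protected int getSomeVar() {\n"
    "public void setSomeVar(int value,\n"
    "                       boolean notify) {\n"
)

# Variant 1: lazy match up to the first { or ;, with . matching newlines.
lazy = re.compile(r"(?s)(?:public|private|protected).*?[{;]")
chunks = [m.group(0) for m in lazy.finditer(source)]
# Keep only the chunks that end with {, i.e. drop the variable declaration.
methods = [c for c in chunks if c.endswith("{")]

# Variant 2: a negated character class forbids ; (and {) before the closing {.
strict = re.compile(r"(?:public|private|protected)[^;{]*\{")
methods_strict = strict.findall(source)

print(len(chunks))    # 3: the field plus the two methods
print(len(methods))   # 2: only the methods
print(methods_strict == methods)
```

With the second variant the separate filtering step is not needed, because a candidate that hits a ; before reaching a { simply fails to match.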