I am parsing some text with Python and am running into an odd issue...

An example of the text being parsed:

msg:"ET WEB_SPECIFIC_APPS ClarkConnect Linux proxy.php XSS Attempt"; flow:established,to_server; content:"GET"; content:"script"; nocase; content:"/proxy.php?"; nocase; content:"url="; nocase; pcre:"//proxy.php(?|.[\x26\x3B])url=[^&;\x0D\x0A][<>"']/i"; reference:url,www.securityfocus.com/bid/37446/info; reference:url,doc.emergingthreats.net/2010602; classtype:web-application-attack; sid:2010602; rev:4; metadata:created_at 2010_07_30, updated_at 2010_07_30;

My regex:

msgSearch = re.search(r'msg:"(.+)";', line)

Actual result:

ET WEB_SPECIFIC_APPS ClarkConnect Linux proxy.php XSS Attempt"; flow:established,to_server; content:"GET"; content:"script"; nocase; content:"/proxy.php?"; nocase; content:"url="; nocase; pcre:"//proxy.php(?|.[\x26\x3B])url=[^&;\x0D\x0A][<>"']/i

Expected result:

ET WEB_SPECIFIC_APPS ClarkConnect Linux proxy.php XSS Attempt

I am parsing tens of thousands of lines of text, and they all give me similar results. Why does the regex pick a (seemingly) random "; to stop at? I can fix the example above by making the regex more specific, e.g. r'msg:"([\w\s\.]+)";', but other lines include different characters. I guess I could just include every special character in my regex, but I'm trying to understand why my wildcard isn't working properly.

Any help would be appreciated!

2 Answers

Try a negated character class, so the capture cannot run past the first semicolon:

re.search(r'msg:"([^;]+)";', line)
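
As a rough sketch of how this behaves, the snippet below runs the pattern against a shortened, made-up rule line (the `line` value and the variable names are assumptions for illustration, not taken from the question's data). It also tries the `[^"]+` variant mentioned in the comments further down.

```python
import re

# Hypothetical, shortened rule line used only for illustration.
line = 'msg:"ET WEB_SPECIFIC_APPS XSS Attempt"; content:"GET"; pcre:"/url=[<>]/i"; sid:2010602;'

# [^;]+ cannot cross a semicolon, so the capture ends before the first '";'.
by_semicolon = re.search(r'msg:"([^;]+)";', line)

# [^"]+ cannot cross a double quote, so it ends at the closing quote of msg.
by_quote = re.search(r'msg:"([^"]+)";', line)

msg_a = by_semicolon.group(1) if by_semicolon else None
msg_b = by_quote.group(1) if by_quote else None
print(msg_a)
print(msg_b)
```

Both variants assume the msg text itself never contains the excluded character (a semicolon or a double quote, respectively); a line that breaks that assumption would need a different pattern.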


By default, .+ is "greedy", i.e. it matches as many characters as possible. In your case, it stops at the last "; sequence in the line, not at the first one. To make it non-greedy (or lazy), use .+? instead:

 msgSearch = re.search(r'msg:"(.+?)";', line)
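
To make the difference concrete, here is a small sketch comparing the two quantifiers on a shortened, made-up rule line (the `line` value and variable names are assumptions for illustration only):

```python
import re

# Hypothetical, shortened rule line used only for illustration.
line = 'msg:"ET WEB_SPECIFIC_APPS XSS Attempt"; content:"GET"; pcre:"/url=[<>]/i"; sid:2010602;'

# Greedy: .+ grabs as much as it can, then backtracks to the LAST '";'.
greedy = re.search(r'msg:"(.+)";', line)

# Lazy: .+? grabs as little as it can, so it stops at the FIRST '";'.
lazy = re.search(r'msg:"(.+?)";', line)

greedy_msg = greedy.group(1)
lazy_msg = lazy.group(1)
print(greedy_msg)
print(lazy_msg)
```

On this sample the greedy capture runs through to the `/i` of the pcre option, while the lazy capture contains only the msg text.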

1 Comment

Thanks for the explanation about .+ being greedy! For some reason, (.+?) returned an empty string :( but the other examples using the "everything except" regex (([^;]+) and ([^\"]+)) worked out.
