I'm trying to make an expression that will search through a page like how2bypass.co.cc and return the contents of the "action" attribute in the "form" tag, and the contents of the "name" and "type" attributes in any input tags. I can't use an html parser because my ultimate goal is to automatically detect if a given page is a web proxy, and once sites catch on that I'm doing that they're probably going to start doing silly things like writing the entire document with javascript to stop me from parsing it.
I'm using the code
preg_match_all('/<form.*action\="(.*?)".*>[^<]*<input.*type\=/i', $pageContents, $inputMatches);
which works fine for the action attribute, but once I put a " after type\= the code stops working. why is this? It works fine once, but not twice?