I have a little expression in PHP:
$search = array("'<(script|noscript|style|noindex)[^>]*?>.*?</(script|noscript|style|noindex)>'si",
"'<\!--.*?-->'si",
"'<[\/\!]*?[^<>]*?>'si",
"'([\r\n])[\s]+'");
$replace = array ("",
"",
" ",
"\\1 ");
$text = preg_replace($search, $replace, $this->pageHtml);
How do I run this in Python? With re.sub?
Yes, re.sub. Did you try it?
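A rough sketch of how the PHP snippet might translate, in case it helps. PHP's preg_replace with arrays applies each pattern in order to the result of the previous one, so a loop over re.sub calls should behave similarly. The `s` and `i` modifiers correspond to re.DOTALL and re.IGNORECASE, and the quote delimiters around each PHP pattern are dropped. The names `RULES` and `strip_html` are made up for this example, and `page_html` stands in for `$this->pageHtml`. Note that `\s` in Python 3 also matches Unicode whitespace, so edge cases may differ slightly from PCRE.

```python
import re

# (pattern, replacement) pairs, kept in the same order as the PHP arrays.
RULES = [
    # Remove script/noscript/style/noindex blocks together with their content.
    (re.compile(
        r"<(script|noscript|style|noindex)[^>]*?>.*?"
        r"</(script|noscript|style|noindex)>",
        re.DOTALL | re.IGNORECASE), ""),
    # Remove HTML comments.
    (re.compile(r"<!--.*?-->", re.DOTALL | re.IGNORECASE), ""),
    # Replace any remaining tag with a single space.
    (re.compile(r"<[/!]*?[^<>]*?>", re.DOTALL | re.IGNORECASE), " "),
    # Collapse whitespace after a line break to one space (no flags in PHP).
    (re.compile(r"([\r\n])[\s]+"), r"\1 "),
]


def strip_html(page_html):
    """Apply each rule in turn, feeding the result into the next one."""
    text = page_html
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text
```

You would then call it as `text = strip_html(page_html)`. For anything beyond quick cleanup, a real HTML parser is usually more robust than regular expressions.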