I have URLs in a text that look like this:
<https://buy.itunes.apple.com/WebObjects/MZFinance.woa/wa/reportAProblem?p
=22000073760328&o=i>
I've used the following pattern to try and remove them:
re.sub(r'\<http.+?\>', '', plain, re.S)
But it doesn't catch them all. For example, this one is not removed:
<http://ax.phobos.apple.com.edgesuite.net/email/images_shared/spacer_99999\r\n9.gif>
If you write the string as a raw string (r'<http://ax.phobos.apple.com.edgesuite.net/email/images_shared/spacer_99999\r\n9.gif>') or put a double backslash (\\) in it (<http://ax.phobos.apple.com.edgesuite.net/email/images_shared/spacer_99999\\r\\n9.gif>), it will work.

re.match(r'.', '\n', re.S) works, but re.sub(r'.', '', '\n', re.S) does not. So it seems to match, but the replacing part fails somehow... really not sure where or how, though. It's as if re.S doesn't work for re.sub.
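A likely explanation for the difference: the standard-library signature is re.sub(pattern, repl, string, count=0, flags=0), so a fourth positional argument is taken as count, not as flags. re.match has no count parameter, which would explain why the same call shape works there. Below is a minimal sketch of that behaviour. The sample text and the names plain, as_count and as_flag are made up for illustration and are not taken from the original email.

```python
import re

# Hypothetical sample: a URL wrapped across two lines, as in the email text.
plain = "before <http://example.com/a\r\nb.gif> after"

# What the positional call amounts to: re.S (integer value 16) lands in
# `count`, so DOTALL is never enabled and '.' cannot cross the '\n'.
as_count = re.sub(r'<http.+?>', '', plain, count=re.S)

# Passing the flag by keyword enables DOTALL, so '.' also matches '\n'.
as_flag = re.sub(r'<http.+?>', '', plain, flags=re.S)

print(repr(as_count))
print(repr(as_flag))
```

If this is indeed the cause, changing the original call to re.sub(r'<http.+?>', '', plain, flags=re.S) should remove the wrapped URLs as well.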