I have the following string, and I want to parse out the link.
final_link_str = '<td scope="row"><a href="/Archives/edgar/data/886982/000076999319000460/xslForm13F_X01/InfoTable_2019-08-09_Final.xml">InfoTable_2019-08-09_Final.html</a></td>None'
So essentially I want to grab everything between 'href="' and '">'.
The result should be: /Archives/edgar/data/886982/000076999319000460/xslForm13F_X01/InfoTable_2019-08-09_Final.xml
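To make the goal concrete, here is a minimal sketch of one way to pull out the href value with a capture group rather than lookarounds. The helper name extract_href and the pattern href_pattern are hypothetical names used only for illustration, and the sketch assumes the attribute value is always wrapped in double quotes, as it is in the string above.

```python
import re

# The input string from the question, including its trailing "None".
final_link_str = (
    '<td scope="row"><a href="/Archives/edgar/data/886982/'
    '000076999319000460/xslForm13F_X01/InfoTable_2019-08-09_Final.xml">'
    'InfoTable_2019-08-09_Final.html</a></td>None'
)

# Capture everything between href=" and the next double quote.
# Assumption: the href value is always double-quoted.
href_pattern = re.compile(r'href="([^"]*)"')


def extract_href(html_fragment):
    """Return the first href value in html_fragment, or None if there is none."""
    match = href_pattern.search(html_fragment)
    return match.group(1) if match else None


link = extract_href(final_link_str)
print(link)
```

For anything beyond a single known fragment like this one, an HTML parser is probably a safer choice than a regex.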
This is what I've tried:
import re
test = re.search('(?<=href).*?(?=.xml)', final_link_str)
Just for kicks, I also tried this, to grab everything after href:
test = re.search('(?<=href).*', final_link_str)
No matter what I do, the output is only a part of the entire link.
Here is the result I'm getting:
<re.Match object; span=(23, 163), match='="/Archives/edgar/data/886982/000076999319000460/>
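One thing worth checking here: the span in that output, (23, 163), runs to the end of the string, so the match itself may be complete and only its printed form is shortened. As far as I can tell, the repr of a Match object abbreviates the matched text for display, while group(0) returns all of it. A minimal sketch of that check, reusing the second pattern above (full_text is just an illustrative name):

```python
import re

# The input string from the question, including its trailing "None".
final_link_str = (
    '<td scope="row"><a href="/Archives/edgar/data/886982/'
    '000076999319000460/xslForm13F_X01/InfoTable_2019-08-09_Final.xml">'
    'InfoTable_2019-08-09_Final.html</a></td>None'
)

match = re.search(r'(?<=href).*', final_link_str)

# Printing the Match object shows an abbreviated preview of the matched text.
print(repr(match))

# span() reports where the match starts and ends in the input string.
print(match.span())

# group(0) returns the complete matched substring, not the preview.
full_text = match.group(0)
print(len(full_text))
print(full_text)
```

If group(0) prints the whole tail of the string, the remaining problem is the pattern, not truncation: this pattern still includes the leading '="' and everything after the link.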