Hi All/Python'ers/RegEx'ers,
I'm working on a lab exercise to learn Python's re package. I want to grab only the data between the HTML tags in the line below. I tried "[^(</?\w+>)]\d+", intending to exclude every HTML tag (TBODY, TD, /TD, etc.).
It misses the first value, 1850. Here is the data:
<TBODY><TR><TD>1850</TD><TD>John</TD><TD>-0.339</TD><TD>-0.425</TD></TR></TBODY>
This is the call I'm trying (written as a raw string so the backslashes reach the regex engine unchanged):
re.findall(r"[^(<\/?\w+>)]\d+", html_line)
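For anyone who wants to reproduce this, here is a minimal self-contained version of that call. The variable names html_line, pattern and matches are just what I picked for the example; the data line is the one above.

```python
import re

# The sample line from the post, split across two literals for readability.
html_line = ("<TBODY><TR><TD>1850</TD><TD>John</TD>"
             "<TD>-0.339</TD><TD>-0.425</TD></TR></TBODY>")

# Raw string so \w and \d are passed to the regex engine as written.
pattern = r"[^(<\/?\w+>)]\d+"

# findall returns every non-overlapping match of the whole pattern.
matches = re.findall(pattern, html_line)
print(matches)  # ['-0', '.339', '-0', '.425']
```

So on this line the only pieces that match are a "-" or "." followed by digits; 1850 never shows up, and the decimal values come back split in two.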
Using the grouping "(<\/?\w+>)" gets me all the HTML tags. I just want to exclude ALL HTML tags,
i.e. the exact opposite, so I tried [^(<\/?\w+>)]
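From what I understand of the re docs, square brackets always build a character class, so [^( ... )] negates a set of single characters ("(", "<", "/", "?", any \w character, "+", ">", ")") rather than the grouped tag pattern. One direction that might give the "opposite" is to split the line on the tag pattern and keep the non-empty pieces. This is only a rough sketch, assuming the tags carry no attributes (true for my sample line); the names tag_pattern and text_between_tags are made up for the example.

```python
import re

# The sample line from the post.
html_line = ("<TBODY><TR><TD>1850</TD><TD>John</TD>"
             "<TD>-0.339</TD><TD>-0.425</TD></TR></TBODY>")

# Same tag pattern as before, but without the capturing group:
# re.split would otherwise include the captured tags in its result.
tag_pattern = r"</?\w+>"

# Split on every tag, then drop the empty strings left between adjacent tags.
pieces = re.split(tag_pattern, html_line)
text_between_tags = [piece for piece in pieces if piece]

print(text_between_tags)  # ['1850', 'John', '-0.339', '-0.425']
```

This keeps everything between tags, text as well as numbers, so a further filter would be needed if only the numeric values are wanted.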
Thanks in advance. N. PS: Part of the problem is that I shouldn't be using BeautifulSoup.