I process HTML in Python with the lxml library. I am trying to parse this website; my goal is to extract all the games that took place in the regular season (not the ones in the play-offs or pre-season). The problem I have run into:
I can select all elements that have the class nob-border:
subpage.cssselect(".nob-border")
lxml provides the cssselect method, which selects HTML elements using CSS selectors. What I would like to do next is select every element up to the next tr element that has the class nob-border. The HTML looks like this:
<tr class="center nob-border">
<tr class="table-dummyrow">
<tr class="odd deactivate" xeid="IqLK6ZNT">
<tr class=" deactivate" xeid="l0Xo8yvB">
<tr class="odd deactivate" xeid="QLnrBc9b">
<tr class=" deactivate" xeid="8pxmAHO4">
<tr class="odd deactivate" xeid="nVmvCwfh">
<tr class=" deactivate" xeid="v1lEBJvn">
<tr class="center nob-border">
There are rows with the class nob-border and a bunch of rows between them. I need to select the rows in between. More specifically, I don't want to just select all of the in-between rows at once; for each row with the nob-border class, I want to select the rows that are below it and above the next row with the nob-border class. I hope I was clear enough; if not, do not hesitate to ask questions.
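To make the desired result concrete, here is a rough sketch of the grouping I have in mind. It is only an illustration under some assumptions: the rows are siblings under the same parent, the cell contents in SAMPLE are made up (the real page has different markup inside each tr), and has_class and group_rows_by_header are hypothetical helper names, not part of lxml. It walks the following siblings of each nob-border row with lxml's itersiblings and stops at the next nob-border row.

```python
from lxml import html

# Made-up sample: the tr classes and xeid values follow the snippet above,
# but the th/td contents are invented for illustration only.
SAMPLE = """
<table>
<tr class="center nob-border"><th>Regular season</th></tr>
<tr class="table-dummyrow"><td></td></tr>
<tr class="odd deactivate" xeid="IqLK6ZNT"><td>game 1</td></tr>
<tr class=" deactivate" xeid="l0Xo8yvB"><td>game 2</td></tr>
<tr class="center nob-border"><th>Play-offs</th></tr>
<tr class="odd deactivate" xeid="QLnrBc9b"><td>game 3</td></tr>
</table>
"""


def has_class(element, name):
    """Return True if `name` is one of the element's space-separated classes."""
    return name in element.get("class", "").split()


def group_rows_by_header(root, header_class="nob-border"):
    """Pair each header row with the rows that follow it, up to the next header."""
    groups = []
    for header in root.iter("tr"):
        if not has_class(header, header_class):
            continue
        rows = []
        # itersiblings yields the following siblings in document order.
        for sibling in header.itersiblings("tr"):
            if has_class(sibling, header_class):
                break  # reached the next header row, so this group is complete
            rows.append(sibling)
        groups.append((header, rows))
    return groups


root = html.fromstring(SAMPLE)
for header, rows in group_rows_by_header(root):
    print(header.text_content().strip(), [row.get("xeid") for row in rows])
```

With groups like these, the regular-season games would be the rows attached to whichever header row labels the regular season; the table-dummyrow entry has no xeid and could be filtered out.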