How do I parse a table within a table for a simple web scraping application?
My process is:
- Create soup object of url
- find all <table> tags
- find <table> tags within the already found tables
- find the rows
- find the headers
- find the data in the table
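The steps above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes Python 3 with the bs4 package, uses the built-in "html.parser" backend instead of lxml, and runs on a made-up miniature page (the HTML string and the function name parse_inner_tables are hypothetical) in place of a fetched URL.

```python
from bs4 import BeautifulSoup

# Hypothetical miniature page: an outer layout table wrapping the data table.
HTML = """
<table>
  <tr><td>
    <table>
      <tr><th>Time</th><th>Weather</th></tr>
      <tr><td><span>12:11AM</span></td><td><span>Cloudy</span></td></tr>
      <tr><td><span>11:55PM</span></td><td><span>Mostly Clear</span></td></tr>
    </table>
  </td></tr>
</table>
"""


def parse_inner_tables(html):
    """Return one (headers, rows) pair per table nested inside another table."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for outer in soup.find_all("table"):
        # Search within the tag that was found, not the whole soup.
        for inner in outer.find_all("table"):
            headers = [th.get_text(strip=True) for th in inner.find_all("th")]
            rows = []
            for tr in inner.find_all("tr"):
                cells = [td.get_text(strip=True) for td in tr.find_all("td")]
                if cells:  # the header row has no <td> cells, so skip it
                    rows.append(cells)
            results.append((headers, rows))
    return results


tables = parse_inner_tables(HTML)
for headers, rows in tables:
    print(headers)
    for cells in rows:
        print(cells)
```

Note that with more than two levels of nesting this sketch would report the deepest tables more than once, so it only fits pages with a single layout table around the data.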
HTML code:
<table border="0" cellpadding="2" cellspacing="1" width="660">
<tbody><tr bgcolor="#CCCCCC">
<th class="IntWxHeader">Time</th>
<th class="IntWxHeader">Weather</th>
<th class="IntWxHeader">Temperature</th>
<th class="IntWxHeader">Dewpoint</th>
<th class="IntWxHeader">Humidity</th>
<th class="IntWxHeader">Pressure</th>
<th class="IntWxHeader">Winds</th>
<th class="IntWxHeader">Visibility</th>
</tr>
<tr style="line-height:20pt; background-color:#E6EFFF;"><td align="center"><span style="font-size: 16px;">12:11AM</span></td><td align="center"><span style="font-size: 16px;">Cloudy</span></td><td align="center"><span style="font-size: 16px;">25°F/-4°C</span></td><td align="center"><span style="font-size: 16px;">25°F/-4°C</span></td><td align="center"><span style="font-size: 16px;">100%</span></td><td align="center"><span style="font-size: 16px;">0.00</span></td><td align="center"><span style="font-size: 16px;">SSE 7 MPH</span></td><td align="center"><span style="font-size: 16px;">7mi.</span></td></tr><tr style="line-height:20pt; background-color:#F8FBFF;"><td align="center"><span style="font-size: 16px;">12:55AM</span></td><td align="center"><span style="font-size: 16px;">Mostly Cloudy</span></td><td align="center"><span style="font-size: 16px;">23°F/-5°C</span></td><td align="center"><span style="font-size: 16px;">23°F/-5°C</span></td><td align="center"><span style="font-size: 16px;">100%</span></td><td align="center"><span style="font-size: 16px;">0.00</span></td><td align="center"><span style="font-size: 16px;">SSE 5 MPH</span></td><td align="center"><span style="font-size: 16px;">7mi.</span></td></tr><tr style="line-height:20pt; background-color:#E6EFFF;"><td align="center"><span style="font-size: 16px;">1:55AM</span></td><td align="center"><span style="font-size: 16px;">Partly Cloudy</span></td><td align="center"><span style="font-size: 16px;">25°F/-4°C</span></td><td align="center"><span style="font-size: 16px;">23°F/-5°C</span></td><td align="center"><span style="font-size: 16px;">93%</span></td><td align="center"><span style="font-size: 16px;">0.00</span></td><td align="center"><span style="font-size: 16px;">S 5 MPH</span></td><td align="center"><span style="font-size: 16px;">7mi.</span></td></tr><tr style="line-height:20pt; background-color:#F8FBFF;"><td align="center"><span style="font-size: 16px;">2:55AM</span></td><td align="center"><span 
style="font-size: 16px;">Mostly Clear</span></td><td align="center"><span style="font-size: 16px;">19°F/-7°C</span></td><td align="center"><span style="font-size: 16px;">19°F/-7°C</span></td><td align="center"><span style="font-size: 16px;">100%</span></td><td align="center"><span style="font-size: 16px;">0.00</span></td><td align="center"><span style="font-size: 16px;">SSE 7 MPH</span></td><td align="center"><span style="font-size: 16px;">7mi.</span></td></tr><tr style="line-height:20pt; background-color:#E6EFFF;"><td align="center"><span style="font-size: 16px;">3:55AM</span></td><td align="center"><span style="font-size: 16px;">Mostly Clear</span></td><td align="center"><span style="font-size: 16px;">23°F/-5°C</span></td><td align="center"><span style="font-size: 16px;">19°F/-7°C</span></td><td align="center"><span style="font-size: 16px;">86%</span></td><td align="center"><span style="font-size: 16px;">0.00</span></td><td align="center"><span style="font-size: 16px;">SSW 6 MPH</span></td><td align="center"><span style="font-size: 16px;">7mi.</span></td></tr><tr style="line-height:20pt; background-color:#F8FBFF;"><td align="center"><span style="font-size: 16px;">4:55AM</span></td><td align="center"><span style="font-size: 16px;">Mostly Clear</span></td><td align="center"><span style="font-size: 16px;">23°F/-5°C</span></td><td align="center"><span style="font-size: 16px;">19°F/-7°C</span></td><td align="center"><span style="font-size: 16px;">86%</span></td><td align="center"><span style="font-size: 16px;">0.00</span></td><td align="center"><span style="font-size: 16px;">SSW 8 MPH</span></td><td align="center"><span style="font-size: 16px;">7mi.</span></td></tr><tr style="line-height:20pt; background-color:#E6EFFF;"><td align="center"><span style="font-size: 16px;">5:55AM</span></td><td align="center"><span style="font-size: 16px;">Mostly Clear</span></td><td align="center"><span style="font-size: 16px;">23°F/-5°C</span></td><td align="center"><span 
style="font-size: 16px;">18°F/-8°C</span></td><td align="center"><span style="font-size: 16px;">80%</span></td><td align="center"><span style="font-size: 16px;">0.00</span></td><td align="center"><span style="font-size: 16px;">WSW 3 MPH</span></td><td align="center"><span style="font-size: 16px;">7mi.</span></td></tr><tr style="line-height:20pt; background-color:#F8FBFF;"><td align="center"><span style="font-size: 16px;">6:07AM</span></td><td align="center"><span style="font-size: 16px;">Freezing Fog</span></td><td align="center"><span style="font-size: 16px;">21°F/-6°C</span></td><td align="center"><span style="font-size: 16px;">18°F/-8°C</span></td><td align="center"><span style="font-size: 16px;">86%</span></td><td align="center"><span style="font-size: 16px;">0.00</span></td><td align="center"><span style="font-size: 16px;">NW 3 MPH</span></td><td align="center"><span style="font-size: 16px;">0.5mi.</span></td></tr><tr style="line-height:20pt; background-color:#E6EFFF;"><td align="center"><span style="font-size: 16px;">6:29AM</span></td><td align="center"><span style="font-size: 16px;">Freezing Fog</span></td><td align="center"><span style="font-size: 16px;">16°F/-9°C</span></td><td align="center"><span style="font-size: 16px;">16°F/-9°C</span></td><td align="center"><span style="font-size: 16px;">100%</span></td><td align="center"><span style="font-size: 16px;">0.00</span></td><td align="center"><span style="font-size: 16px;">NW 3 MPH</span></td><td align="center"><span style="font-size: 16px;">0.25mi.</span></td></tr><tr style="line-height:20pt; background-color:#F8FBFF;"><td align="center"><span style="font-size: 16px;">6:55AM</span></td><td align="center"><span style="font-size: 16px;">Freezing Fog</span></td><td align="center"><span style="font-size: 16px;">16°F/-9°C</span></td><td align="center"><span style="font-size: 16px;">16°F/-9°C</span></td><td align="center"><span style="font-size: 16px;">100%</span></td><td align="center"><span style="font-size: 
16px;">0.00</span></td><td align="center"><span style="font-size: 16px;">WNW 3 MPH</span></td><td align="center"><span style="font-size: 16px;">0.25mi.</span></td></tr><tr style="line-height:20pt; background-color:#E6EFFF;"><td align="center"><span style="font-size: 16px;">7:55AM</span></td><td align="center"><span style="font-size: 16px;">Mist and Fog</span></td><td align="center"><span style="font-size: 16px;">18°F/-8°C</span></td><td align="center"><span style="font-size: 16px;">16°F/-9°C</span></td><td align="center"><span style="font-size: 16px;">93%</span></td><td align="center"><span style="font-size: 16px;">0.00</span></td><td align="center"><span style="font-size: 16px;">SE 3 MPH</span></td><td align="center"><span style="font-size: 16px;">mi.</span></td></tr><tr style="line-height:20pt; background-color:#F8FBFF;"><td align="center"><span style="font-size: 16px;">8:09AM</span></td><td align="center"><span style="font-size: 16px;">Mist and Fog</span></td><td align="center"><span style="font-size: 16px;">18°F/-8°C</span></td><td align="center"><span style="font-size: 16px;">16°F/-9°C</span></td><td align="center"><span style="font-size: 16px;">93%</span></td><td align="center"><span style="font-size: 16px;">0.00</span></td><td align="center"><span style="font-size: 16px;">ESE 5 MPH</span></td><td align="center"><span style="font-size: 16px;">5mi.</span></td></tr><tr style="line-height:20pt; background-color:#E6EFFF;"><td align="center"><span style="font-size: 16px;">8:55AM</span></td><td align="center"><span style="font-size: 16px;">Mostly Clear</span></td><td align="center"><span style="font-size: 16px;">25°F/-4°C</span></td><td align="center"><span style="font-size: 16px;">23°F/-5°C</span></td><td align="center"><span style="font-size: 16px;">93%</span></td><td align="center"><span style="font-size: 16px;">0.00</span></td><td align="center"><span style="font-size: 16px;">SSW 9 MPH</span></td><td align="center"><span style="font-size: 
16px;">7mi.</span></td></tr><tr style="line-height:20pt; background-color:#F8FBFF;"><td align="center"><span style="font-size: 16px;">9:55AM</span></td><td align="center"><span style="font-size: 16px;">Mostly Clear</span></td><td align="center"><span style="font-size: 16px;">36°F/2°C</span></td><td align="center"><span style="font-size: 16px;">27°F/-3°C</span></td><td align="center"><span style="font-size: 16px;">70%</span></td><td align="center"><span style="font-size: 16px;">0.00</span></td><td align="center"><span style="font-size: 16px;">SSW 6 MPH</span></td><td align="center"><span style="font-size: 16px;">25mi.</span></td></tr><tr style="line-height:20pt; background-color:#E6EFFF;"><td align="center"><span style="font-size: 16px;">10:55AM</span></td><td align="center"><span style="font-size: 16px;">Mostly Clear</span></td><td align="center"><span style="font-size: 16px;">43°F/6°C</span></td><td align="center"><span style="font-size: 16px;">23°F/-5°C</span></td><td align="center"><span style="font-size: 16px;">46%</span></td><td align="center"><span style="font-size: 16px;">0.00</span></td><td align="center"><span style="font-size: 16px;">SSW 10 MPH</span></td><td align="center"><span style="font-size: 16px;">70mi.</span></td></tr><tr style="line-height:20pt; background-color:#F8FBFF;"><td align="center"><span style="font-size: 16px;">11:55AM</span></td><td align="center"><span style="font-size: 16px;">Mostly Clear</span></td><td align="center"><span style="font-size: 16px;">48°F/9°C</span></td><td align="center"><span style="font-size: 16px;">21°F/-6°C</span></td><td align="center"><span style="font-size: 16px;">34%</span></td><td align="center"><span style="font-size: 16px;">0.00</span></td><td align="center"><span style="font-size: 16px;">SSW 9 MPH</span></td><td align="center"><span style="font-size: 16px;">70mi.</span></td></tr><tr style="line-height:20pt; background-color:#E6EFFF;"><td align="center"><span style="font-size: 16px;">12:55PM</span></td><td 
align="center"><span style="font-size: 16px;">Partly Cloudy</span></td><td align="center"><span style="font-size: 16px;">52°F/11°C</span></td><td align="center"><span style="font-size: 16px;">9°F/-13°C</span></td><td align="center"><span style="font-size: 16px;">17%</span></td><td align="center"><span style="font-size: 16px;">0.00</span></td><td align="center"><span style="font-size: 16px;">SW 8 MPH</span></td><td align="center"><span style="font-size: 16px;">70mi.</span></td></tr><tr style="line-height:20pt; background-color:#F8FBFF;"><td align="center"><span style="font-size: 16px;">1:55PM</span></td><td align="center"><span style="font-size: 16px;">Partly Cloudy</span></td><td align="center"><span style="font-size: 16px;">54°F/12°C</span></td><td align="center"><span style="font-size: 16px;">5°F/-15°C</span></td><td align="center"><span style="font-size: 16px;">14%</span></td><td align="center"><span style="font-size: 16px;">0.00</span></td><td align="center"><span style="font-size: 16px;">N 3 MPH</span></td><td align="center"><span style="font-size: 16px;">70mi.</span></td></tr><tr style="line-height:20pt; background-color:#E6EFFF;"><td align="center"><span style="font-size: 16px;">2:55PM</span></td><td align="center"><span style="font-size: 16px;">Mostly Clear</span></td><td align="center"><span style="font-size: 16px;">57°F/14°C</span></td><td align="center"><span style="font-size: 16px;">7°F/-14°C</span></td><td align="center"><span style="font-size: 16px;">13%</span></td><td align="center"><span style="font-size: 16px;">0.00</span></td><td align="center"><span style="font-size: 16px;">WNW 6 MPH</span></td><td align="center"><span style="font-size: 16px;">70mi.</span></td></tr><tr style="line-height:20pt; background-color:#F8FBFF;"><td align="center"><span style="font-size: 16px;">3:55PM</span></td><td align="center"><span style="font-size: 16px;">Mostly Clear</span></td><td align="center"><span style="font-size: 16px;">57°F/14°C</span></td><td 
align="center"><span style="font-size: 16px;">3°F/-16°C</span></td><td align="center"><span style="font-size: 16px;">11%</span></td><td align="center"><span style="font-size: 16px;">0.00</span></td><td align="center"><span style="font-size: 16px;">NNW 7 MPH</span></td><td align="center"><span style="font-size: 16px;">70mi.</span></td></tr><tr style="line-height:20pt; background-color:#E6EFFF;"><td align="center"><span style="font-size: 16px;">4:55PM</span></td><td align="center"><span style="font-size: 16px;">Mostly Clear</span></td><td align="center"><span style="font-size: 16px;">59°F/15°C</span></td><td align="center"><span style="font-size: 16px;">3°F/-16°C</span></td><td align="center"><span style="font-size: 16px;">10%</span></td><td align="center"><span style="font-size: 16px;">0.00</span></td><td align="center"><span style="font-size: 16px;">NW 12 MPH</span></td><td align="center"><span style="font-size: 16px;">70mi.</span></td></tr><tr style="line-height:20pt; background-color:#F8FBFF;"><td align="center"><span style="font-size: 16px;">5:55PM</span></td><td align="center"><span style="font-size: 16px;">Mostly Clear</span></td><td align="center"><span style="font-size: 16px;">57°F/14°C</span></td><td align="center"><span style="font-size: 16px;">-6°F/-21°C</span></td><td align="center"><span style="font-size: 16px;">7%</span></td><td align="center"><span style="font-size: 16px;">0.00</span></td><td align="center"><span style="font-size: 16px;">W 20 MPH</span></td><td align="center"><span style="font-size: 16px;">70mi.</span></td></tr><tr style="line-height:20pt; background-color:#E6EFFF;"><td align="center"><span style="font-size: 16px;">6:55PM</span></td><td align="center"><span style="font-size: 16px;">Mostly Clear</span></td><td align="center"><span style="font-size: 16px;">54°F/12°C</span></td><td align="center"><span style="font-size: 16px;">-2°F/-19°C</span></td><td align="center"><span style="font-size: 16px;">10%</span></td><td align="center"><span 
style="font-size: 16px;">0.00</span></td><td align="center"><span style="font-size: 16px;">WNW 15 MPH</span></td><td align="center"><span style="font-size: 16px;">70mi.</span></td></tr><tr style="line-height:20pt; background-color:#F8FBFF;"><td align="center"><span style="font-size: 16px;">7:55PM</span></td><td align="center"><span style="font-size: 16px;">Mostly Clear</span></td><td align="center"><span style="font-size: 16px;">46°F/8°C</span></td><td align="center"><span style="font-size: 16px;">5°F/-15°C</span></td><td align="center"><span style="font-size: 16px;">18%</span></td><td align="center"><span style="font-size: 16px;">0.00</span></td><td align="center"><span style="font-size: 16px;">NW 7 MPH</span></td><td align="center"><span style="font-size: 16px;">70mi.</span></td></tr><tr style="line-height:20pt; background-color:#E6EFFF;"><td align="center"><span style="font-size: 16px;">8:55PM</span></td><td align="center"><span style="font-size: 16px;">Mostly Clear</span></td><td align="center"><span style="font-size: 16px;">45°F/7°C</span></td><td align="center"><span style="font-size: 16px;">9°F/-13°C</span></td><td align="center"><span style="font-size: 16px;">23%</span></td><td align="center"><span style="font-size: 16px;">0.00</span></td><td align="center"><span style="font-size: 16px;">W 7 MPH</span></td><td align="center"><span style="font-size: 16px;">25mi.</span></td></tr><tr style="line-height:20pt; background-color:#F8FBFF;"><td align="center"><span style="font-size: 16px;">9:55PM</span></td><td align="center"><span style="font-size: 16px;">Mostly Clear</span></td><td align="center"><span style="font-size: 16px;">41°F/5°C</span></td><td align="center"><span style="font-size: 16px;">12°F/-11°C</span></td><td align="center"><span style="font-size: 16px;">31%</span></td><td align="center"><span style="font-size: 16px;">0.00</span></td><td align="center"><span style="font-size: 16px;">NW 9 MPH</span></td><td align="center"><span style="font-size: 
16px;">25mi.</span></td></tr><tr style="line-height:20pt; background-color:#E6EFFF;"><td align="center"><span style="font-size: 16px;">10:55PM</span></td><td align="center"><span style="font-size: 16px;">Mostly Clear</span></td><td align="center"><span style="font-size: 16px;">43°F/6°C</span></td><td align="center"><span style="font-size: 16px;">14°F/-10°C</span></td><td align="center"><span style="font-size: 16px;">31%</span></td><td align="center"><span style="font-size: 16px;">0.00</span></td><td align="center"><span style="font-size: 16px;">WNW 10 MPH</span></td><td align="center"><span style="font-size: 16px;">25mi.</span></td></tr><tr style="line-height:20pt; background-color:#F8FBFF;"><td align="center"><span style="font-size: 16px;">11:55PM</span></td><td align="center"><span style="font-size: 16px;">Mostly Clear</span></td><td align="center"><span style="font-size: 16px;">34°F/1°C</span></td><td align="center"><span style="font-size: 16px;">23°F/-5°C</span></td><td align="center"><span style="font-size: 16px;">65%</span></td><td align="center"><span style="font-size: 16px;">0.00</span></td><td align="center"><span style="font-size: 16px;">N 21 MPH</span></td><td align="center"><span style="font-size: 16px;">25mi.</span></td></tr></tbody></table>
I want to grab all of the data in the last line (
Here is what I have so far:
# Excerpt from inside a function (hence `return 1`); urlopen, URLError and
# BeautifulSoup are imported elsewhere.
try:
    resp = urlopen(url)
except URLError as e:
    print("An error occurred fetching %s \n %s" % (url, e.reason))
    return 1
soup = BeautifulSoup(resp.read(), 'lxml')

# get outer tables (findAll returns an empty list, not an exception, on no match)
outerTables = soup.findAll('table')
if not outerTables:
    print("No tables found, exiting")
    return 1

# get inner tables: search inside each outer table, not the whole soup
innerTables = []
for table in outerTables:
    innerTables.extend(table.findAll('table'))
if not innerTables:
    print("No inner tables found")
    return 1

# get rows of the inner tables
rows = []
for table in innerTables:
    rows.extend(table.findAll('tr'))
if not rows:
    print("No rows found")
    return 1

# get headers
headers = []
for row in rows:
    for th in row.findAll('th'):
        headers.append(th.string)
if not headers:
    print("No headers found")
    return 1
Am I on the right path here?
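One way to check the last two steps against the HTML above is to pair each header with the matching cell of the final data row. This is only a sketch under a few assumptions: Python 3 with the bs4 package, the built-in "html.parser" backend, "the last line" meaning the 11:55PM row, and a shortened copy of the table (three columns, two data rows) so the example stays small.

```python
from bs4 import BeautifulSoup

# Shortened copy of the table from the question: three columns, two data rows.
HTML = """
<table border="0" cellpadding="2" cellspacing="1" width="660">
<tbody><tr bgcolor="#CCCCCC">
<th class="IntWxHeader">Time</th>
<th class="IntWxHeader">Weather</th>
<th class="IntWxHeader">Temperature</th>
</tr>
<tr><td align="center"><span style="font-size: 16px;">10:55PM</span></td>
<td align="center"><span style="font-size: 16px;">Mostly Clear</span></td>
<td align="center"><span style="font-size: 16px;">43°F/6°C</span></td></tr>
<tr><td align="center"><span style="font-size: 16px;">11:55PM</span></td>
<td align="center"><span style="font-size: 16px;">Mostly Clear</span></td>
<td align="center"><span style="font-size: 16px;">34°F/1°C</span></td></tr>
</tbody></table>
"""

soup = BeautifulSoup(HTML, "html.parser")
table = soup.find("table")

# Header text comes from the <th> cells of the first row.
headers = [th.get_text(strip=True) for th in table.find_all("th")]

# Data rows are the <tr> tags that contain at least one <td>.
data_rows = [tr for tr in table.find_all("tr") if tr.find("td")]

# get_text() reaches through the <span> wrapper inside each cell.
last_cells = [td.get_text(strip=True) for td in data_rows[-1].find_all("td")]

# Map each header to the matching cell of the last row.
last_row = dict(zip(headers, last_cells))
print(last_row)
```

Using get_text() rather than .string avoids getting None back when a cell holds more than one child node.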