I'm using BeautifulSoup to extract some data from a search results page on this website: http://www.cpso.on.ca/docsearch/default.aspx
Here's a sample of the HTML after running it through .prettify():
<tr>
<td>
<a class="doctor" href="details.aspx?view=1&id= 72374">
Smith, Jane
</a>
(#72374)
</td>
<td>
Suite 042
<br />
21 Jump St
<br />
Toronto ON M4C 5T2
<br />
Phone: (555) 555-5555
<br />
Fax: (555) 555-555
</td>
<td align="center">
</td>
</tr>
Essentially every <tr> block has 3 <td> blocks.
I want the output to be
Smith, Jane Suite 042 21 Jump St Toronto ON M4C 5T2
I also have to separate entries by a new line.
I'm having a problem writing the address, which is stored in the 2nd <td> block.
I'm also writing this to a file.
Here's what I have so far... it doesn't work :p
for tr in soup.findAll('tr'):
    #td1 = tr.td
    td2 = tr.td.nextSibling.nextSibling
    for a in tr.findAll('a'):
        target.write(a.string)
        target.write(" ")
    for i in range(len(td2.contents)):
        if i != None:
            target.write(td2.contents[i].string)
            target.write(" ")
    target.write("\n")
The for loop is missing a :, and the inner loop isn't indented. Is that the actual code or a posting mistake?
... <a></a>, so why do you expect your code to print it?
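For what it's worth, here is a minimal sketch of one way to get the output described in the question. As far as I can tell, the posted loop fails because `td2.contents` also contains the `<br />` tags, whose `.string` is `None`, while `if i != None` tests the index (which is never `None`) rather than the string. The sketch assumes the newer `bs4` package with its `find_all` and `stripped_strings` names instead of the older `findAll` API in the question. The helper names (`format_row`, `extract_rows`, `write_rows`) are made up for illustration, and dropping the lines that start with "Phone:" or "Fax:" is an assumption based on the desired output.

```python
import sys

from bs4 import BeautifulSoup

# Sample row copied from the question, wrapped in a <table> for completeness.
SAMPLE_HTML = """
<table>
<tr>
<td>
<a class="doctor" href="details.aspx?view=1&id= 72374">
Smith, Jane
</a>
(#72374)
</td>
<td>
Suite 042
<br />
21 Jump St
<br />
Toronto ON M4C 5T2
<br />
Phone: (555) 555-5555
<br />
Fax: (555) 555-555
</td>
<td align="center">
</td>
</tr>
</table>
"""


def format_row(tr):
    """Return 'Name address...' for one <tr>, or None if it is not a result row."""
    cells = tr.find_all("td")
    link = tr.find("a", class_="doctor")
    if len(cells) < 2 or link is None:
        return None
    name = link.get_text(strip=True)
    # stripped_strings yields only non-empty text nodes, so <br /> tags are skipped.
    # Assumption: phone and fax lines are unwanted, as in the desired output.
    address = [
        line for line in cells[1].stripped_strings
        if not line.startswith(("Phone:", "Fax:"))
    ]
    return " ".join([name] + address)


def extract_rows(html):
    """Return one formatted string per result row found in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    rows = (format_row(tr) for tr in soup.find_all("tr"))
    return [row for row in rows if row is not None]


def write_rows(html, target):
    """Write each formatted row to a file-like object, one entry per line."""
    for row in extract_rows(html):
        target.write(row + "\n")


if __name__ == "__main__":
    write_rows(SAMPLE_HTML, sys.stdout)
```

To write to a file instead, pass an open file object as `target`, for example `with open("output.txt", "w") as target: write_rows(html, target)`, where `output.txt` is just a placeholder name.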