I am trying to parse several items from a blog, but I am unable to reach the last two items I need.
The HTML is:
<div class="post">
<div class="postHeader">
<h2 class="postTitle"><span></span><a href="http://website.com" title="cuba and the cameraman">cuba and the cameraman</a></h2>
<span class="postMonth" title="2017">Nov</span>
<span class="postDay" title="2017">24</span>
<div class="postSubTitle"><span class="postCategories"><a href="http://website.com" rel="category tag">TV Shows</a></span></div>
</div>
<div class="postContent"><p><a target="_blank" href="https://image.com/test.jpg"><img class="aligncenter" src="https://image.com/test.jpg"/></a> <br />
n/A<br />
<br />
<strong>Links:</strong> <a target='_blank' href='http://www.imdb.com/title/tt7320560/'>IMDB</a><br />
</p>
The data I need are the title "cuba and the cameraman" (code below), the image URL "https://image.com/test.jpg", and the IMDB link "http://www.imdb.com/title/tt7320560/".
So far I have only managed to parse the postTitle text of every post on the website:
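all_titles = []
url = 'http://test.com'
browser.get(url)  # browser is an already-created Selenium WebDriver
# Each postHeader div contains the title link of one post
titles = browser.find_elements_by_class_name('postHeader')
for title in titles:
    # The first <a> inside the header is the title link
    link = title.find_element_by_tag_name('a')
    all_titles.append(link.text)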
But I can't get the image and IMDB links using the same method as above (by class name). Could you help me with this? Thanks.
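
The image and the IMDB link sit inside the postContent div, which is a sibling of postHeader, so a search that starts from postHeader cannot see them. One possible direction is to loop over each div.post container and query inside it with CSS selectors. The following is only a minimal sketch: it assumes Selenium's generic find_element(by, value) / find_elements(by, value) interface, and scrape_posts as well as the selectors are hypothetical names derived from the HTML above, not something confirmed against the real site.

```python
# The literal string that selenium's By.CSS_SELECTOR constant holds.
CSS = "css selector"


def scrape_posts(browser):
    """Return one dict per div.post with its title, image URL and IMDB link.

    `browser` is assumed to be a Selenium WebDriver that has already
    loaded the page (hypothetical helper, not part of Selenium itself).
    """
    results = []
    for post in browser.find_elements(CSS, "div.post"):
        # Title text comes from the link inside the h2.postTitle
        title = post.find_element(CSS, ".postTitle a").text
        # Image URL is the src attribute of the <img> in postContent
        image = post.find_element(CSS, ".postContent img").get_attribute("src")
        # find_elements returns an empty list when nothing matches,
        # so a post without an IMDB link yields None instead of an error
        imdb_links = post.find_elements(CSS, ".postContent a[href*='imdb.com']")
        imdb = imdb_links[0].get_attribute("href") if imdb_links else None
        results.append({"title": title, "image": image, "imdb": imdb})
    return results
```

With a real driver this would be called as scrape_posts(browser) after browser.get(url). The attribute selector a[href*='imdb.com'] matches any link whose href contains "imdb.com", which avoids relying on the link's position inside the paragraph.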