I am working on scrapy , i am scraping a site and using xpath to scrape items.
But some of the div contains javascript, so when i used xpath until the div id that contains javascript code is returning an empty list,and without including that div element(which contains javascript) can able to fetch HTML data
HTML code
<div class="subContent2">
<div id="contentDetails">
<div class="eventDetails">
<h2>
<a href="javascript:;" onclick="jdevents.getEvent(117032)">Some data</a>
</h2>
</div>
</div>
</div>
Spider Code
class ExampleSpider(BaseSpider):
name = "example"
domain_name = "www.example.com"
start_urls = ["http://www.example.com/jkl/index.php"]
def parse(self, response):
hxs = HtmlXPathSelector(response)
required_data = hxs.select('//div[@class="subContent2"]/div[@id="contentDetails"]/div[@class="eventDetails"]')
So how can i get text(Some data) from the anchor tag inside the h2 element as mentioned above, is there any alternate way for fetching data from the elements that contains javascript in scrapy
