This is my code. I am trying to access a review from this site, but it is showing an error.

class DomainCrawlSpider(BaseSpider):
    name = "Spider"
    allowed_domains = ["www.smahavarkar.wordpress.com"]
    start_urls = "http://smahavarkar.wordpress.com/"

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        titles = hxs.select("//p")
        items = []
        for titles in titles:
            item = DItem()
            item ["address"] = titles.select("a/text()").extract()
            item ["review1"] = titles.select("p/text()").extract()
            item.append(item)
        return item
  • ValueError: Missing scheme in request url: h (commented Dec 3, 2014 at 14:58)

2 Answers

2

start_urls should be a list. Try changing it to:

start_urls = ["https://www.zomato.com/cs/mumbai/restaurace?q=pop%20tates"]
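As a rough illustration of why a plain string fails here: iterating over a string yields single characters, so the first "URL" the spider would see is just "h", which has no scheme. That matches the "Missing scheme in request url: h" message. The sketch below uses only the standard library; the names bad_start_urls, good_start_urls and schemes are made up for this example and are not part of Scrapy.

```python
from urllib.parse import urlparse

# A string instead of a list, as in the question.
bad_start_urls = "http://smahavarkar.wordpress.com/"
# The same URL wrapped in a list.
good_start_urls = ["http://smahavarkar.wordpress.com/"]


def schemes(start_urls):
    """Return the URL scheme of every element produced by iterating start_urls."""
    return [urlparse(url).scheme for url in start_urls]


# Iterating the string gives single characters; the first one is "h".
first_bad = next(iter(bad_start_urls))
print(repr(first_bad), repr(urlparse(first_bad).scheme))

# Iterating the list gives the whole URL, which has a scheme.
print(schemes(good_start_urls))
```

Any iterable of complete URL strings avoids the problem; a list is simply the most common choice.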
4 Comments

what is the full traceback?
though I'm pretty sure your problem is that it should be items.append(item).
Sorry for the mistake, but how do I fix it? It seems all the code is correct.
item.append(item) becomes items.append(item)
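A minimal sketch of the corrected collection loop, assuming plain dicts in place of DItem and (address, review) tuples in place of the selector results, so that it runs without Scrapy. The names collect_items and sample_rows are hypothetical. It appends each item to the items list, returns that list rather than the last item, and uses a loop variable that does not shadow the collection being iterated.

```python
def collect_items(rows):
    """Build one item per row and return all of them.

    rows is an iterable of (address, review) pairs standing in for
    the values the selectors would extract from each paragraph.
    """
    items = []
    for address, review in rows:
        item = {"address": address, "review1": review}
        items.append(item)  # append to the list, not to the item itself
    return items  # return the whole list, not just the last item


sample_rows = [("Link A", "Good food"), ("Link B", "Slow service")]
result = collect_items(sample_rows)
print(result)
```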
0

Change start_urls to a one-element tuple (note the trailing comma):

start_urls = ("http://smahavarkar.wordpress.com/",)

It worked for me.
