Scrape data from multiple webpages using Scrapy

Question

I am trying to extract the phone titles (and eventually other data) from multiple web pages using Scrapy, with the work split across separate callback methods. The "parse" method is supposed to collect all of the page links, and it does so correctly if I let it yield its results to a CSV. However, when I add a second callback, "parse_pages", it never seems to run, and I cannot get a CSV output of just the titles for each page.

Note: I recognize that the indentation of the functions below is wrong.

import scrapy
from scrapy.http import Request

url = 'https://www.gsmarena.com/'

class PhonelinksSpider(scrapy.Spider):
    name = 'phonelinks'
    allowed_domains = ['www.gsmarena.com/results.php3?']
    start_urls = ['https://www.gsmarena.com/results.php3?']

    def parse(self, response):
        links = response.xpath('//div[@class="makers"]/ul/li/a/@href').extract()
        for link in links:
            location = url+link
            yield response.follow(url = location,callback = self.parse_pages)



    def parse_pages(self, response):
       phones = response.xpath('//h1[contains(@class,"specs-phone-name-title")]/text()').extract_first().strip()
       for title in phones:
           phone_list = {'phone':title}
           yield phone_list

Where do you call these functions? Please see: minimal reproducible example.
– AMC, commented Nov 11, 2019 at 21:29

gangabass · Accepted Answer · 2019-11-11 21:30:54Z

The problem is on this line:

phones = response.xpath('//h1[contains(@class,"specs-phone-name-title")]/text()').extract_first().strip()

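extract_first() returns a single string (or None if nothing matched), not a list. That is why you can't usefully iterate over it on the next line: the for loop walks the characters of the title rather than a list of phones. Each phone page has one title, so parse_pages only needs to yield one item.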
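The difference is easy to reproduce without Scrapy. This is only an illustration: the title string below is a made-up stand-in for whatever extract_first().strip() returns on a real page, and buggy_items and fixed_items are names chosen for this sketch.

```python
# Hypothetical title, standing in for the result of extract_first().strip().
title = "  Apple iPhone 11  ".strip()

# Looping over a string visits its characters, as the question's for loop does.
buggy_items = [{'phone': ch} for ch in title]

# Using the string as a whole gives one item per page.
fixed_items = [{'phone': title}]

print(len(buggy_items), buggy_items[:3])
print(fixed_items)
```

Applied to the spider, that means dropping the loop.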

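def parse_pages(self, response):
    # extract_first() gives one string; default='' avoids calling .strip() on None
    title = response.xpath('//h1[contains(@class,"specs-phone-name-title")]/text()').extract_first(default='').strip()
    # one item per phone page
    yield {'phone': title}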
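Since the question says parse_pages never seems to run, it may also be worth checking allowed_domains. That attribute is meant to hold bare domain names, while the value in the question includes a path and a query marker. As far as I understand Scrapy's offsite filtering, followed requests whose host does not match an entry are dropped before their callback is called. The sketch below is not Scrapy's code: is_allowed is a hypothetical helper that only approximates that host comparison, and the page URL is invented for the example.

```python
from urllib.parse import urlparse


def is_allowed(url, allowed_domains):
    """Simplified stand-in for an offsite check (not Scrapy's implementation).

    The URL's host must equal an allowed domain or be a subdomain of it.
    """
    host = urlparse(url).hostname or ''
    return any(host == d or host.endswith('.' + d) for d in allowed_domains)


# Hypothetical link of the kind parse() would follow.
page = 'https://www.gsmarena.com/some-brand-phones.php'

# Value from the question: a host can never match an entry containing a path.
with_path = is_allowed(page, ['www.gsmarena.com/results.php3?'])

# Bare domain: www.gsmarena.com counts as a subdomain of gsmarena.com.
bare_domain = is_allowed(page, ['gsmarena.com'])

print(with_path, bare_domain)
```

If that is what is happening, changing the spider attribute to allowed_domains = ['gsmarena.com'] should let the followed requests through.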
