Cannot retrieve data from an element with multiple class names in Scrapy (Python)

Question

I need to extract data from HTML, but response.css, response.xpath, and combinations of the two are not working. Whenever I try to get the "regular-price" value, the result is always None.

I need to get the text of the <small> element, which is $17.99.

Here's my code:

HTML

<div class="price parbase"><div class="primary-row product-item-price product-item-price-discount"> <span class="price-value">$12.99</span><small class="js-price-value-original price-value-original">$17.99</small> </div> </div>

Scrapy python

def parse_subpage(self, response):
    item = {
        'title': response.css('h1.primary.product-item-headline::text').extract_first(),
        'sale-price': response.xpath("normalize-space(.//span[@class='price-value']/text())").extract_first(),
        'regular-price': response.css('.js-price-value-original').xpath("@small").extract_first(),
        'photo-url': response.css('div.product-detail-main-image-container img::attr(src)').extract_first(),
        'description': response.css('p.pdp-description-text::text').extract_first(),
    }
    yield item

The expected output is regular-price: $17.99.

Please help. Thank you!

www2.hm.com/en_us/productpage.0697992001.html Try this one; the link works now. I still need to get the original price. @KartikeyaSharma
– Christian Read, Commented Apr 3, 2019 at 13:40

vezunchik · Accepted Answer · 2019-04-03 13:40:49Z

1

Your link gives me a 404, but judging by your HTML snippet you only need response.css('small.js-price-value-original::text').get(). The element has no attribute named small, which is why xpath("@small") returns None.

UPD: Hm, it seems this data is rendered by JavaScript. Check the HTML source of the page and you will see a huge JSON object; search for the whitePrice keyword. You can retrieve such data with, for example, response.xpath('//script[contains(text(), "whitePrice")]/text()').re_first(r"'whitePrice'\s?:\s?'([^']+)'") (note the raw string, so that \s is passed to the regex engine unchanged).
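The regular expression can be tried in isolation with the standard library. The script text below is a made-up stand-in for the page's embedded data (the real structure was not shown in this thread), so treat the sample and its variable names as assumptions.

```python
import re

# Hypothetical excerpt of the inline <script> text; the real page may differ.
script_text = """
var productData = {
    'name': 'Example product',
    'whitePrice': '$17.99'
};
"""

# Raw string so that \s reaches the regex engine unchanged.
# \s? allows at most one whitespace character on each side of the colon.
pattern = r"'whitePrice'\s?:\s?'([^']+)'"

match = re.search(pattern, script_text)
regular_price = match.group(1) if match else None
print(regular_price)
```

If the real data uses double quotes or more whitespace around the colon, the pattern would need to be adjusted accordingly.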

edited Apr 3, 2019 at 13:40

answered Apr 3, 2019 at 13:27


3 Comments

Christian Read Over a year ago

Still not working; the output is 'regular-price': None. @vezunchik

Christian Read Over a year ago

www2.hm.com/en_us/productpage.0697992001.html Try this one; the link works now. I still need to get the original price. @vezunchik

Kartikeya Sharma Over a year ago

Brilliant use of regular expression here!

edinho · Accepted Answer · 2019-04-03 13:37:48Z

If this snippet is the only HTML you have, you can do:

def parse_subpage(self, response):
    item = {
        'title': response.css('h1.primary.product-item-headline::text').extract_first(),
        'sale-price': response.xpath("normalize-space(.//span[@class='price-value']/text())").extract_first(),
        'regular-price': response.xpath('//div/small[contains(@class, "js-price-value-original") and contains(@class, "price-value-original")]/text()').extract_first(),
        'photo-url': response.css('div.product-detail-main-image-container img::attr(src)').extract_first(),
        'description': response.css('p.pdp-description-text::text').extract_first(),
    }
    yield item

By the way, the URL you provided returns a "file not found" page.

Kartikeya Sharma · Accepted Answer · 2019-04-03 14:05:12Z

0

Thanks @vezunchik. If you want to use a CSS selector instead, you can use the code below:

response.css('script:contains("whitePrice")').re_first(r"'whitePrice'\s?:\s?'([^']+)'")

answered Apr 3, 2019 at 14:05
