Finding CSS selectors for a specific website

Question

I have been trying to scrape product names and prices from this website (https://dentalspeed.com/?fbclid=IwAR1_gjjWAevu1pgikjwLUqeFXzjBRo7A93uXFSIAasxlvl97ptEorNP1fDo), but I can't get the CSS selectors right. I have also used the CSS Selector Gadget. I know HTML and CSS and have inspected the markup myself. I think the CSS selectors are right, but for some reason no data is extracted.

    def parse(self, response):
        all_div = response.css('div.collection-product')

        for div in all_div:
            # Create a fresh item per product so yielded items do not share state.
            items = DenItem()
            product_name = div.css(".collection-product-name font font::text").extract()
            _new_price = div.css('div.collection-product-price > a > font > font::text').extract()
            # Strip the currency symbol and thousands separators.
            _new_price = [s.replace("$", "") for s in _new_price]
            _new_price = [s.replace(",", "") for s in _new_price]
            _old_price = div.css("main#setembro section:nth-child(5) > div > div > div > div > ul > div.owl-wrapper-outer > div > div:nth-child(3) > li > div > div.collection-product-price-content > p.collection-product-price > del > font > font::text").extract()
            _old_price = [n.replace("R $", "") for n in _old_price]
            _old_price = [n.replace(",", "") for n in _old_price]
            items['product_name'] = product_name
            items['_new_price'] = _new_price
            items['_old_price'] = _old_price
            # Fall back to '0' when no price was matched.
            if len(items['_new_price']) == 0:
                items['_new_price'] = '0'
            if len(items['_old_price']) == 0:
                items['_old_price'] = '0'

            yield items

Share the code you have so far; without it you're just asking for someone to do it for you. – meshtron, Commented Sep 25, 2019 at 12:36
Which products? There are lots. Also, div.collection-product produces no results for me, so you wouldn't be performing a loop. – QHarr, Commented Sep 25, 2019 at 12:53

QHarr · Accepted Answer · 2019-09-25 12:55:23Z

1

The content is returned dynamically from another URL. You can find that URL in the browser's Network tab when refreshing the page with F5.

import requests

r = requests.get('https://dentalspeed.com/vitrines/app-vitrine__home--estetica').json()
print(r)

Depending on the full list of products you want, you may need to track down other URLs as well.
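If more than one such URL turns out to be needed, the same idea can be looped. The following is only a minimal sketch: the single URL in the list is the one shown above, while `VITRINE_URLS`, `fetch_json`, and `fetch_all` are hypothetical names introduced for illustration. It uses the standard library's `urllib` instead of `requests` so it runs without extra installs, and the `User-Agent` header is an assumption about what the server may expect.

```python
import json
from urllib.request import Request, urlopen

# Only this first URL comes from the answer above. Any further entries
# are ones you would have to find yourself in the browser's Network tab.
VITRINE_URLS = [
    "https://dentalspeed.com/vitrines/app-vitrine__home--estetica",
]


def fetch_json(url, timeout=10):
    """Fetch one URL and decode its body as JSON."""
    # Assumption: a browser-like User-Agent may be required by the server.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def fetch_all(urls, fetch=fetch_json):
    """Return {url: parsed JSON}, storing None for any URL that fails."""
    results = {}
    for url in urls:
        try:
            results[url] = fetch(url)
        except (OSError, ValueError) as exc:
            # OSError covers network errors, ValueError covers bad JSON.
            print(f"skipped {url}: {exc}")
            results[url] = None
    return results
```

Calling `fetch_all(VITRINE_URLS)` would then give one parsed payload per URL. The structure of each payload is not described here, so print one and inspect it before writing code that picks out names and prices.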


2 Comments

Ibtsam Ahmad Over a year ago

I don't get it?

QHarr Over a year ago

Products and prices are loaded dynamically. For example, the info for the products under Escolha Ofertas Por Especialidade is loaded from the URL I show above. When you use a browser, it makes additional requests to other URIs for information, not just to the URL you start with. When you use requests, these additional requests are never made. You can, however, use the browser's Network tab to find out what these other requests are and then use requests to make the XHR calls to those URIs yourself.
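One way to confirm this before hunting for selectors is to check whether the class you are targeting appears at all in the HTML the server sends, as opposed to the DOM the browser shows after its scripts have run. The sketch below is only an illustration using the standard library's `html.parser`; `ClassCounter` and `count_class` are hypothetical helper names, and the HTML you pass in would be the raw response body (for example `response.text` in Scrapy).

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags whose class attribute contains a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value can be None.
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1


def count_class(html, class_name):
    """Return how many elements in the HTML string carry class_name."""
    parser = ClassCounter(class_name)
    parser.feed(html)
    parser.close()
    return parser.count
```

If `count_class(raw_html, "collection-product")` returns 0 for the raw response while the browser's inspector shows matching elements, the products are being added by a later request, and no CSS selector applied to the initial response can reach them.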
