I have been trying to scrape product names and prices from this website (https://dentalspeed.com/?fbclid=IwAR1_gjjWAevu1pgikjwLUqeFXzjBRo7A93uXFSIAasxlvl97ptEorNP1fDo), but I can't get the CSS selectors right. I have tried SelectorGadget, and I know HTML and CSS well enough to have read the page source myself. I think the selectors are correct, but they don't extract any data.

    def parse(self, response):
        all_div = response.css('div.collection-product')

        for div in all_div:
            # Create a fresh item per product so yielded items don't share state
            items = DenItem()
            product_name = div.css(".collection-product-name font font::text").extract()
            _new_price = div.css('div.collection-product-price > a > font > font::text').extract()
            _new_price = [s.replace("$", "").replace(",", "") for s in _new_price]
            _old_price = div.css("main#setembro section:nth-child(5) > div > div > div > div > ul > div.owl-wrapper-outer > div > div:nth-child(3) > li > div > div.collection-product-price-content > p.collection-product-price > del > font > font::text").extract()
            _old_price = [n.replace("R $", "").replace(",", "") for n in _old_price]
            items['product_name'] = product_name
            items['_new_price'] = _new_price
            items['_old_price'] = _old_price
            # Fall back to '0' when a price selector matched nothing
            if len(items['_new_price']) == 0:
                items['_new_price'] = '0'
            if len(items['_old_price']) == 0:
                items['_old_price'] = '0'

            yield items
  • Share the code you have so far; without it, you're just asking someone to do it for you. Commented Sep 25, 2019 at 12:36
  • I have shared it. Commented Sep 25, 2019 at 12:40
  • Which products? There are lots. Also, div.collection-product produces no results for me, so your loop would never run. Commented Sep 25, 2019 at 12:53

1 Answer

The content is returned dynamically from another URL. You can find the request in the browser's network tab when you refresh the page with F5.

import requests

r = requests.get('https://dentalspeed.com/vitrines/app-vitrine__home--estetica').json()
print(r)

Depending on which products you want, you may need to track down other URLs in the same way.
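The structure of the JSON returned by that URL isn't shown here, so before writing any extraction code it may help to look at its shape. Below is a minimal sketch of that idea; `describe` and `fetch_and_describe` are hypothetical helper names, not part of requests or Scrapy, and only the first element of each list is inspected on the assumption that list entries share one shape.

```python
def describe(node, path="$", max_depth=4):
    """Return 'path: type' lines describing the shape of parsed JSON."""
    lines = [f"{path}: {type(node).__name__}"]
    if max_depth <= 0:
        return lines
    if isinstance(node, dict):
        for key, value in node.items():
            lines.extend(describe(value, f"{path}.{key}", max_depth - 1))
    elif isinstance(node, list) and node:
        # Assumption: list entries (e.g. products) repeat one shape,
        # so only the first element is inspected.
        lines.extend(describe(node[0], f"{path}[0]", max_depth - 1))
    return lines


def fetch_and_describe(url):
    """Fetch a JSON endpoint and describe its structure (needs network access)."""
    import requests  # imported here so describe() works without requests installed

    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return describe(response.json())
```

Calling `print("\n".join(fetch_and_describe(url)))` with the URL above should list the keys present in the response, which shows where names and prices actually live.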

2 Comments

I don't get it?
Products and prices are loaded dynamically. For example, the info for the products under "Escolha Ofertas Por Especialidade" is loaded from the URL I show above. When you use a browser, it makes additional requests to other URIs for information, not just to the URL you start with. A plain requests call fetches only the starting URL, so those additional requests are never made. You can, however, use the browser's network tab to find out what these other requests are, and then use requests to make the same XHR calls to those URIs.
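As a rough sketch of that workflow: once the network tab has revealed a list of endpoint URLs, each one can be fetched and searched for product records. The key names passed as `required_keys` (for example "name" and "price") are assumptions, since the real field names depend on what the endpoints return; `iter_records`, `collect`, and `fetch_json_with_requests` are hypothetical helpers, and the `X-Requested-With` header is only a common convention that a given server may ignore.

```python
def iter_records(node, required_keys):
    """Yield every dict nested anywhere in parsed JSON that has all required_keys."""
    if isinstance(node, dict):
        if all(key in node for key in required_keys):
            yield node
        for value in node.values():
            yield from iter_records(value, required_keys)
    elif isinstance(node, list):
        for item in node:
            yield from iter_records(item, required_keys)


def collect(urls, fetch_json, required_keys):
    """Fetch each URL with fetch_json and gather all matching records."""
    records = []
    for url in urls:
        records.extend(iter_records(fetch_json(url), required_keys))
    return records


def fetch_json_with_requests(url):
    """Fetch one endpoint the way a browser XHR might (needs network access)."""
    import requests  # imported here so the helpers above work without it

    response = requests.get(
        url,
        headers={"X-Requested-With": "XMLHttpRequest"},  # convention, may be ignored
        timeout=10,
    )
    response.raise_for_status()
    return response.json()
```

For example, `collect([url], fetch_json_with_requests, ("name", "price"))` would return every nested dict containing both keys. Passing the fetch function in as an argument keeps the traversal easy to try against saved sample data before hitting the site.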
