I need to extract some data from a website. Everything I need is in a <script> element, so I extracted its text with this command:

script = response.css('[id="server-side-container"] script::text').get()

And this is the value of script:

    window.bottomBlockAreaHtml = '';
    ...
    window.searchQuery = '';
    window.searchResult = {
      "stats": {...},
      "products": {...},
      ...
    };
    window.routedFilter = '';
    ...
    window.searchContent = '';

What is the best way to get the value of "products" in my Python code?

  • What is the data type of the script variable? Commented Jan 6, 2023 at 23:17
  • @KazimRaza It's str. Commented Jan 6, 2023 at 23:18
  • Have you tried converting it into a dictionary? Commented Jan 6, 2023 at 23:26
  • Try extracting the text with regular expressions and then using json.loads(). Commented Jan 7, 2023 at 0:10
  • You can craft a regex pattern that targets the specific data in the value of the products field. Post the URL and I can show you an example. Commented Jan 7, 2023 at 1:44

1 Answer

In your example, the best strategy is to use a regex to extract the value assigned to window.searchResult, convert it to a dictionary with json.loads(), and then read the value under the "products" key.

For example:

import json
import scrapy
import re

class LoplabbetSpider(scrapy.Spider):

    name = "loplabbet"
    start_urls = ["https://www.loplabbet.se/lopning/"]
    pattern = re.compile(r'window\.searchResult = (\{.*?\});', flags=re.DOTALL)

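    def parse(self, response):
        # Check every <script> element on the page for the assignment.
        for script in response.css("script").getall():
            matches = self.pattern.findall(script)
            if matches:
                # The captured group is the JSON object literal; parse it.
                results = json.loads(matches[0])
                # Yield only the "products" entry as the scraped item.
                product = results["products"]
                yield product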
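
The same extract-then-parse idea can be tried outside Scrapy on a plain string. The sketch below is a variation, not the method used in the spider above: it uses a regex only to locate the assignment and then lets json.JSONDecoder.raw_decode() parse the object, which avoids relying on the first "};" marking the end of the data. SAMPLE_SCRIPT, its field values, and the helper name extract_search_result are made up for illustration; the real page's "products" structure may look different.

```python
import json
import re

# Hypothetical stand-in for the text of the real <script> element.
SAMPLE_SCRIPT = """
    window.searchQuery = '';
    window.searchResult = {
      "stats": {"total": 2},
      "products": {"items": [{"name": "shoe};"}, {"name": "sock"}]}
    };
    window.routedFilter = '';
"""

# Matches up to (and including) any whitespace after the "=" sign,
# so the match ends exactly where the JSON object starts.
ASSIGNMENT = re.compile(r"window\.searchResult\s*=\s*")


def extract_search_result(script_text):
    """Return the object assigned to window.searchResult, or None if absent."""
    match = ASSIGNMENT.search(script_text)
    if match is None:
        return None
    # raw_decode parses one JSON value starting at the given index and
    # ignores whatever follows it (the trailing ";" and later statements).
    result, _end = json.JSONDecoder().raw_decode(script_text, match.end())
    return result


if __name__ == "__main__":
    search_result = extract_search_result(SAMPLE_SCRIPT)
    print(search_result["products"])
```

In a spider, the string returned by response.css(...).get() from the question could be passed to this helper in place of SAMPLE_SCRIPT. Both approaches assume the value is valid JSON; if the page uses JavaScript-only syntax such as single-quoted strings or trailing commas, json will raise an error.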