I am looking for a way to scrape data from this page into a list. The data I want to extract is in

rangeSelector -> series -> data

It is a series of [timestamp, price] pairs recording the price of a specific item over time. I need to strip away all the JavaScript code and keep only the data, which I will then use for plotting and calculations.

I am new to web scraping and am looking for a simple one-time solution. What would be the best way to approach this problem?

document.addEventListener('DOMContentLoaded', function () {
    var myChart = Highcharts.stockChart('stocks-container', {
        rangeSelector: {
            selected: 1
        },
        yAxis: [{
            labels: {
                align: 'left'
            },
            height: '80%',
            resize: {
                enabled: true
            }
        }, {
            labels: {
                align: 'left'
            },
            top: '80%',
            height: '20%',
            offset: 0
        }],
        plotOptions: {
            column: {
                stacking: 'normal'
            }
        },
        series: [
            {
                name: 'Unit Price (Buy)',
                data: JSON.parse("[[1585902517017,187893.6],[1585906117013,193975.7],[1585909717026,189253.9],[1585913317001,195890.9],[1585916917027,197659.8],[1585920516999,201482.1],[1585924117021,198212.5],[1585927716997,208305.0],[1585929517008,207305.0],[1585933117021,193561.7],[1585936716979,199070.6],[1585938517019,195450.9],[1585942117009,195527.4],[1585945717007,195877.6],
1 Answer

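You can extract the data with the re and json modules: a regular expression locates the array that follows each series name in the page source, and json.loads turns that array into a Python list.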

For example:

import re
import json
import requests


url = 'https://stonks.gg/products/search?input=Superior%20Fragment'
html_data = requests.get(url).text

# For each series, lazily match from its name to the first [[...]] array that
# follows (re.S lets '.' span newlines), then parse that array as JSON.
d1 = json.loads(re.search(r'Unit Price \(Buy\).*?(\[\[.*?\]\])', html_data, flags=re.S).group(1))
d2 = json.loads(re.search(r'Unit Price \(Sell\).*?(\[\[.*?\]\])', html_data, flags=re.S).group(1))
d3 = json.loads(re.search(r'Instant Buy Volume.*?(\[\[.*?\]\])', html_data, flags=re.S).group(1))
d4 = json.loads(re.search(r'Instant Sell Volume.*?(\[\[.*?\]\])', html_data, flags=re.S).group(1))

print(d1)
print(d2)
print(d3)
print(d4)

Prints:

[[1585902517017, 187893.6], [1585906117013, 193975.7], [1585909717026, 189253.9], [1585913317001, 195890.9], [1585916917027, 197659.8], [1585920516999, 201482.1], [1585924117021, 198212.5], [1585927716997, 208305.0], [1585929517008, 207305.0], [1585933117021, 193561.7], [1585936716979, 199070.6], [1585938517019, 195450.9], [1585942117009, 195527.4], [1585945717007, 195877.6], [1585949317016, 198097.6], [1585952917006, 200590.3], [1585956517023, 198363.7], [1585958317074, 193681.3], [1585961917009, 199628.0], [1585967317017, 197546.9], [1585969117024, 195719.5], [1585972716979, 198053.2], [1585974516979, 195370.3], [1585976317029, 194257.0], [1585979917012, 195980.4], [1585981717045, 199915.4], [1585985316979, 199097.0], [1585987117024, 199425.4], [1585990717024, 198317.1], [1585994316979, 207382.3], [1585996117030, 199845.9], [1585999717009, 200711.5], 

...and so on.
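
Since the goal is plotting and calculations, here is a minimal sketch of one way to take the extracted pairs further. It assumes the first number in each pair is a Unix timestamp in milliseconds, which is the usual Highcharts convention. The names SAMPLE_HTML, extract_series and to_datetimes are hypothetical helpers introduced for illustration, and the sample string is a shortened stand-in for the real page source so the example runs without a network request.

```python
import json
import re
from datetime import datetime, timezone
from statistics import mean

# Hypothetical, shortened stand-in for the page source fetched with requests.
SAMPLE_HTML = '''
series: [
    {
        name: 'Unit Price (Buy)',
        data: JSON.parse("[[1585902517017,187893.6],[1585906117013,193975.7],[1585909717026,189253.9]]")
    }
]
'''


def extract_series(html, name):
    """Return the [[timestamp_ms, value], ...] list that follows `name`, or None."""
    # Same idea as the regexes above: lazily skip to the first [[...]] array.
    pattern = re.escape(name) + r".*?(\[\[.*?\]\])"
    match = re.search(pattern, html, flags=re.S)
    return json.loads(match.group(1)) if match else None


def to_datetimes(points):
    """Convert [ms_since_epoch, value] pairs into (UTC datetime, value) tuples."""
    # Assumption: timestamps are milliseconds since the Unix epoch.
    return [
        (datetime.fromtimestamp(ts / 1000, tz=timezone.utc), value)
        for ts, value in points
    ]


buy = extract_series(SAMPLE_HTML, "Unit Price (Buy)")
rows = to_datetimes(buy)
prices = [value for _, value in rows]

# Simple summary statistics over the sample.
print(rows[0][0].isoformat(), min(prices), max(prices), round(mean(prices), 2))
```

Returning None when a series name is not found avoids the AttributeError that calling .group(1) on a failed re.search would raise. The (datetime, value) tuples can be split into x and y lists for whichever plotting library you prefer.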