
I am scraping a few pages with Selenium. I do not use other frameworks (such as Scrapy) because the pages rely heavily on AJAX. My problem is that the content refreshes automatically nearly every second (as with financial data, for example), but I want to scrape all the elements in a static state. I have searched a lot on the internet, and especially here on Stack Overflow. What is the easiest way to freeze the website with Selenium? I even tried switching off the wireless adapter, but that caused problems. This is the only relevant command I found in the Selenium docs:

driver.set_network_conditions(offline=True, latency=5, throughput=500 * 1024)

I tested this code, and when I run the script it has no effect. The website keeps auto-refreshing.

  • Can you share the url you're trying to parse? Commented Feb 6, 2019 at 20:04
  • For example, this one: gatehub.net/markets/XRP/USD+rhub8VRN55s94qWKDv6jmDy1pUykJzF3wq (there is no API for this site) Commented Feb 6, 2019 at 20:10
  • What do you plan to extract from that page? Commented Feb 6, 2019 at 21:20
  • Is this what you need? api.gatehub.net/rippledata/v2/exchanges/… You can increase the limit parameter if needed (tested max 400). Commented Feb 6, 2019 at 21:23

2 Answers


"for example this one: https://gatehub.net/markets/XRP/USD+rhub8VRN55s94qWKDv6jmDy1pUykJzF3wq (there is no API for this site)"


In fact, an API exists, but it isn't fully public.

To get the chart values as a JSON object, you'll need to construct a customized URL, something like:

https://api.gatehub.net/rippledata/v2/exchanges/USD+rhub8VRN55s94qWKDv6jmDy1pUykJzF3wq/XRP?descending=true&end=2019-02-06T21:20:00.000Z&limit=400&reduce=false&result=tesSUCCESS&start=2009-02-06T21:20:00.000Z

Output:

{"result":"success","count":400,"marker":"USD|rhub8VRN55s94qWKDv6jmDy1pUykJzF3wq|XRP||20190206014150|000044926668|00006|00003","exchanges":[{"base_amount":"0.12180204","counter_amount":"0.42056","node_index":6,"rate":"3.4528157","tx_index":18,"autobridged_currency":"ETH","autobridged_issuer":"rcA8X3TVMST1n3CJeAdGk1RdRCHii7N2h","buyer":"rGmGFAEx1hYEJuSAfrjEBdA48AXWJBMp1D","executed_time":"2019-02-06T21:14:00Z","ledger_index":44945715,"offer_sequence":39832,"provider":"rGmGFAEx1hYEJuSAfrjEBdA48AXWJBMp1D","seller":"rUmnnszuTRfhKnULCjcKzV7mJeazCF7Gik","taker":"rUmnnszuTRfhKnULCjcKzV7mJeazCF7Gik","tx_hash":"4E39DB1CB68B4635E773082042B47168094852ED4A11C93AED7F85A67F1F7EDD","tx_type":"OfferCreate","base_currency":"USD","base_issuer":"rhub8VRN55s94qWKDv6jmDy1pUykJzF3wq","counter_currency":"XRP"},{"base_amount":"322.8872040048709","counter_amount":"1109.37944","node_index":2,"rate":"3.4358111","tx_index":18,"autobridged_currency":"ETH","autobridged_issuer":"rcA8X3TVMST1n3CJeAdGk1RdRCHii7N2h","buyer":"rETx8GBiH6fxhTcfHM9fGeyShqxozyD3xe","executed_time":"2019-02-06T21:14:00Z","ledger_index":44945715,"offer_sequence":26918939,"provider":"rETx8GBiH6fxhTcfHM9fGeyShqxozyD3xe","seller":"rUmnnszuTRfhKnULCjcKzV7mJeazCF7Gik","taker":"rUmnnszuTRfhKnULCjcKzV7mJeazCF7Gik","tx_hash":"4E39DB1CB68B4635E773082042B47168094852ED4A11C93AED7F85A67F1F7EDD","tx_type":"OfferCreate","base_currency":"USD","base_issuer":"rhub8VRN55s94qWKDv6jmDy1pUykJzF3wq","counter_currency":"XRP"}

...

Notes:

  • You can change the limit parameter to return a different number of records if needed (tested max 400).
  • The dates should also be updated programmatically so that the request returns the latest values.
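
A minimal sketch of how such a URL could be assembled in Python, using only the standard library. The function names build_exchanges_url and iso_millis are hypothetical helpers introduced here for illustration; the path layout and query parameters are copied from the example URL above, and since the API isn't fully public, its behaviour for other values is an assumption.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Root of the endpoint seen in the example URL above.
API_ROOT = "https://api.gatehub.net/rippledata/v2/exchanges"


def iso_millis(dt):
    """Format a timezone-aware datetime as e.g. 2019-02-06T21:20:00.000Z."""
    utc = dt.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


def build_exchanges_url(base, counter, start, end, limit=400):
    """Build the exchanges URL for a currency pair and a time window.

    base and counter are passed through unchanged, e.g.
    "USD+rhub8VRN55s94qWKDv6jmDy1pUykJzF3wq" and "XRP".
    """
    params = {
        "descending": "true",
        "end": iso_millis(end),
        "limit": limit,
        "reduce": "false",
        "result": "tesSUCCESS",
        "start": iso_millis(start),
    }
    # safe=":" keeps the colons in the timestamps unescaped, as in the example.
    return f"{API_ROOT}/{base}/{counter}?{urlencode(params, safe=':')}"


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(build_exchanges_url(
        "USD+rhub8VRN55s94qWKDv6jmDy1pUykJzF3wq",
        "XRP",
        start=datetime(2009, 2, 6, 21, 20, tzinfo=timezone.utc),
        end=now,
    ))
```

Passing the current time as end on each call is one way to keep the dates up to date; the resulting URL can then be fetched with any HTTP client.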

2 Comments

Thank you for your answer, this was helpful, although my question was how to stop the JavaScript, i.e. the auto-refreshing of the whole website or its elements, during a Selenium driver session.
Try injecting a JS error via Selenium's execute_script: throw new Error();
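
A different technique from the one suggested in the comment above, offered only as a sketch: rather than stopping the page's scripts, copy the whole DOM in a single execute_script call and parse the copy offline, so every value comes from the same instant. The helper names and the choice of <td> cells are hypothetical; the real page's markup is an assumption and would need to be inspected first.

```python
from html.parser import HTMLParser


def snapshot_html(driver):
    """Return the page's DOM serialized to a string in one script call."""
    return driver.execute_script("return document.documentElement.outerHTML;")


class CellText(HTMLParser):
    """Collect the text content of every <td> element (hypothetical target)."""

    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell:
            self.cells[-1] += data


def cells_from_snapshot(html):
    """Parse a snapshot string and return the stripped text of each cell."""
    parser = CellText()
    parser.feed(html)
    return [cell.strip() for cell in parser.cells]
```

With a live session this would be called as cells_from_snapshot(snapshot_html(driver)). The page keeps refreshing in the browser, but the string being parsed no longer changes.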

One solution might be to set config preferences for whichever browser your driver uses. For example, with Firefox you could set accessibility.blockautorefresh to True (the value that blocks automatic refreshes), and then just call driver.refresh() when you are ready.

https://lifehacker.com/disable-automatic-web-page-refreshing-5321420

PHPUnit + Selenium: How to set Firefox about:config options?
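
A minimal sketch of applying that preference from Python, assuming Selenium's Firefox Options class and its set_preference method. The helper apply_prefs and the NO_AUTOREFRESH_PREFS dict are hypothetical names introduced for illustration. This preference is expected to affect page-level refreshes, so it may not stop updates driven by the page's own scripts.

```python
# Preference named in the answer above; True enables the blocking.
NO_AUTOREFRESH_PREFS = {"accessibility.blockautorefresh": True}


def apply_prefs(options, prefs=NO_AUTOREFRESH_PREFS):
    """Set each preference on a Firefox Options object and return it."""
    for name, value in prefs.items():
        options.set_preference(name, value)
    return options


# Usage (requires selenium and geckodriver to be installed):
#
#   from selenium import webdriver
#   from selenium.webdriver.firefox.options import Options
#
#   driver = webdriver.Firefox(options=apply_prefs(Options()))
#   driver.get("https://gatehub.net/markets/XRP/USD+rhub8VRN55s94qWKDv6jmDy1pUykJzF3wq")
#   ...  # scrape, then call driver.refresh() when a fresh page is wanted
```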
