I want to write a Python script that interacts with a webpage containing quite a lot of JavaScript (it's a webpage that computes a bunch of physics stuff).

I don't want my code to break if the page formatting changes, and I want it to run offline, so I would prefer my script to run on a local HTML copy of the page that I saved (all the JS code is accessible in the HTML source; there is no call to an external server). I wanted to use the requests library to do it, but it only works with URLs. Is there any library to do this? Note that I want to interact with the HTML (input values, look at the outputs, etc.). I know that I can parse the file, but that's not what I'm asking. I'm also totally new to web bots or anything related.

Right now I can open my .html version of the page offline with Chrome and interact with it, so there has to be a way to automate this somehow. I'm also not against using something other than Python if there is a better library for this in another language.

  • Try Selenium. It helps with HTML content that relies on JavaScript. Commented Nov 15, 2020 at 5:06
  • Requests won’t retrieve from a local file system. You could very easily serve the page locally using http.server, in which case Requests could retrieve it, but why bother using Requests if the file is local anyway? Commented Nov 15, 2020 at 18:57
  • @barny Because there’s some pretty complicated JS code on the page that runs every time I press a button and serves some result, and I want to interact with it automatically. I haven’t found any other way to do that. If not Requests, then what should I use? Understanding how the JS code works would take more time than just having a bot enter a value, press the button and retrieve the result. Commented Nov 15, 2020 at 19:06
  • So yes, you need a browser simulation such as Selenium. Requests can make HTTP GET requests, but a browser is needed to interpret HTML + JS. Commented Nov 15, 2020 at 19:31
  • Maybe requests-html + a local HTTP server? See stackoverflow.com/questions/54889023/… Personally, I avoid Selenium with a passion, but Cypress IO, which I know from QA, isn’t suited for automation. Commented Nov 15, 2020 at 22:52
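
A minimal sketch of the Selenium approach suggested in the comments, assuming Selenium 4 and a local Chrome install. The helper names and the element IDs input-value, compute-button and result are hypothetical placeholders, not taken from the page in question; the real ones would have to be read from the saved page's source.

```python
from pathlib import Path


def local_file_url(path):
    """Turn a path to a saved HTML file into a file:// URL a browser can open."""
    return Path(path).resolve().as_uri()


def compute_offline(html_path, value, input_id="input-value",
                    button_id="compute-button", output_id="result"):
    """Enter a value, click the button, and return the text the page's JS produces.

    The default element IDs are hypothetical placeholders.
    """
    # Imported here so the URL helper above works even without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")  # run without a visible browser window
    driver = webdriver.Chrome(options=options)
    try:
        # The browser loads the local copy and runs its JavaScript as usual.
        driver.get(local_file_url(html_path))
        field = driver.find_element(By.ID, input_id)
        field.clear()
        field.send_keys(str(value))
        driver.find_element(By.ID, button_id).click()
        # Wait up to 10 seconds until the output element holds non-empty text.
        return WebDriverWait(driver, 10).until(
            lambda d: d.find_element(By.ID, output_id).text.strip()
        )
    finally:
        driver.quit()
```

Under these assumptions, compute_offline("page.html", 1.5) would return whatever text the page writes into the output element. If the result lands in an input field rather than in visible text, reading get_attribute("value") instead of .text is probably needed, and if the output element already shows a stale result before the click, the wait condition would have to check for a changed value.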

1 Answer

Interesting question. The best way I can think of to do that is to use a web framework and then just scrape the data using requests. I am familiar with Flask and it's simple to use, but I'm sure there are other options as well.
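
A rough sketch of the "serve it locally, then fetch it" idea, using the standard library's http.server (mentioned in the comments) in place of Flask, and urllib in place of requests, so that nothing needs installing. The function names serve_directory and fetch are made up for this example. Note that fetching the page this way only returns the raw HTML and JS source; nothing executes the JavaScript, so on its own it does not produce the computed results.

```python
import functools
import threading
import urllib.request
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def serve_directory(directory, port=0):
    """Serve a local directory over HTTP in a background thread.

    port=0 lets the OS pick a free port; read it from server.server_address[1].
    """
    handler = functools.partial(SimpleHTTPRequestHandler, directory=str(directory))
    server = ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(url):
    """Download a page as text, bypassing any system proxy settings."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=5) as response:
        return response.read().decode("utf-8")
```

With the saved copy in the current directory as page.html (a placeholder name), server = serve_directory(".") followed by fetch(f"http://127.0.0.1:{server.server_address[1]}/page.html") returns the page source; server.shutdown() stops the server afterwards.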
