As you rightly mentioned, this is "web scraping", and Python has excellent modules for it. The most obvious one is BeautifulSoup.
So, to get the info from your webpage:
- You first need to understand the structure of the webpage.
- Also, in some cases scraping might not be fully legal.
- The bigger challenge is whether the webpage can be scraped at all.
- You can figure this out by looking at the source of the webpage.
- If the text/info you want to grab is visible in the source or in one of the hrefs, then it should be possible to scrape it using BeautifulSoup.
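As a rough sketch of those two checks (is scraping allowed, and is the text really in the source), something like the following could work. It uses only the standard library. The helper names `allowed_by_robots` and `text_in_source`, the sample robots.txt, and the sample HTML are all made up for illustration; in practice you would download the site's real robots.txt and page. Note that robots.txt is only one signal and does not settle the legal question.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def text_in_source(html: str, wanted: str) -> bool:
    """Return True if the wanted text appears in the raw HTML source."""
    return wanted in html


# Hypothetical sample data standing in for a real site.
sample_robots = """User-agent: *
Disallow: /private/
"""
sample_html = "<html><body><p id='price'>Price: 42 USD</p></body></html>"

print(allowed_by_robots(sample_robots, "https://example.com/products"))
print(allowed_by_robots(sample_robots, "https://example.com/private/data"))
print(text_in_source(sample_html, "42 USD"))
```

If the text you want is missing from the raw source, the page is probably filling it in with JavaScript, and BeautifulSoup alone will not see it.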
Solution -
- Before you arrive at a solution, you must understand the HTML structure and the ways in which you can identify any element on a webpage.
There are many ways, such as:
- using the "id" of an element on the webpage
- using the class or tag name directly
- using the XPath of the element
- or a combination of any or all of the above
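To make these options concrete, here is a small sketch run against a made-up HTML snippet (the ids, classes, and text are invented for the example). It shows lookup by id, by class, by tag name, and a combination via a CSS selector. XPath is left out because BeautifulSoup itself does not evaluate XPath expressions; a library such as lxml would be needed for that.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML used only for illustration.
html = """
<div id="main">
  <h1 class="title">Hello</h1>
  <p class="note">First note</p>
  <p class="note">Second note</p>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

by_id = soup.find(id="main")              # single element with id="main"
by_class = soup.find_all(class_="note")   # all elements with class "note"
by_tag = soup.find_all("p")               # all <p> tags
combined = soup.select("div#main p.note")  # CSS selector: id + tag + class

print(by_id.name)
print([p.get_text() for p in by_class])
print(len(by_tag))
print(combined[0].get_text())
```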
Once you reach this point, it should be clear how to proceed:
import requests
from bs4 import BeautifulSoup

# Make a request to the webpage and grab the HTML response
page = requests.get("your url here")
# Pass the raw HTML on to BeautifulSoup
soup = BeautifulSoup(page.content, 'html.parser')
# Depending on how you want to search, you can use find(), find_all(),
# select(), and other methods
soup.find_all('your tag')
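The search returns a list of tag objects, so the last step is usually to pull the text or attributes out of them. A minimal sketch, assuming you are after link text and hrefs (the HTML string below is invented so the example runs without a network request):

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded page.
html = (
    '<ul>'
    '<li><a href="/a">Alpha</a></li>'
    '<li><a href="/b">Beta</a></li>'
    '<li><a>No link</a></li>'
    '</ul>'
)
soup = BeautifulSoup(html, "html.parser")

# href=True keeps only <a> tags that actually have an href attribute.
links = [(a.get_text(strip=True), a["href"]) for a in soup.find_all("a", href=True)]
print(links)
```

Filtering on `href=True` avoids a `KeyError` on anchors that have no href.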