Hi, I am still a beginner at Python and I was experimenting.

I am looking for a way to request a URL and get the webpage's data without opening the page in a browser.

Once I have the data, I need to search it for a tag or string, for example, to check whether 'hello' appears somewhere on the requested home page.

Here is an example:

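import urllib.request

# Download the page and decode the raw bytes into a string
fp = urllib.request.urlopen("http://www.python.org")
mybytes = fp.read()

mystr = mybytes.decode("utf8")
fp.close()

# str.find returns the index of the first match, or -1 if there is none
x = mystr.find('testing word tag')

print(x)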

Please bear with me, as I am still a rookie and can't find an example of what I am looking for.

I found the code above on this site, but it does not seem to work for finding a string.

Does anyone know the best way to do it?

Thank you guys :)

BeautifulSoup is the de facto standard for web scraping. Commented Aug 25, 2020 at 11:24

3 Answers

Here are the most commonly used libraries for this kind of work:

Requests, to get the HTML of the page.

BeautifulSoup, to find elements in it (and much more).

$ pip install requests beautifulsoup4

And in your favorite IDE:

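import requests
from bs4 import BeautifulSoup

# Fetch the page and parse its HTML
r = requests.get("http://www.python.org")
soup = BeautifulSoup(r.content, "html.parser")

# find returns the first matching tag, or None if there is none
sometag = soup.find("sometag")
print(sometag)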
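
To tie this back to the question (checking whether a word such as 'hello' appears on the page), here is a minimal sketch. It assumes the HTML has already been downloaded; SAMPLE_HTML is a made-up stand-in for r.content, and page_contains and first_tag_text are hypothetical helper names, not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup

# Made-up stand-in for the HTML that requests.get(...).content would return.
SAMPLE_HTML = """
<html>
  <head><title>Example page</title></head>
  <body>
    <h1>Welcome</h1>
    <p>Say hello to the world.</p>
  </body>
</html>
"""


def page_contains(html, word):
    """Return True if `word` appears in the page's text (case-insensitive)."""
    soup = BeautifulSoup(html, "html.parser")
    # get_text() drops the tags and keeps only the text between them
    text = soup.get_text(separator=" ")
    return word.lower() in text.lower()


def first_tag_text(html, tag_name):
    """Return the text of the first `tag_name` element, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find(tag_name)
    return tag.get_text(strip=True) if tag is not None else None


print(page_contains(SAMPLE_HTML, "hello"))
print(first_tag_text(SAMPLE_HTML, "h1"))
print(first_tag_text(SAMPLE_HTML, "table"))
```

Searching the extracted text rather than the raw HTML avoids accidental matches inside tag names or attributes. With a live page, passing r.content in place of SAMPLE_HTML should behave the same way.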

1 Comment

Or simply as an extension of OP's existing script: soup = BeautifulSoup(mystr, "html.parser"); soup.find("sometag")

Try this.

import requests
url = "https://stackoverflow.com/questions/63577634/extract-html-and-search-in-python"

res = requests.get(url)
print(res.text)
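
Once the HTML is available as a string, a plain string search may be enough to check for a word. The sketch below assumes that; sample is a made-up stand-in for res.text, and find_word is a hypothetical helper name.

```python
def find_word(html, word):
    """Return the index of the first case-insensitive match, or -1 if absent."""
    return html.lower().find(word.lower())


# Made-up stand-in for res.text
sample = "<html><body><p>Hello there</p></body></html>"

index = find_word(sample, "hello")
if index != -1:
    print(f"found at index {index}")
else:
    print("not found")

# If only a yes/no answer is needed, the `in` operator is simpler
print("hello" in sample.lower())
```

Note that str.find signals "not found" with -1 rather than raising an error, so a printed -1 means the text is simply not in the page, not that the request failed.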

2 Comments

How does this answer the question?
You get the HTML of that webpage. If you want to extract tags more easily, you could use BeautifulSoup.

Another method.

from simplified_scrapy import SimplifiedDoc, req
html = req.get('https://www.python.org')
doc = SimplifiedDoc(html)
title = doc.getElement('title').text
print(title)
title = doc.getElementByText('Welcome to', tag='title').text
print(title)

Result:

Welcome to Python.org
Welcome to Python.org

Here are more examples: https://github.com/yiyedata/simplified-scrapy-demo/tree/master/doc_examples
