I have a script which reads data from a webpage using HTMLParser:

import urllib
from HTMLParser import HTMLParser
import re


class get_HTML_Info(HTMLParser):
    def handle_data(self, data):
        print data


adib = urllib.urlopen('http://www.bulldoghax.com/secret/spinner')
htmlsource = adib.read()
adib.close()

parser = get_HTML_Info()
parser.feed(str(htmlsource))

I end up with two sets of data like this:

bulldoghax

8530330882

That is what the terminal shows. I want to extract only the number and store it as a string in Python.

2 Answers

2

Use Beautiful Soup for scraping data.

pip install beautifulsoup4

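import urllib
from bs4 import BeautifulSoup

# Fetch the page (Python 2 urllib, as in the question)
adib = urllib.urlopen('http://www.bulldoghax.com/secret/spinner')
htmlsource = adib.read()
adib.close()

# No parser is named here, so bs4 picks the best one installed and warns.
# Pass one explicitly, e.g. BeautifulSoup(htmlsource, "html.parser"), to pin it.
soup = BeautifulSoup(htmlsource)
for each_div in soup.find_all('div', {'class': 'number'}):
    # .text is the text content of the matched <div>
    print each_div.text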
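
The loop above prints the text. To keep it in a variable instead, and without a third-party dependency, roughly the same selection can be sketched with the standard library. This is only a sketch: it assumes Python 3 (where the module is html.parser) and that the number sits in a <div class="number">, as the selector above implies. NumberDivParser and sample_html are hypothetical names used for illustration, and the sample markup stands in for the real page.

```python
from html.parser import HTMLParser  # Python 3 name of the HTMLParser module


class NumberDivParser(HTMLParser):
    """Captures the text inside <div class="number"> (hypothetical helper).

    Nested <div> elements are not handled in this sketch.
    """

    def __init__(self):
        super().__init__()
        self._inside = False  # True while within the target div
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div" and "number" in classes:
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "div":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.chunks.append(data)


# Assumed markup, standing in for the downloaded page
sample_html = '<h1>bulldoghax</h1><div class="number">\n8530330882\n</div>'

parser = NumberDivParser()
parser.feed(sample_html)
parser.close()
number = "".join(parser.chunks).strip()  # strip() drops the surrounding newlines
print(number)
```

With the real page, sample_html would be replaced by the decoded response body.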

2 Comments

Thank you, that's perfect! I just had to change soup = BeautifulSoup(htmlsource) to soup = BeautifulSoup(htmlsource, "lxml") because it gave me an error the first time I tried it.
@himanshu_dua can you help me write code that sends that number as a cookie value to this website: http://www.bulldoghax.com/secret/codes
1

Simple, here:

n = "".join(filter(str.isdigit, data))

It keeps only the characters of data that are digits, then joins them back into a single string.
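
A minimal sketch of how that one-liner could be wired into the parser from the question, assuming Python 3 (where the module is html.parser) and a hypothetical sample page in place of the real response. handle_data is also called for the whitespace between tags, so chunks that contain no digits are skipped rather than printed as blank lines. DigitCollector and sample_html are illustrative names, not part of the original code.

```python
from html.parser import HTMLParser  # Python 3 name of the HTMLParser module


class DigitCollector(HTMLParser):
    """Collects the digits found in each text node (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.numbers = []

    def handle_data(self, data):
        # Keep only the digit characters of this text chunk
        digits = "".join(filter(str.isdigit, data))
        if digits:  # skip whitespace-only or digit-free chunks
            self.numbers.append(digits)


# Assumed markup; the real page may be structured differently
sample_html = (
    "<html><body><h1>bulldoghax</h1>\n"
    "<div>\n8530330882\n</div></body></html>"
)

parser = DigitCollector()
parser.feed(sample_html)
parser.close()
number = parser.numbers[0]  # a plain str containing digits only
print(number)
```

Because the filter drops every non-digit character, newlines included, the stored string needs no further stripping. Note that it would also pick up digits from any other text on the page.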

1 Comment

Thank you, now it's only showing the numbers. Is there any way I can remove the '\n' newline characters? I just want the output to be that number.
