
When I access www.sampleweb.com/reg/, the page contains an input element like this:

<input id="input-id" class="input-class" name="myinput" type="text" value="hello world">

How can I get the "hello world" value of that input on www.sampleweb.com/reg/ using Python?

I think accessing www.sampleweb.com/reg/ works like this:

url = 'http://www.sampleweb.com/reg/'
urlopen(url)

Is this the correct way to access the URL?

Can anyone help me with this?

Thanks in advance.

3 Answers


You can use a library called BeautifulSoup to parse the HTML.


You should parse the HTML after getting it via urllib (as you mentioned), using any Python HTML parser. For example, using BeautifulSoup: http://www.crummy.com/software/BeautifulSoup/bs3/documentation.html#find%28name,%20attrs,%20recursive,%20text,%20**kwargs%29

In your case something like this:

soup = BeautifulSoup(html)
# find() returns a single tag (or None), not a list, so read the attribute directly
input_tag = soup.find("input", {"id": "input-id"})
print(input_tag['value'])
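If installing a third-party parser is not an option, the same idea can be sketched with Python 3's standard library alone. This is only a rough illustration, not a replacement for BeautifulSoup: the names InputValueFinder, find_input_value and fetch_html are made up for this example, and the UTF-8 decoding in fetch_html is an assumption about the page.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class InputValueFinder(HTMLParser):
    """Records the value attribute of the first <input> with a given id."""

    def __init__(self, input_id):
        super().__init__()
        self.input_id = input_id
        self.value = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened
        attributes = dict(attrs)
        if (tag == "input"
                and attributes.get("id") == self.input_id
                and self.value is None):
            self.value = attributes.get("value")


def find_input_value(html, input_id):
    """Return the value of the matching input, or None if it is absent."""
    finder = InputValueFinder(input_id)
    finder.feed(html)
    return finder.value


def fetch_html(url):
    """Download a page and decode it (UTF-8 is assumed here)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


sample = ('<input id="input-id" class="input-class" '
          'name="myinput" type="text" value="hello world">')
print(find_input_value(sample, "input-id"))  # hello world
```

For the real page you would call find_input_value(fetch_html('http://www.sampleweb.com/reg/'), 'input-id'), assuming the input is present in the HTML the server returns and not added later by JavaScript.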


Please note that a DOM parser is generally the most robust way to extract data from the HTML of any resource.

However, if "hello world" is the only thing you want out of the HTML, then the quick and dirty approach would be:

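import urllib  # Python 2; on Python 3 use urllib.request.urlopen and decode the bytes
toFind = '<input id="input-id" class="input-class" name="myinput" type="text" value="'
htmlStr = urllib.urlopen('http://yoururl.com/your/path').read()
# keep everything after the marker, then cut at the closing quote of the value
value = htmlStr[htmlStr.index(toFind)+len(toFind):]
value = value[:value.index('"')]
print(value)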
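One weakness of plain string searching is that index() raises ValueError as soon as the markup differs even slightly from the marker. A slightly more defensive sketch of the same search, written for Python 3, could look like the following. The helper name value_after_marker is hypothetical, and the approach still depends on the attributes appearing in exactly this order.

```python
def value_after_marker(html, marker, quote='"'):
    """Return the text between marker and the next quote, or None."""
    start = html.find(marker)
    if start == -1:
        return None  # marker not present in the page
    start += len(marker)
    end = html.find(quote, start)
    if end == -1:
        return None  # value is never closed by a quote
    return html[start:end]


marker = ('<input id="input-id" class="input-class" '
          'name="myinput" type="text" value="')
page = '<form>' + marker + 'hello world"></form>'
print(value_after_marker(page, marker))  # hello world
```

Returning None leaves it to the caller to decide what a missing input means, which tends to be easier to handle than an exception from deep inside the slicing.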
