Check that the URL input is active in a Python script

Question

I have a Python web-scraping script, and I need to validate that the URL the user enters points to an existing website by testing connectivity to it. Can anyone help me implement this in my code?

Here's my code:

import sys, urllib

while True:
    try:
        url= raw_input('Please input address: ')
        webpage=urllib.urlopen(url)
        print 'Web address is valid'
        break
    except:
        print 'No input or wrong URL format. Usage: http://www.domainname.com/'
        print 'Please try again'
def wget(webpage):
        print '[*] Fetching webpage...\n'
        page = webpage.read()
        return page      
def main():
    sys.argv.append(webpage)
    if len(sys.argv) != 2:
        print '[-] Usage: webpage_get URL'
        return
    print wget(sys.argv[1])

if __name__ == '__main__':
    main()

EDIT: I found the code below in another Stack Overflow post. It works on its own, and I just want to integrate it into my script. I tried to integrate it myself but got errors. Can anyone help me do this? Here's the code:

from urllib2 import Request, urlopen, URLError
req = Request('http://jfvbhsjdfvbs.com')
try:
    response = urlopen(req)
except URLError, e:
    if hasattr(e, 'reason'):
        print 'We failed to reach a server.'
        print 'Reason: ', e.reason
    elif hasattr(e, 'code'):
        print 'The server couldn\'t fulfill the request.'
        print 'Error code: ', e.code
else:
    print 'URL is good!'

Looks nice, only that your while True is executed before you call main.
– Hyperboreus, Commented Dec 10, 2013 at 17:22
Yes, that's what I need, but I don't know how to implement it in my code, so I'm asking whether anyone can help me do this.
– user3034404, Commented Dec 10, 2013 at 17:29
@user3034404 A Python script is executed top to bottom. In your case: first your while loop with its suite, then the two defs (which add the functions to the scope), and finally the condition that may invoke main. In this order, your while loop runs first and your main last, provided the condition holds.
– Hyperboreus, Commented Dec 10, 2013 at 17:58

Hyperboreus · Accepted Answer · 2013-12-10 18:01:29Z

1

Maybe this snippet helps you to understand why your main is executed after the while:

print 'Checkpoint Alpha'

while True:
    print 'Checkpoint Bravo'
    if raw_input ('x for break: ') == 'x': break

print 'Checkpoint Charlie'

def main():
    print 'Checkpoint Foxtrott'

print 'Checkpoint Delta'

if __name__ == '__main__':
    print 'Checkpoint Echo'
    main()
    print 'Checkpoint Golf'

print 'Checkpoint Hotel'
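
Run as a script, the snippet above prints the checkpoints in the order Alpha, Bravo (once per pass through the loop), Charlie, Delta, Echo, Foxtrott, Golf, Hotel. The same effect can be shown without any keyboard input. The following is only a sketch, written for Python 3 rather than the Python 2 used above, and the list name order is made up for the illustration:

```python
order = []                    # records the sequence in which statements run

order.append('Alpha')         # module-level statement: runs immediately


def main():
    order.append('Charlie')   # runs only when main() is actually called


order.append('Bravo')         # runs right after the def; main's body has not run yet

main()                        # only now does the body of main execute
order.append('Delta')

print(order)                  # ['Alpha', 'Bravo', 'Charlie', 'Delta']
```

A def statement only binds the function name. Any code that should run on demand, such as the prompt loop in the question, has to live inside a function that main calls.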


1 Comment

Hyperboreus Over a year ago

@KDawG You can take the officer out of the Air Force, but you can't take the Air Force out of the officer. Tally Ho!

Arovit · Accepted Answer · 2013-12-10 17:30:23Z

0

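The following should help. Keep a list of the URLs that were already entered and check it inside the try block of your while loop:

visited = []

while True:
    try:
        url = raw_input('Please input address: ')
        if url in visited:
            print "Already visited. Continue"
            continue  # skip the fetch for a URL that was already entered
        visited.append(url)
        webpage = urllib.urlopen(url)
        [...]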


1 Comment

user3034404 Over a year ago

I don't think this is what I need. I need code that checks connectivity to the URL given by the user.
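
A minimal sketch of such a connectivity check, assuming Python 3 (the question's code is Python 2, where the same names live in urllib2). It follows the URLError pattern from the question's EDIT and keeps the prompt loop inside a function, so nothing runs before main is called. The names check_url and prompt_until_reachable, the opener, read_line and check parameters, and the 10-second timeout are illustrative choices, not part of the original posts.

```python
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


def check_url(url, opener=urlopen, timeout=10):
    """Try to open url once and return (ok, message)."""
    try:
        # Request() raises ValueError for strings without a scheme, e.g. 'example.com'.
        response = opener(Request(url), timeout=timeout)
    except HTTPError as e:
        # The server answered, but with an error status such as 404 or 500.
        return False, 'The server could not fulfill the request (HTTP %d).' % e.code
    except URLError as e:
        # No answer at all: DNS failure, refused connection, and similar.
        return False, 'We failed to reach a server: %s' % e.reason
    except ValueError as e:
        return False, 'Malformed URL: %s' % e
    except OSError as e:
        # Socket-level errors that urlopen did not wrap, such as a timeout
        # while waiting for the response.
        return False, 'Connection problem: %s' % e
    else:
        response.close()
        return True, 'URL is good!'


def prompt_until_reachable(read_line=input, check=check_url):
    """Ask for a URL until check() reports success, then return that URL."""
    while True:
        url = read_line('Please input address: ')
        ok, message = check(url)
        print(message)
        if ok:
            return url
        print('Please try again')
```

In the script from the question, main could call prompt_until_reachable() and pass the returned URL on to wget. HTTPError is caught before URLError because it is a subclass of it; a server that is reachable can still answer with an error status, so the two cases are reported separately.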
