2

I want to do some basic URL validation: if the URL is invalid, the request should not proceed until the user has entered a valid one. Update: to be clearer, I do not want the browser to be opened or the image-counter script to run unless the entered URL is valid.

    import time
    from selenium import webdriver
    from selenium.webdriver.common.keys import Keys

    user_url = input('Please enter a valid url:')
    driver = webdriver.Chrome('/home/m/Desktop/chromedriver')
    driver.get(user_url)
    HEADERS = {'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36', 'accept': '*/*'}

    time.sleep(8)

    imagecounter = driver.find_elements_by_css_selector('img')

    print('Number of HTML image tags:')
    print(len(imagecounter))

Could you please modify the code and explain what is happening? I have tried some libraries, but I think because of my poor coding skills I had no luck.

2 Comments
  • You need to define what is "valid" and what is "invalid". Commented May 26, 2020 at 14:15
  • I would suggest first validating the syntax with Python's urlparse, then doing the sleep(8), then validating the URL's response code, then finding the elements. Commented May 26, 2020 at 14:21
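
The syntax check suggested in the comment above could look roughly like the sketch below. It assumes that "valid" means an http or https scheme plus a non-empty host; the helper name `looks_like_url` is made up for illustration and is not part of any library. It only inspects the string and does not contact the server.

```python
from urllib.parse import urlparse


def looks_like_url(text):
    """Return True if text has an http(s) scheme and a non-empty host part."""
    try:
        parts = urlparse(text.strip())
    except ValueError:
        # urlparse can raise ValueError for some malformed inputs
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


for candidate in ("https://stackoverflow.com", "something.com", "sometext345"):
    print(candidate, looks_like_url(candidate))
```

A check like this can run before any request is sent, so obviously malformed input is rejected without touching the network or opening the browser.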

2 Answers

2

You can use requests to get the HTTP status code:

    import time

    import requests
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    # Send a GET request to the page; if the request fails or the
    # status code is not OK, ask for a different URL.
    def valid_url(url):
        while True:
            try:
                req = requests.get(url)
                if req.status_code == requests.codes['ok']:
                    return url
                print(f'Server returned status code {req.status_code}.')
            except requests.exceptions.RequestException as ex:
                # Covers a missing scheme, an unknown host, timeouts, etc.
                print(f'Something went wrong: {ex}')
            print('Try again!')
            url = input('Please enter a valid url:')

    url = valid_url(input('Please enter a valid url:'))  # function is called here
    driver = webdriver.Chrome()  # the browser opens only after validation
    driver.get(url)
    HEADERS = {'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36', 'accept': '*/*'}

    time.sleep(8)  # crude wait for the page to finish loading

    # find_elements(By..., ...) works in both Selenium 3 and Selenium 4
    imagecounter = driver.find_elements(By.CSS_SELECTOR, 'img')

    print('Number of HTML image tags:')
    print(len(imagecounter))

10 Comments

I am getting an error if I input something like "something.com": requests.exceptions.MissingSchema: Invalid URL 'something.com': No schema supplied. Perhaps you meant something.com?
If I input an invalid URL like https:// stackkkkkkkkkkoverflow.com I get the error: Failed to establish a new connection: [Errno -2] Name or service not known
In that case, you can add a try-except. I edited the answer.
Thanks for the edit. Unfortunately I am still getting errors: if I enter "sometext", the script prompts me to input again, but it crashes if I put in a valid URL after it.
And if I first put in an invalid URL like https:// stackover888888flow.com/, it prompts me to input a valid URL again, but if a valid URL is entered after that, the script does not return all the image tags.
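
One possible way to handle input such as "something.com", which triggers the MissingSchema error mentioned in these comments, is to add a default scheme before the request is made. This is only a sketch: the helper name `add_default_scheme` and the choice of "https" as the default are assumptions, and whether to guess a scheme at all (rather than rejecting the input) is a design decision.

```python
def add_default_scheme(text, default="https"):
    """Prefix text with a scheme if the user did not type one."""
    text = text.strip()
    if "://" not in text:
        # No scheme was typed, so assume the default one
        text = f"{default}://{text}"
    return text


print(add_default_scheme("something.com"))
print(add_default_scheme("http://example.org"))
```

The normalized string still has to pass the actual request: an input like "sometext345" becomes "https://sometext345", which should then fail with a connection error that the try-except can catch.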
0

To validate a user-provided URL before proceeding, you can use Python's requests module to check the response status, as in the following solution:

  • Code Block:

    from selenium import webdriver
    import requests
    
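    while True:
        user_url = input("Please enter a valid url:")
        try:
            req = requests.get(user_url)
        except requests.exceptions.RequestException:
            # Raised for malformed input (e.g. no scheme) and unreachable hosts
            print("Not a valid url, please try again...")
            continue
        if req.status_code == requests.codes['ok']:
            break
        print("Not a valid url, please try again...")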
    print("URL was a valid one... Continuing...")
    driver = webdriver.Chrome(executable_path=r'C:\WebDrivers\chromedriver.exe')
    driver.get(user_url)
    # perform your rest of the tasks
    
  • Console Output:

    Please enter a valid url:https://www.goodday.com
    Not a valid url, please try again...
    Please enter a valid url:https://www.goodday.com
    Not a valid url, please try again...
    Please enter a valid url:https://www.goodday.com
    Not a valid url, please try again...
    Please enter a valid url:https://www.google.com
    URL was a valid one... Continuing...
    
    DevTools listening on ws://127.0.0.1:54638/devtools/browser/975e0993-166a-4144-a05f-dcfb1d9b29a2
    

3 Comments

Thank you, but unfortunately it does not work if I enter something like "sometext345".
@ArashIzmirov You are asking the user to provide a valid URL, and a URL is normally made up of three or four components: a) a scheme, b) a host, c) a path, d) a query string. The input you are testing, i.e. sometext345, isn't a URL but a plain string. See the example I have used.
Thanks for your effort and explaining, you have helped a lot.
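
To make the components named in the comment above concrete, here is a small sketch using the standard library's `urlparse`. The sample Google search URL and the helper name `describe` are chosen only for illustration.

```python
from urllib.parse import urlparse


def describe(text):
    """Split text into the URL components mentioned above."""
    parts = urlparse(text)
    return {
        "scheme": parts.scheme,
        "host": parts.netloc,
        "path": parts.path,
        "query": parts.query,
    }


print(describe("https://www.google.com/search?q=selenium"))
print(describe("sometext345"))
```

For "sometext345" the scheme and host come back empty and the whole string ends up in the path, which is why it is not treated as a URL.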
