
I am new to Python and managed to write a small program (using Python 3) to retrieve information from a website. I have three problems:

  1. I do not know how to tell Python to wait at every 80th step, that is, when i = 80, 160, 240, etc.
  2. I do not know how to make Python retrieve from the website how many steps exist in total (this varies from page to page); see the image below. In the picture, the maximum of 260 appears to be "hard-coded" for this example. Can Python retrieve the 260 by itself (or whatever the number is on another web page)?
  3. How can I make Python check which page the script starts on, so that it can set i to that page's number? Normally I assume it starts at page 0 (i = 0), but if I were to start at page 30, the script should set i = 30, or if I start at page 200, it should set i = 200, and so on, before it enters the while loop.

Is it clear what I am struggling with?

[image]

This is the pseudo code:

import time
from selenium import webdriver

url = input('Please, enter url: ')

driver = webdriver.Firefox()
driver.get(url)

i = 0

while i > 260: # how to determine (book 1 = 260 / book 2 = 500)?
    # do something
    if i == 80: # each 80th page?
        # pause
    else:
        # do something else
    i = i + 1
else:
    quit()
  • Can you explain your 3rd question? Commented Apr 22, 2017 at 7:11
  • I edited my 3rd question. I hope it is clearer now. Sometimes it is hard to explain what one wants. ;) I have little time right now, but I will check your answers later today. Thank you all for answering! Commented Apr 22, 2017 at 9:56

2 Answers


1) sleep

import time
....     
    if i % 80 == 0: # each 80th page?
        # Wait for 5 seconds
        time.sleep(5)
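
One detail about the check above: `i % 80 == 0` is also true for `i = 0`, so a loop that starts at 0 would pause before doing any work. A minimal sketch of one way to skip that case follows. The names `should_pause`, `run`, and the `every` and `sleep` parameters are hypothetical and introduced only for this example.

```python
import time


def should_pause(i, every=80):
    """Return True on every `every`-th step (80, 160, 240, ...), but not at 0."""
    return i != 0 and i % every == 0


def run(total_steps, pause_seconds=5, sleep=time.sleep):
    """Loop over the steps and pause on every 80th one.

    `sleep` is a parameter only so that the pause can be swapped out when
    trying the loop without actually waiting.
    """
    paused_at = []
    for i in range(total_steps):
        # ... do the per-page work here ...
        if should_pause(i):
            sleep(pause_seconds)
            paused_at.append(i)
    return paused_at
```

Calling `run(261, sleep=lambda seconds: None)` returns the steps at which a pause would happen, without waiting.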

2) element selectors

html = driver.find_element_by_css_selector('afterInput').get_attribute('innerHTML')
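
A note on the selector: as a CSS selector, a bare `afterInput` matches elements whose tag name is `afterInput`, which is probably not what is on the page. If `afterInput` is actually a class or an id (an assumption, since the page's HTML is not shown here), the selector would be `.afterInput` or `#afterInput`. Recent Selenium 4 releases also no longer provide `find_element_by_css_selector`; `find_element` and `find_elements` with a locator strategy are used instead. The sketch below tries a few candidate selectors. `first_inner_html` and `CANDIDATE_SELECTORS` are hypothetical names.

```python
def first_inner_html(driver, selectors):
    """Return the innerHTML of the first element matched by any selector.

    `driver` is expected to be a Selenium WebDriver. find_elements returns an
    empty list instead of raising when nothing matches.
    """
    for selector in selectors:
        # "css selector" is the string value of selenium's By.CSS_SELECTOR
        matches = driver.find_elements("css selector", selector)
        if matches:
            return matches[0].get_attribute("innerHTML")
    return None


# Assumed candidates: afterInput as a class name, then as an id
CANDIDATE_SELECTORS = [".afterInput", "#afterInput"]
```

After `driver.get(url)`, this might be called as `html = first_inner_html(driver, CANDIDATE_SELECTORS)`; a result of `None` means none of the guesses matched and the page source needs a closer look.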

3) arguments

import sys
....
currentPage = sys.argv[2]

or extract it from the source (see 2)
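
Values in `sys.argv` are strings, and `sys.argv[2]` is the second argument after the script name, so the value needs converting with `int()` before it can be used as `i`. A minimal sketch with `argparse` follows, assuming a hypothetical optional `--start-page` flag that defaults to 0. `parse_start_page` is likewise a name made up for this example.

```python
import argparse


def parse_start_page(argv=None):
    """Read the starting page from the command line as an integer."""
    parser = argparse.ArgumentParser(description="Step through the pages of a book.")
    # --start-page is a hypothetical flag name chosen for this example
    parser.add_argument(
        "--start-page",
        type=int,
        default=0,
        help="page number the browser is on when the script starts",
    )
    args = parser.parse_args(argv)
    return args.start_page
```

In the script this would be used as `i = parse_start_page()` before the loop, with the script started as, for example, `python script.py --start-page 30` (the file name is a placeholder).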


2 Comments

Thank you very much for answering, user3804188. Unfortunately, the element selector does not work; it gives me the following error message: selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element: afterInput.
This means that the element you are looking for is not there. Check the HTML source code (driver.page_source).


First, if you want to know whether i is a multiple of 80, you can use the modulo operator (%) and check whether the result equals 0, for instance:

if i % 80 == 0:
    time.sleep(1) # One second

Second, you need to query the html you receive from the server, for instance:

from selenium import webdriver

url = input('Please, enter url: ')

driver = webdriver.Firefox()
driver.get(url)
total_pages = driver.find_element_by_css_selector('afterInput').get_attribute('innerHTML').split()[1]  # Take only the number
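
The `.split()[1]` above assumes the number is the second word of the element's text, and the result is still a string, so it would need `int()` before being passed to `range`. A slightly more defensive sketch follows, assuming the text looks something like `of 260` (the actual text on the page is not shown in the question). `parse_total_pages` is a hypothetical helper.

```python
import re


def parse_total_pages(text):
    """Return the last integer found in `text`, e.g. 'of 260' -> 260."""
    numbers = re.findall(r"\d+", text)
    if not numbers:
        raise ValueError(f"no page count found in {text!r}")
    return int(numbers[-1])
```

Taking the last number is a design choice for counters of the form "current / total"; if the total comes first on a given page, the index would need adjusting.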

After your edit: all you have to do is assign i the value you want, either by defining a variable in your script, by parsing the arguments from the command line, or by scraping it from the website. This depends on your implementation and needs.

Other notes

I know you're just starting out, but if you want to improve your code and make it a bit more pythonic, I would make the following changes:

  • Using while and i = i + 1 is not a common pattern in Python; instead use for i in range(total_pages). Of course, you need to know the number of pages as an integer (from your second question).
  • There is no need to call quit(); your script will end anyway at the end of the file.
  • I think you meant while i < 260.
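
To illustrate the first and last notes, the two functions below visit the same pages, one with `while` and one with `for` over `range`. This is a minimal sketch: the function names are hypothetical, and `start_page` and `total_pages` are assumed to be integers already.

```python
def visit_with_while(start_page, total_pages):
    """Collect page indices using the while pattern from the question."""
    visited = []
    i = start_page
    while i < total_pages:
        visited.append(i)  # stand-in for the per-page work
        i = i + 1
    return visited


def visit_with_for(start_page, total_pages):
    """Collect the same page indices using range, which also accepts a start value."""
    visited = []
    for i in range(start_page, total_pages):
        visited.append(i)  # stand-in for the per-page work
    return visited
```

Both stop before `total_pages` itself; if the last page should be included, the upper bound would be `total_pages + 1`.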

6 Comments

Thank you very much for your answer, Or Duan. What do I write instead of quit()? Nothing?
That's right, there is no point in calling it at the end; the script will end anyway.
Is while i < 260 the same as for i in range(total_pages)? Unfortunately, total_pages does not work. :/ It tells me selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element: afterInput.
Do I understand correctly that if i % 80 == 0: does something on every 80th step?
The two loops are the same in your case; the latter is more "pythonic", and total_pages is just a number (let's say 260). Your error is a different one; you might read this to understand how to wait for an element to be present. For i % 80, please read this.