Here is a solution using Selenium and Firefox:
- Open a browser window and navigate to the URL
- Wait until the Practice link appears
- Extract all span elements that hold part of the text
- Create the output string. If the first word has only one letter, there will be only 2 span elements. If it has more than one letter, there will be 3 span elements.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

url = 'http://play.typeracer.com/'
browser = webdriver.Firefox()
browser.get(url)

try:  # wait until the link is loaded
    element = WebDriverWait(browser, 30).until(
        EC.presence_of_element_located((By.LINK_TEXT, 'Practice')))
finally:  # link loaded -> click it
    element.click()

try:  # wait until the text is loaded
    WebDriverWait(browser, 30).until(
        EC.presence_of_element_located((By.XPATH, '//span[@unselectable="on"]')))
finally:  # extract the text
    # find_elements_by_xpath was removed in Selenium 4.3; use find_elements(By.XPATH, ...)
    spans = browser.find_elements(By.XPATH, '//span[@unselectable="on"]')

if len(spans) == 2:  # first word has only one letter
    text = f'{spans[0].text} {spans[1].text}'
elif len(spans) == 3:  # first word has more than one letter
    text = f'{spans[0].text}{spans[1].text} {spans[2].text}'
else:
    text = ' '.join([span.text for span in spans])
    print(f'special case that is not handled yet: {text}')

print(text)
>>> 'Scissors cuts paper. Paper covers rock. Rock crushes lizard. Lizard poisons Spock. Spock smashes scissors. Scissors decapitates lizard. Lizard eats paper. Paper disproves Spock. Spock vaporizes rock. And as it always has, rock crushes scissors.'
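If you want to check the span-joining rule without opening a browser, it can be pulled out into a small pure function. This is only a sketch: `join_span_texts` is a made-up helper name, and it takes the span texts as plain strings instead of Selenium elements.

```python
def join_span_texts(parts):
    """Rebuild the race text from the text of each span, in page order."""
    if len(parts) == 2:
        # One-letter first word: [first word, rest of the text]
        return f'{parts[0]} {parts[1]}'
    if len(parts) == 3:
        # Longer first word: [first letter, rest of first word, rest of the text]
        return f'{parts[0]}{parts[1]} {parts[2]}'
    # Unexpected layout: fall back to joining everything with spaces
    return ' '.join(parts)


print(join_span_texts(['S', 'cissors', 'cuts paper.']))  # Scissors cuts paper.
print(join_span_texts(['I', 'am here.']))                # I am here.
```

With the Selenium code it would be called as `join_span_texts([span.text for span in spans])`.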
Update
Just in case you also want to automate the typing afterwards ;)
try:
    txt_input = WebDriverWait(browser, 30).until(
        EC.presence_of_element_located((By.XPATH,
            '//input[@class="txtInput" and @autocorrect="off"]')))
finally:
    for letter in text:
        txt_input.send_keys(letter)
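The reason for the try: ... finally: blocks is that we have to wait until the content is loaded, which can sometimes take quite a while. One caveat: the finally branch also runs when a wait times out, and in that case element or txt_input was never assigned, so the script fails with a NameError.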
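To make the waiting idea concrete, here is a rough plain-Python sketch of what an explicit wait does: it polls a condition until it returns something truthy or a timeout expires. This is not Selenium's implementation, just an illustration of the idea; `wait_until`, `poll` and `fake_lookup` are hypothetical names.

```python
import time


def wait_until(condition, timeout=30.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f'condition not met within {timeout} s')
        time.sleep(poll)


# Stand-in for "is the element there yet?": succeeds on the third check.
attempts = []


def fake_lookup():
    attempts.append(1)
    return 'element' if len(attempts) >= 3 else None


element = wait_until(fake_lookup, timeout=5, poll=0.01)
print(element, len(attempts))  # element 3
```

Because a wait like this either returns the value or raises, the follow-up step (clicking, typing) can simply come on the next line after the wait.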
Another option is the requests-html package. It allows you to render a page before extracting data.