I am new to Python; I have been learning it for about three weeks.
I am trying to automate some daily tasks with Python. Here, I am trying to scrape the website https://www.germaneveryday.com/, which publishes a new German word every day along with an example sentence. My plan is to automate this instead of visiting the site every day.
I followed this online tutorial: http://docs.python-guide.org/en/latest/scenarios/scrape/
This is my code:
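from lxml import html
import requests
# Download the page and parse the raw bytes into an element tree
page = requests.get('https://www.germaneveryday.com/')
tree = html.fromstring(page.content)
# XPath copied from Chrome's "Copy XPath" for the daily word link
word = tree.xpath('//*[@id="main"]/div[1]/div[2]/div/h1/a')
print(word)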
I inspected the daily word on the website in Chrome and used right click, then "Copy XPath", to get the path that I pass to tree.xpath for the HTML element I want to extract and print with lxml.
However, the output is always either an empty list, [], or a block of HTML-related output that is meaningless to me, as shown here: https://i.sstatic.net/dAjB6.png
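To narrow down what the two outputs might mean, here is a minimal offline sketch. The HTML string is made up to match the shape of my XPath and is not the real page markup. It also uses the standard library's xml.etree.ElementTree as a stand-in for lxml so that it runs without installing anything; ElementTree supports only a subset of XPath and needs a relative path starting with a dot.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, shaped to fit the XPath; NOT the real page source.
SAMPLE = """
<html><body>
  <div id="main">
    <div>
      <div>date</div>
      <div><div><h1><a href="/word">Beispiel</a></h1></div></div>
    </div>
  </div>
</body></html>
"""

root = ET.fromstring(SAMPLE)

# Same path as in my script, written relative to the root element.
matches = root.findall(".//*[@id='main']/div[1]/div[2]/div/h1/a")
print(matches)  # a list holding one Element object, not the word itself

# The visible text has to be read from each matched element.
texts = [a.text for a in matches]
print(texts)

# A path that matches nothing gives an empty list instead of an error.
missing = root.findall(".//*[@id='main']/div[1]/div[3]/div/h1/a")
print(missing)
```

If lxml behaves the same way, the "meaningless" output may just be the printed element objects rather than their text. That alone would not explain why I sometimes get an empty list from the real page.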
My question is: what is wrong here? Is it the XPath, or does the website have some kind of layer over the HTML?
(Please excuse my imprecise terms, such as "layer" or "address of XPath".)
My system info:
- Windows 7 (x86)
- Python 3.6.5
- Chrome 66.0.3359.181