I am trying to get the HTML content of a URL using requests.get in Python, but I am getting an incomplete response.

import requests
from lxml import html


url = "https://www.expedia.com/Hotel-Search?destination=Maldives&latLong=3.480528%2C73.192127&regionId=109&startDate=04%2F20%2F2018&endDate=04%2F21%2F2018&rooms=1&_xpid=11905%7C1&adults=2"
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/46.0.2490.80 Safari/537.36',
    'Content-Type': 'text/html',
    }

response = requests.get(url, headers=headers)
print(response.content)

Can anyone suggest what changes are needed to get the complete response?

NB: Using Selenium I am able to get the complete response, but that is not the recommended way.

  • What is missing? I guess that the site uses JavaScript to change the page in the browser. Requests only fetches the raw HTML; it does not execute JavaScript. Commented Apr 18, 2018 at 9:26
  • @LutzHorn Thanks for the reply. When I inspect elements in the web page I can see all the HTML elements, so why do they not appear in the HTML response? Commented Apr 18, 2018 at 9:33
  • Without JavaScript this is how the page looks. This probably is not what you expect. Commented Apr 18, 2018 at 9:35
  • Inspect Element shows the content after the JavaScript has executed. Use View Source in the browser and see what you get: you will see very few lines, which in turn call JavaScript to populate the page. Commented Apr 18, 2018 at 9:35
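
To make the point in these comments concrete, here is a minimal sketch using only the standard library. `RAW_HTML` is a made-up stand-in for a JavaScript-driven page (it is not Expedia's real markup), and the names `TextOutsideScripts` and `static_text` are hypothetical. It collects the text a client that does not run JavaScript, such as requests, would actually find in the source:

```python
from html.parser import HTMLParser

# Hypothetical stand-in for a JavaScript-driven page: the results container
# is empty in the source and would only be filled in by the script at runtime.
RAW_HTML = """
<html><body>
  <div id="results"></div>
  <script>
    document.getElementById("results").innerHTML = "<p>Hotel A</p>";
  </script>
</body></html>
"""


class TextOutsideScripts(HTMLParser):
    """Collect the text visible to a client that skips <script> bodies."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Keep non-blank text only when we are not inside a <script> element.
        if not self.in_script and data.strip():
            self.chunks.append(data.strip())


def static_text(raw_html):
    """Return the text present in the markup without executing any script."""
    parser = TextOutsideScripts()
    parser.feed(raw_html)
    parser.close()
    return " ".join(parser.chunks)


print(repr(static_text(RAW_HTML)))
```

The string "Hotel A" exists only inside the script body, so the static text comes back empty. A browser's Inspect Element would show the paragraph because it displays the DOM after the script has run, while View Source and `requests.get` both see only the raw markup.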

1 Answer

If you need content that is generated dynamically by JavaScript and you don't want to use Selenium, you can try the requests-html library, which can render JavaScript:

from requests_html import HTMLSession

session = HTMLSession()
url = "https://www.expedia.com/Hotel-Search?destination=Maldives&latLong=3.480528%2C73.192127&regionId=109&startDate=04%2F20%2F2018&endDate=04%2F21%2F2018&rooms=1&_xpid=11905%7C1&adults=2"
r = session.get(url)
r.html.render()

# r.content still holds the original response bytes; the rendered markup is in r.html.html
print(r.html.html)
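
If the output still matches the raw response, two things are worth checking: that the rendered markup is read from `r.html.html` rather than `r.content`, and that the page is given enough time for its scripts to finish. The following is a rough sketch, assuming requests-html is installed and can download Chromium on its first run. The helper name `fetch_rendered_html` and the delay values are illustrative assumptions, not something from the library or from this thread, and a site with bot detection may still serve a reduced page.

```python
def fetch_rendered_html(url, sleep=3, timeout=30):
    """Return the page HTML after requests-html has executed its JavaScript.

    Hypothetical helper: `sleep` and `timeout` are in seconds and are guesses
    that will need tuning for each site.
    """
    # Imported lazily so this module still loads where requests-html is absent.
    from requests_html import HTMLSession

    session = HTMLSession()
    try:
        response = session.get(url)
        # render() loads the page in headless Chromium; `sleep` pauses after
        # the initial render so late-running scripts can populate the DOM.
        response.html.render(sleep=sleep, timeout=timeout)
        # The rendered markup lives on response.html.html, not response.content.
        return response.html.html
    finally:
        session.close()
```

Calling `fetch_rendered_html(url)` with the URL from the question would then return the rendered markup as a string, which can be passed on to lxml or searched directly.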
1 Comment

Thanks for the reply. The result after rendering is the same as the first HTML content; the JavaScript-rendered content is not fetched. I even tried adding a delay for the render. Is there anything more to be done here? @Andersson
