Incomplete HTML Content Using Python requests.get

Question

I am trying to get the HTML content of a URL using requests.get in Python, but I am getting an incomplete response.

import requests
from lxml import html


url = "https://www.expedia.com/Hotel-Search?destination=Maldives&latLong=3.480528%2C73.192127&regionId=109&startDate=04%2F20%2F2018&endDate=04%2F21%2F2018&rooms=1&_xpid=11905%7C1&adults=2"
headers = {
    # The User-Agent value must be one string; adjacent literals are concatenated.
    'User-Agent': ('Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 '
                   '(KHTML, like Gecko) Chrome/46.0.2490.80 Safari/537.36'),
    'Content-Type': 'text/html',
}

response = requests.get(url, headers=headers)
print(response.content)

Can anyone suggest what changes are needed to get the complete response?

NB: Using Selenium I am able to get the complete response, but that is not the recommended way.

What is missing? I guess that the site uses JavaScript to change the page in the browser. Requests only fetches the raw HTML; it does not execute JavaScript.
– user9455968, Commented Apr 18, 2018 at 9:26
@LutzHorn Thanks for the reply. By inspecting elements in the web page I am able to see all the HTML elements, so why are they not in the HTML response?
– Suhail Moideen, Commented Apr 18, 2018 at 9:33
Without JavaScript this is how the page looks. This probably is not what you expect.
– user9455968, Commented Apr 18, 2018 at 9:35
Inspect Element shows the content after JavaScript has executed. Use View Source in the browser and see what you get. You will see very few lines, which in turn call JavaScript to populate the web page.
– vishal, Commented Apr 18, 2018 at 9:35
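
The point made in these comments can be checked without a browser by looking at how much visible text the raw HTML contains compared with how many scripts it loads. The following is only a rough sketch using the standard library; the names `PageSummary`, `summarize`, and the `SAMPLE` markup are made up for illustration and are not taken from the Expedia page.

```python
from html.parser import HTMLParser


class PageSummary(HTMLParser):
    """Counts <script> tags and collects text found outside <script>/<style>."""

    def __init__(self):
        super().__init__()
        self.script_count = 0
        self._skip_depth = 0      # > 0 while inside <script> or <style>
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.script_count += 1
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not script or style source.
        if not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def summarize(html_text):
    """Return the script count and the visible text of an HTML string."""
    parser = PageSummary()
    parser.feed(html_text)
    parser.close()
    return {
        "scripts": parser.script_count,
        "visible_text": " ".join(parser.text_parts),
    }


# Hypothetical markup standing in for a JavaScript-driven page.
SAMPLE = """<html><head><title>Hotels</title><script src="app.js"></script></head>
<body><div id="root"></div><script>render()</script></body></html>"""

summary = summarize(SAMPLE)
print(summary)
```

With a real page, passing `response.text` from `requests.get` to `summarize` gives the same kind of summary. Many scripts and very little visible text suggest the content is filled in by JavaScript after the page loads.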

Andersson · Accepted Answer · 2018-04-18 09:37:48Z

6

If you need content that is generated dynamically by JavaScript and you don't want to use Selenium, you can try the requests-html library, which supports JavaScript:

from requests_html import HTMLSession

session = HTMLSession()
url = "https://www.expedia.com/Hotel-Search?destination=Maldives&latLong=3.480528%2C73.192127&regionId=109&startDate=04%2F20%2F2018&endDate=04%2F21%2F2018&rooms=1&_xpid=11905%7C1&adults=2"
r = session.get(url)
r.html.render()

# r.content still holds the raw response body; the rendered markup is in r.html.html
print(r.html.html)

1 Comment

Suhail Moideen Over a year ago

Thanks for the reply. The result after rendering is the same as the first HTML content; the JavaScript-rendered content is not fetched. I even tried adding a time delay for the render. Is there anything more to be done here? @Andersson
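
One thing worth checking in this situation: after `render()`, requests-html puts the rendered markup in `r.html.html`, while `r.content` and `r.text` still hold the body as the server sent it. The sketch below returns both so they can be compared. The helper name `fetch_rendered_html` and the default values are assumptions for illustration; `sleep` and `timeout` are parameters of `render()`, but the values the site needs, if rendering helps at all, are not known from this thread.

```python
def fetch_rendered_html(session, url, sleep=2, timeout=20):
    """Return (raw_html, rendered_html) for url.

    `session` is expected to behave like requests_html.HTMLSession.
    The sleep/timeout defaults are illustrative guesses, not tuned values.
    """
    r = session.get(url)
    raw_html = r.text                      # body exactly as the server sent it
    # Runs the page's JavaScript in headless Chromium, then waits `sleep` seconds.
    r.html.render(sleep=sleep, timeout=timeout)
    rendered_html = r.html.html            # markup after JavaScript has run
    return raw_html, rendered_html


# Example use (needs requests-html installed and network access):
#
#     from requests_html import HTMLSession
#     raw, rendered = fetch_rendered_html(HTMLSession(), url)
#     print(len(raw), len(rendered))
```

If the two strings are the same length, rendering added nothing, and the site may be loading its data in a way a short wait does not capture.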
