Python http get - cannot replicate a curl request with headers

Question

I have the following curl command:

curl -H "User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Firefox/38.0" -H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8" -H "Connection: keep-alive" -X GET http://example.com/en/number/111555000

Unfortunately, I was not able to replicate it in Python. I tried:

   import urllib2

   url = 'http://example.com/en/number/111555000'
   headers = {'User-Agent':'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Firefox/38.0', 'Accept':'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8', 'Connection':'keep-alive',}
   req = urllib2.Request(url, None, headers)
   resp = urllib2.urlopen(req)
   print resp.read()

but the server somehow recognizes that the request is "fake" and redirects me to Google (the reply from the server is HTTP/1.1 301 Moved Permanently). With curl, I receive the original page instead.

Any ideas or suggestions? Thank you, dk

EDIT: some additional information:

$ nc example.com 80 
GET /en/number/111555000 HTTP/1.1
Host: example.com

HTTP/1.1 301 Moved Permanently
Date: Fri, 29 May 2015 18:51:05 GMT
Server: Apache
X-Powered-By: PHP/5.5.24
Location: http://www.google.de
Content-Length: 0
Content-Type: text/html


$ nc example.com 80 
GET /en/number/111555000 HTTP/1.1
Host: example.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Firefox/38.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Connection: keep-alive

HTTP/1.1 200 OK
Date: Fri, 29 May 2015 18:57:56 GMT
Server: Apache
X-Powered-By: PHP/5.5.24
Set-Cookie: session=a%3A4%3A%7Bs...
Set-Cookie: session=a%3A4%3A%7Bs...
Keep-Alive: timeout=2, max=200
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html; charset=UTF-8

1c6f8
<!DOCTYPE html>
[...]

With curl:

$ curl -X GET http://example.com/en/number/111555000
$

$ curl -H "User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Firefox/38.0" -H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8" -H "Connection: keep-alive" -X GET http://example.com/en/number/111555000
<!DOCTYPE html>
[...]
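
The raw nc exchange above can also be reproduced from Python at a lower level. This is only a sketch, written for Python 3 and assuming the standard-library `http.client` module is acceptable; `raw_get` and `HEADERS` are names made up for this illustration. `http.client` sends the headers it is given (plus `Host` and `Accept-Encoding`) and does not follow redirects, so a 301 shows up as a 301 instead of being silently followed.

```python
import http.client

# The same headers that were typed into the nc session.
HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Firefox/38.0",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Connection": "keep-alive",
}


def raw_get(host, path, headers, port=80, timeout=10):
    """Send a single GET and return (status, Location header, body).

    Redirects are not followed, so the caller sees the server's first reply.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path, headers=headers)
        resp = conn.getresponse()
        body = resp.read()
        return resp.status, resp.getheader("Location"), body
    finally:
        conn.close()
```

Calling `raw_get("example.com", "/en/number/111555000", HEADERS)` and then calling it again with `{}` as the headers should mirror the two nc runs, which makes it easier to narrow down which header the server is reacting to.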

What happens if you use curl without headers? Are you sure the server accepts headers like that?
– Rcynic, Commented May 29, 2015 at 18:42
Nothing happens, no answer: $ curl -X GET http://example.com/en/number/111555000 $
– d82k, Commented May 29, 2015 at 18:49

Community · Accepted Answer · 2017-05-23 10:28:27Z

I can get it to work with the requests library, which is probably the better choice anyway.

import requests

url = "http://example.com/en/number/111555000"
headers = {'User-Agent':'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Firefox/38.0', 'Accept':'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8', 'Connection':'keep-alive',}
# requests.get returns a Response object; .text holds the decoded body
resp = requests.get(url, headers=headers)
print(resp.text)

See the requests library documentation for details.

Hope it helps.
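
As for why the urllib2 attempt may behave differently: in CPython, urllib2 (urllib.request in Python 3) sets `Connection: close` on every request it sends, replacing whatever value you pass in. I can't say for certain that this is what the server checks, but it does mean the request on the wire is not the one curl sends. Here is a small Python 3 sketch to see it for yourself; it talks to a throwaway local server rather than the real site, and `EchoHeaders` and `headers_seen_by_server` are names invented for this demo.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHeaders(BaseHTTPRequestHandler):
    """Hypothetical local test server: replies with the request headers as JSON."""

    def do_GET(self):
        body = json.dumps(dict(self.headers.items())).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def headers_seen_by_server(headers):
    """Send `headers` with urllib.request and return what the local server received."""
    server = HTTPServer(("127.0.0.1", 0), EchoHeaders)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        url = "http://127.0.0.1:%d/en/number/111555000" % server.server_address[1]
        req = urllib.request.Request(url, headers=headers)
        # An empty ProxyHandler makes sure the request goes straight to localhost.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(req, timeout=10) as resp:
            return json.loads(resp.read().decode("utf-8"))
    finally:
        server.shutdown()
        server.server_close()


seen = headers_seen_by_server({
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Firefox/38.0",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Connection": "keep-alive",
})
# Compare this with the "keep-alive" value that was passed in.
print(seen.get("Connection"))
```

If the server really does key on the Connection header, that would explain why curl and requests get the page while urllib2 gets the redirect.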

answered May 29, 2015 at 19:02 by Rcynic; edited May 23, 2017 at 10:28 by Community Bot
