I am looking for a library or function call in Python (or an associated library) that lets me feed in a raw stream of text representing an HTTP request or response and that returns that information in some meaningful form, such as a dictionary or list. I do not want to use a built-in class that makes the request for me, or create a bunch of new objects. My program receives raw data, and that is all I have to work with. Is there already a solution for this, or do I have to write an HTTP parser myself?

Edit: Let me clarify what exactly I'm looking to do. I'm looking for something that would take a string like:

GET /index.html HTTP/1.1 \r\n
Host:www.stackoverflow.com \r\n
User-Agent:Firefox \r\n
etc.

And send me back something encapsulating the method, HTTP version, headers and all the rest.

3 Answers

There is a pure-Python HTTP parser that ships as the fallback for the C/Cython-optimized implementation in the http-parser project.

Here is the pure-Python version:

Here is the source of the C version and the Cython wrapper:

I believe httplib is the library you are looking for: http://docs.python.org/library/httplib.html. It was renamed to http.client in Python 3, but is otherwise good to go.
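The header parsing in this module can be used on its own, without opening a connection. What follows is a minimal sketch, not a full HTTP parser. It assumes Python 3, and it assumes the whole request is already in memory as bytes with CRLF line endings. `parse_raw_request` is a hypothetical helper name. `parse_headers` is a module-level function in `http.client` that, as far as I can tell, is not described in the official documentation, so relying on it is an assumption.

```python
import io
from http.client import parse_headers  # module-level helper; assumed stable, not documented


def parse_raw_request(raw: bytes) -> dict:
    """Split a raw HTTP request into its request line, headers, and body.

    Hypothetical helper for illustration: no validation, no chunked
    bodies, no support for obsolete folded headers.
    """
    # The head and the body are separated by the first blank line.
    head, _, body = raw.partition(b"\r\n\r\n")
    # The first line of the head is the request line; the rest are headers.
    request_line, _, header_block = head.partition(b"\r\n")
    method, target, version = request_line.decode("iso-8859-1").split()

    # parse_headers reads header lines from a binary file object until it
    # hits a blank line, so put the terminating blank line back.
    message = parse_headers(io.BytesIO(header_block + b"\r\n\r\n"))

    return {
        "method": method,
        "target": target,
        "version": version,
        # Note: a plain dict keeps only the last value of a repeated header.
        "headers": dict(message.items()),
        "body": body,
    }


if __name__ == "__main__":
    raw = (
        b"GET /index.html HTTP/1.1\r\n"
        b"Host: www.stackoverflow.com\r\n"
        b"User-Agent: Firefox\r\n"
        b"\r\n"
    )
    parsed = parse_raw_request(raw)
    print(parsed["method"], parsed["target"], parsed["version"])
    print(parsed["headers"])
```

The object returned by `parse_headers` is an `email.message.Message` subclass, so if case-insensitive lookups or repeated headers matter, it may be better to keep that object rather than converting it to a dict.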

2 Comments

I looked at that but could not quite find what I needed. Correct me if I'm wrong, but doesn't that lib revolve around actually making/receiving requests? I don't want to make/receive any requests, I just want to look at raw data. Could you give an example of the method you believe would do this?
Well, the HTTP request, when you receive it, contains the raw header data, and you use this library to create a header dictionary. This is what your post describes. If you are looking to receive raw text data over a socket, you might try docs.python.org/library/socket.html, but you will be reinventing a lot of the wheel. Conversely, if you are receiving the raw text and want a way to parse it into a valid request header, you can try deron.meranda.us/python/httpheader/… but I have not tried this myself.
I'd start by looking at WebOb. I think the cgi module in the standard library also has an HTTP parser.

3 Comments

Sweet, webob.Request.accept handles this perfectly: pythonpaste.org/webob/reference.html#accept-headers
@Wahnfrieden, I am confused, though, about how to take a raw HTTP request held in a string, as shown in the question, and turn it into a WebOb object. I do not see anything in your link that suggests it is possible. Could you share how you turn HTTP request strings into WebOb objects? (Because I need to on one of my projects!) :)
@Brandon sorry I commented prematurely - WebOb parses the part of the header that I needed (just the value of the Accept header), but I don't know about the rest of it.
