
I'm writing an HTTP server in C. Note that I DO KNOW how to parse a string in an HTTP request format. This means I can easily parse out the headers after I've received them.

My struggle is the following:

The HTTP protocol is built on top of TCP sockets, so the request sent by the client is not guaranteed to be delivered in one piece after just one read() operation. Thus, I need to read the request up to the end of the headers, get Content-Length, and then proceed to read() the body knowing how much data I should read.

I'm using nonblocking IO in case that matters to some readers.

For this, I have two ideas, each of which has its serious drawbacks.

  1. read() one byte at a time, checking after each read() whether the buffer now ends with "\r\n\r\n". Then get Content-Length and read the body. Highly inefficient due to the number of read() syscalls.

  2. read() into a buffer in larger chunks, checking after each read() whether the end of the headers has arrived by using strstr() to find the "\r\n\r\n" substring. When "\r\n\r\n" is found, save the number of chars read that follow it in a variable n, and get Content-Length. Then proceed to read Content-Length - n more chars (sketched below). Also inefficient due to having to call strstr() after every read().
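Roughly, the second idea looks like this (only a sketch, assuming fd is the client socket, with real error handling elided):

    char buf[8192];
    size_t used = 0;
    for (;;) {
        ssize_t n = read(fd, buf + used, sizeof buf - used - 1);
        if (n <= 0) break;                 /* real code: handle EAGAIN and close */
        used += (size_t)n;
        buf[used] = '\0';                  /* strstr() needs a terminated string */
        char *end = strstr(buf, "\r\n\r\n");
        if (end) {
            size_t n_after = used - (size_t)(end + 4 - buf); /* body bytes already read */
            /* parse Content-Length here, then read Content-Length - n_after more */
            break;
        }
    }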

Any suggestions as to how to do it more efficiently?

IMPORTANT! I understand that the second approach is better. I'm looking for some new suggestions that are better than mine.

  • Go with the latter. Fewer system calls, and you have to check every byte you read anyway. Commented Nov 8, 2023 at 13:15
  • Do the latter. Typically, people use a read buffer size equal to the page size, 4 KB. Commented Nov 8, 2023 at 13:17
  • Do the latter, but start the strstr() near the end of where the buffer was before reading the new part instead of starting each time at the beginning (len - 3 in order to match 4 bytes, assuming at least 1 new byte was read). Commented Nov 8, 2023 at 13:29
  • Does this answer your question? stackoverflow.com/questions/41286260/… Commented Nov 8, 2023 at 14:08
  • The latter, but cut the strstr() part; there's no guarantee you'll get the end-of-header marker within a single read() call. You might only read up to \r\n or so within one chunk. Commented Nov 8, 2023 at 14:21

3 Answers


Your question is about optimization

You ask about parsing HTTP headers, and you describe your method, which, to sum up, is:

  • You read in all the headers
  • You pass all the headers to a header parsing function

You then explain your technique for finding the end of all the headers, and you end with the question:

Any suggestions as to how to do it more efficiently?

With no specific restriction on how to do it more efficiently.

You need to perform profiling before optimizing

You need to measure the performance of your software and identify the hot spots in your code: places where the code takes longer than you expect.

For the sake of argument, let us assume you are efficiently reading in socket data in chunks to create a potential HTTP message header block, and header parsing turns out to be a hot spot in your software.

Potential problems with your approach

If your header parsing turns out to be a bottleneck, and you believe it is related to scanning for the end of the headers, then you need a hypothesis for why it is a bottleneck at all.

Since you don't post any code, we are forced to speculate based on your general description. However, here are some guesses:

  • Inefficient scanning with strstr()

    This guess is based on the fact that you are treating your HTTP headers as C strings by using a string function. Therefore you have to nul-terminate (add '\0' to the end of your header data) and then do a needle-in-a-haystack search for "\r\n\r\n".

    You are sort of required to start from the beginning, because HTTP pipelining might stuff more than one request into a single received buffer. The comparison test code has to scan for the nul terminator before it knows it is done, so that makes the loop condition a little more complicated.

  • Possible O(n²) behavior

    Based on the assumption that profiling has shown the scanning to be a bottleneck, one cause may be that you are concatenating your newly received data against your previously received data so that you can do the strstr() call again to find the end of the headers, since you did not find it before.

    If you are using strcat() to append your new buffer, converted into a string, onto your old buffer, converted into a string, then this causes another scan of your old data to find the end of the old buffer in order to perform the concatenation.

    If you are starting your strstr() call from the beginning again after concatenation, this causes another scan of your old data to find the "\r\n\r\n" that you had already figured out was not there. (A sketch of this pattern follows the list.)
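For concreteness, here is what that suspected pattern might look like. This is purely illustrative guesswork, not code from the question, and it deliberately omits overflow checks:

    #include <string.h>
    #include <sys/socket.h>

    /* Suspected quadratic pattern (illustration only): every iteration
       re-walks all previously received data twice, once inside strcat()
       and once inside strstr(). */
    static void read_headers_slowly(int sock)
    {
        char headers[16384] = "";
        char chunk[4096];
        ssize_t n;
        while ((n = recv(sock, chunk, sizeof chunk - 1, 0)) > 0) {
            chunk[n] = '\0';
            strcat(headers, chunk);          /* scans old data to find its end */
            if (strstr(headers, "\r\n\r\n")) /* scans old data again from the start */
                break;
        }
    }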

Resolving the hypothetical issues

Since we have no code, it is kind of silly to provide fixes to made-up problems. However, for the sake of posterity, even if it doesn't apply to your problem, it is still helpful to give a complete answer.

  • You can just scan for '\n' instead and see if it is followed by an empty line.

    This simplifies the bulk of the scanning work to be a simple character comparison, and will have nice linear behavior.

  • Do not use strcat() to concatenate.

    Your header data should be copied into a buffer that is expected to hold all the headers most of the time (maybe something like 16 KB). When concatenating, you should simply jump to the end of what was previously scanned and copy the newly read data from that point.

  • Do not scan for the end of headers from the beginning.

    Instead, after concatenation is done, start your scan from the point where you left off, which would be the beginning of the newly read/copied data. (A sketch combining these three points follows the list.)
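A minimal sketch combining the three points above, assuming the caller keeps buf, used, and scanned across reads (none of these names come from the question):

    #include <string.h>

    /* Returns the offset just past "\r\n\r\n", or 0 if it has not arrived
       yet. Resumes from *scanned instead of rescanning from the start; the
       3-byte backup catches a marker that straddles two reads. */
    static size_t find_header_end(const char *buf, size_t used, size_t *scanned)
    {
        size_t i = (*scanned > 3) ? *scanned - 3 : 0;
        for (; i + 3 < used; i++)
            if (memcmp(buf + i, "\r\n\r\n", 4) == 0)
                return i + 4;
        *scanned = used;
        return 0;
    }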

Don't parse twice

If you have removed the above issues, then after profiling, you should see reasonably good performance from your header parsing.

However, it is still possible to be more efficient. Since the suggestion above is already going through the work of identifying the end of each header line, you could just build your header parser into that scanning loop.

After finding the \n, the previous character should be \r, so you know the length of the header line just scanned. Since you are just looking for \n now, you can use memchr() instead of strstr(), which avoids the need to nul-terminate your input.

If you stash the location of each end of line, you also have the beginning of the next header line.

You know you are done parsing headers when you reach an empty header line.

This allows you to parse the headers and find the end of the headers in a single scan of the input.
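As a sketch of that single scan (on_header_line() is a hypothetical callback standing in for your own header parser):

    #include <stddef.h>
    #include <string.h>

    /* Hypothetical callback: receives one header line without its CRLF.
       Note the first line handed over will be the request line. */
    void on_header_line(const char *line, size_t len);

    /* Returns 1 once the empty line ending the headers is seen, 0 if more
       data is needed. Assumes CRLF line endings. */
    static int scan_headers(const char *buf, size_t len)
    {
        const char *p = buf;
        const char *end = buf + len;
        const char *nl;
        while ((nl = memchr(p, '\n', (size_t)(end - p))) != NULL) {
            size_t linelen = (size_t)(nl - p);  /* includes the trailing '\r' */
            if (linelen == 1 && p[0] == '\r')
                return 1;                       /* empty line: end of headers */
            if (linelen > 0)
                on_header_line(p, linelen - 1); /* strip the '\r' */
            p = nl + 1;
        }
        return 0;                               /* incomplete: need more data */
    }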

Don't copy data

Instead of performing data concatenation, you can just allocate a large buffer to represent the header block and use that same large buffer for your recv() calls. This avoids the need to concatenate: you call recv() directly into the buffer at the offset just past the previous chunk read, whenever you have not yet found the end of the headers.

    offset = 0;
    bufsz = BUFSZ;
    while (NOT_END_OF_HEADERS) {
        if (offset == bufsz)
            HANDLE_FULL_BUFFER;  /* headers exceed the buffer: fail or grow */
        n = recv(sock, buf + offset, bufsz - offset, 0);
        if (ERROR_OR_NEED_TO_STOP) HANDLE_IT;
        RESUME_PARSE(buf + offset, buf + offset + n);  /* only the new bytes */
        offset += n;
    }

Don't waste memory

Regular application programs normally don't worry too much about using too much memory. They are relatively short-lived, and so the memory used is relinquished back to the system relatively quickly.

However, long-running programs on embedded systems usually need to be much more miserly. So in addition to CPU profiling, the system will be memory profiled as well, and memory hogs will be scrutinized.

This is why embedded software will typically use buffer structures that link together smaller buffers to represent the streamed message, rather than one contiguous large buffer. This allows better right-sizing of memory usage and can avoid issues related to memory fragmentation.
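Such a chain might look something like this (an illustrative shape, not any particular library's type):

    #include <stddef.h>

    /* One fixed-size link in a chain of receive buffers. */
    struct buf_chain {
        struct buf_chain *next;   /* next chunk of the message, or NULL */
        size_t len;               /* bytes used in data[] */
        unsigned char data[2048]; /* right-sized chunk */
    };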

The moral of the story

Optimizations should first be approached with measurement. When actually doing optimizations, the solutions will not always be straightforward. However, resolving a performance bottleneck can be very satisfying for a developer.


5 Comments

I can parse out the headers all at once easily. I just need to read them all first, which is the problem.
Since you need to parse the headers, why scan the input twice?
Because it is way harder to do the parsing in chunks. I haven't managed to devise an efficient algorithm for that. For example, if all I get in a read() is Content-Len, I cannot put it inside my hashmap. I have to store it somewhere and insert it later in another call, when I receive enough data. At least I can't do that unless I modify my hashmap to the extent it no longer looks like a hashmap, and the performance of such a structure will degrade. But if you have some ideas as to how to perform parsing in chunks, I'd be glad to hear them.
Then you have a new question about how to efficiently parse chunks of data as a stream. But it is not all that different from what I described about storing locations. The definition of a location becomes a pointer to a chunk and an offset. If you have some partially parsed input, split your chunk into two at the beginning of the partial data, placing the partial data at the start of a new chunk.
You try to tweak your question away from the provided answers, but I think you have received all the help that can be had given how you have posed your question.

Talking about content, I recommend reading byte by byte and letting the user define the maximum size, because a connection can close at any time, and reading a fixed size would rule out file uploads or a larger Content-Length. If you want to take a look at my web server, you can copy whatever part you want into yours: https://github.com/OUIsolutions/CWebStudio

The part where I parse the HTTP request is in request/request_parser, CwebHttpRequest_parse_http_request.

6 Comments

The problem with this approach is that read() is an expensive syscall. Any other positive effect you mentioned, as I understood it, can easily be handled by non-blocking IO, since the OS likely won't let a read operation take minutes before blocking.
Although I've verified it multiple times, I can't be 100% sure about my latter statement; I never saw this mentioned in the manuals.
Reading a fixed size does not behave as you claim. It blocks until some data is available and then returns that data, however much it is.
@Sgg8 That's exactly what I said. That's why this answer is incorrect.
@user207421 I thought you were addressing me when talking about "does not behave as you claim". It was a misunderstanding.

You can have both: read in large chunks into your buffer, and read byte by byte from this buffer. You will have a minimal number of system calls, and you will have the simplicity of a single-char (or line) parser. You do not need to look for substrings (i.e., call strstr()).
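A minimal sketch of such a buffered reader, assuming a nonblocking fd (names are illustrative):

    #include <unistd.h>

    /* Fill in big chunks, consume one byte at a time. */
    struct rdbuf { char data[4096]; size_t pos, len; };

    /* Returns the next byte, or -1 on EOF, error, or EAGAIN (no data yet). */
    static int next_byte(int fd, struct rdbuf *b)
    {
        if (b->pos == b->len) {          /* buffer drained: refill */
            ssize_t n = read(fd, b->data, sizeof b->data);
            if (n <= 0)
                return -1;
            b->pos = 0;
            b->len = (size_t)n;
        }
        return (unsigned char)b->data[b->pos++];
    }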

6 Comments

How do I know when to stop reading using this approach?
@Sgg8 You keep track of the number of bytes read. Very simple. But I do not really understand your question.
I don't know the size of the request ahead of time
@Sgg8 You do not need to know. What do you need it for? Implementation of such a FIFO buffer is very simple, and you should do it as an exercise before trying to write web servers.
I need to know the size in order to know when to stop reading and move on to writing a response.
