Parse HTTP headers in C++

Question

I am using curl to communicate with a server.

When I request data, I receive the HTTP headers followed by JPEG data, separated by a boundary. (The original post showed a screenshot of the response here.)

I need to parse out:

- The boundary string
- The Content-Length

I have copied the incoming data to a char array like so:

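static size_t OnReceiveData ( void * pvData, size_t tSize, size_t tCount, void * pvUser )
{
    // Total number of bytes in this chunk (libcurl documents tSize as always 1)
    const size_t tTotal = tSize * tCount;

    // The '*' precision argument must be an int, and %s expects a char pointer
    printf("%.*s", (int)tTotal, (const char*)pvData);

    char* _data = nullptr;
    if(pvData != nullptr && 0 != tTotal)
    {
        _data = new char[tTotal];
        memcpy(_data, pvData, tTotal);
    }

    // ... inspect _data here; it is NOT null-terminated ...
    delete[] _data;  // release the copy so each callback does not leak

    return ( tTotal );
}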

How can I best do this in C++? How do I actually inspect and parse the _data array for the information that I want? Are there any Boost libraries that I can use, for example?

An answer not using boost or anything would be highly appreciated. (Jonny, commented Oct 29, 2015 at 1:08)

Grigorii Chudnov · Accepted Answer · 2014-09-17 20:24:02Z

5

You could parse the headers on the fly, or put them into a map and post-process them later. Use the find and substr methods of std::string. Also look at the Boost String Algorithms Library; it contains many useful algorithms, e.g. trim.

For example, to place the headers into a std::map and print them (a rough cut):

#include <stdlib.h>
#include <iostream>
#include <sstream>
#include <string>
#include <map>
#include <boost/algorithm/string.hpp>

int main(int argc, char* argv[]) {
  const char* s = "HTTP/1.1 200 OK\r\n"
    "Content-Type: image/jpeg; charset=utf-8\r\n"
    "Content-Length: 19912\r\n\r\n";

  std::map<std::string, std::string> m;

  std::istringstream resp(s);
  std::string header;
  std::string::size_type index;
  while (std::getline(resp, header) && header != "\r") {
    index = header.find(':', 0);
    if(index != std::string::npos) {
      m.insert(std::make_pair(
        boost::algorithm::trim_copy(header.substr(0, index)), 
        boost::algorithm::trim_copy(header.substr(index + 1))
      ));
    }
  }

  for(auto& kv: m) {
    std::cout << "KEY: `" << kv.first << "`, VALUE: `" << kv.second << '`' << std::endl;
  }

  return EXIT_SUCCESS;
}

You will get the output:

KEY: `Content-Length`, VALUE: `19912`
KEY: `Content-Type`, VALUE: `image/jpeg; charset=utf-8`

Having the headers, you could extract the required ones for post-processing.
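
To make that post-processing step concrete for the two values the question asks about, here is a minimal sketch. It is not part of the original answer: the helper names `extractBoundary` and `parseContentLength` are hypothetical, and it assumes the boundary arrives as a lowercase `boundary=` parameter inside the Content-Type value, as is usual for multipart responses.

```cpp
#include <cctype>
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the "boundary" parameter of a Content-Type
// value such as "multipart/x-mixed-replace; boundary=myboundary".
// Returns an empty string when no boundary parameter is present.
// Assumption: the parameter name is spelled in lowercase.
std::string extractBoundary(const std::string& contentType) {
  const std::string key = "boundary=";
  std::string::size_type pos = contentType.find(key);
  if (pos == std::string::npos) return "";
  pos += key.size();

  // The parameter value runs up to the next ';' or the end of the string.
  const std::string::size_type end = contentType.find(';', pos);
  std::string value = (end == std::string::npos)
                          ? contentType.substr(pos)
                          : contentType.substr(pos, end - pos);

  // Drop trailing spaces, then optional surrounding double quotes.
  while (!value.empty() && value.back() == ' ') value.pop_back();
  if (value.size() >= 2 && value.front() == '"' && value.back() == '"')
    value = value.substr(1, value.size() - 2);
  return value;
}

// Hypothetical helper: parses a Content-Length value into `out`.
// Returns false if the value is empty or contains a non-digit character.
// Overflow of unsigned long long is not checked in this sketch.
bool parseContentLength(const std::string& value, unsigned long long& out) {
  if (value.empty()) return false;
  for (char c : value) {
    if (!std::isdigit(static_cast<unsigned char>(c))) return false;
  }
  out = std::strtoull(value.c_str(), nullptr, 10);
  return true;
}
```

With the map `m` from the example above, the value found under `Content-Type` would be passed to `extractBoundary` and the value under `Content-Length` to `parseContentLength`, after checking that `m.find(...)` did not return `m.end()`.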


1 Comment

Daniel Stenberg Over a year ago

As he uses libcurl, you know that he gets a single full header in every callback, which makes it much simpler to parse...
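
Because each header callback invocation carries one complete header line, the per-line work reduces to splitting at the first colon and trimming both halves. A minimal Boost-free sketch of that idea follows. `trim` and `splitHeaderLine` are hypothetical helper names, not libcurl functions, and the buffer is handled by explicit length because libcurl does not null-terminate it.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: strips leading and trailing spaces, tabs, CR and LF.
std::string trim(const std::string& s) {
  const char* whitespace = " \t\r\n";
  const std::string::size_type first = s.find_first_not_of(whitespace);
  if (first == std::string::npos) return "";
  const std::string::size_type last = s.find_last_not_of(whitespace);
  return s.substr(first, last - first + 1);
}

// Hypothetical helper: splits one header line ("Name: value\r\n") into its
// name and value. Returns false for lines without a usable name, such as
// the status line ("HTTP/1.1 200 OK") or the blank line that ends the headers.
bool splitHeaderLine(const char* data, std::size_t length,
                     std::string& name, std::string& value) {
  const std::string line(data, length);  // copy exactly `length` bytes
  const std::string::size_type colon = line.find(':');
  if (colon == std::string::npos) return false;
  name = trim(line.substr(0, colon));
  value = trim(line.substr(colon + 1));
  return !name.empty();
}
```

Inside a header callback, `splitHeaderLine` would be called with the data pointer (cast to `const char*`) and the product of the size and count arguments.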

jvandenbroek · Accepted Answer · 2017-11-03 16:17:14Z

I would put all headers in a map, which you can then easily iterate over. No Boost needed. Here is a basic working example with libcurl:

#include <iostream>
#include <string>
#include <map>
#include <curl/curl.h>

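// Header callback: libcurl calls this once for each complete header line.
static size_t OnReceiveData (void * pData, size_t tSize, size_t tCount, void * pmUser)
{
    const size_t length = tSize * tCount;
    const char *begin = static_cast<const char *>(pData);

    // Find the end of the line; the buffer is not null-terminated
    size_t index = 0;
    while (index < length && begin[index] != '\r' && begin[index] != '\n')
        index++;

    std::string str(begin, index);
    std::map<std::string, std::string>* pmHeader = static_cast<std::map<std::string, std::string>*>(pmUser);
    size_t pos = str.find(':');
    if (pos != std::string::npos)
    {
        // The whitespace after the colon is optional, so skip whatever is there
        size_t start = str.find_first_not_of(" \t", pos + 1);
        pmHeader->insert(std::pair<std::string, std::string> (str.substr(0, pos),
            start == std::string::npos ? std::string() : str.substr(start)));
    }

    // Report every byte as handled; any other value makes libcurl abort the transfer
    return length;
}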

int main(int argc, char* argv[])
{
    CURL *curl = curl_easy_init();
    if (!curl)
        return 1;

    std::map<std::string, std::string> mHeader;

    curl_easy_setopt(curl, CURLOPT_URL, "http://www.example.com");
    curl_easy_setopt(curl, CURLOPT_HEADERFUNCTION, OnReceiveData);
    curl_easy_setopt(curl, CURLOPT_HEADERDATA, &mHeader);
    curl_easy_setopt(curl, CURLOPT_NOBODY, 1L);  // this option takes a long
    curl_easy_perform(curl);
    curl_easy_cleanup(curl);

    std::map<std::string, std::string>::const_iterator itt;
    for (itt = mHeader.begin(); itt != mHeader.end(); itt++)
    {
        if (itt->first == "Content-Type" || itt->first == "Content-Length")
            std::cout << itt->first << ": " << itt->second << std::endl;
    }
}
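
One caveat with exact-match lookups such as `itt->first == "Content-Type"`: HTTP header field names are case-insensitive, so a server may legitimately send `content-length` instead. A possible workaround, sketched here as an assumption rather than as part of the answer above, is to give the map a case-insensitive comparator. `CaseInsensitiveLess` and `HeaderMap` are hypothetical names.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Hypothetical comparator: orders strings while ignoring ASCII letter case,
// so "Content-Length" and "content-length" count as the same map key.
struct CaseInsensitiveLess {
  bool operator()(const std::string& a, const std::string& b) const {
    return std::lexicographical_compare(
        a.begin(), a.end(), b.begin(), b.end(),
        [](unsigned char x, unsigned char y) {
          return std::tolower(x) < std::tolower(y);
        });
  }
};

// Hypothetical alias for a header map with case-insensitive keys.
using HeaderMap = std::map<std::string, std::string, CaseInsensitiveLess>;
```

If this type replaces `std::map<std::string, std::string>` in the example, the cast inside the callback has to use the same type, and a lookup then becomes `mHeader.find("Content-Length")` regardless of how the server capitalized the name.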

Marshall Clow · Accepted Answer · 2014-09-18 18:18:31Z

1

The cpp-netlib project (based on Boost) contains a full MIME parser (written with Boost.Spirit).

I'm not really that happy with the interface of the parser, but it works well.
