I am using curl to communicate with a server.

When I request data, I receive the HTTP headers followed by JPEG data, separated by a boundary string:

[image: screenshot of the raw response]

I need to parse out

  1. The boundary string
  2. The Content-Length.

I have copied the incoming data to a char array like so:

static size_t OnReceiveData ( void * pvData, size_t tSize, size_t tCount, void * pvUser )
{
    printf("%*.*s", tSize * tCount, tSize * tCount, pvData);

    char* _data;
    if(pvData != nullptr && 0 != tCount)
    {
        _data = new char[tCount];
       memcpy(_data, pvData, tCount);
    }

    return ( tCount );
}

How can I best do this in C++? How do I actually inspect and parse the _data array for the information that I want? Are there any Boost libraries that I can use, for example?

  • An answer not using boost or anything would be highly appreciated. Commented Oct 29, 2015 at 1:08

3 Answers

You could parse the headers on the fly, or put them into a map and post-process them later. Use the find and substr methods of std::string. Also look at the Boost String Algorithms Library, which contains many useful algorithms, e.g. trim.

For example, to place the headers into a std::map and print them (a rough cut):

#include <cstdlib>
#include <iostream>
#include <sstream>
#include <string>
#include <map>
#include <boost/algorithm/string.hpp>

int main(int argc, char* argv[]) {
  const char* s = "HTTP/1.1 200 OK\r\n"
    "Content-Type: image/jpeg; charset=utf-8\r\n"
    "Content-Length: 19912\r\n\r\n";

  std::map<std::string, std::string> m;

  std::istringstream resp(s);
  std::string header;
  std::string::size_type index;
  while (std::getline(resp, header) && header != "\r") {
    index = header.find(':', 0);
    if(index != std::string::npos) {
      m.insert(std::make_pair(
        boost::algorithm::trim_copy(header.substr(0, index)), 
        boost::algorithm::trim_copy(header.substr(index + 1))
      ));
    }
  }

  for(auto& kv: m) {
    std::cout << "KEY: `" << kv.first << "`, VALUE: `" << kv.second << '`' << std::endl;
  }

  return EXIT_SUCCESS;
}

You will get the output:

KEY: `Content-Length`, VALUE: `19912`
KEY: `Content-Type`, VALUE: `image/jpeg; charset=utf-8`

Having the headers, you could extract the required ones for post-processing.
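
As a minimal sketch of that post-processing step, assuming the header values have already been collected and trimmed as above, the two helpers below pull the boundary parameter out of a Content-Type value and convert a Content-Length value to a number using only the standard library. The names `extract_boundary` and `parse_content_length` are hypothetical (they do not come from Boost or libcurl), and the sketch assumes the parameter name is spelled in lower case as `boundary=` and that the boundary value itself contains no `;`.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the value of the "boundary" parameter of a
// Content-Type header value, or an empty string if the parameter is absent.
// Assumption: the parameter name is written in lower case.
std::string extract_boundary(const std::string& content_type) {
    const std::string key = "boundary=";
    std::string::size_type start = content_type.find(key);
    if (start == std::string::npos) return std::string();
    start += key.size();

    // The value runs up to the next ';' or to the end of the string.
    const std::string::size_type end = content_type.find(';', start);
    const std::string value = content_type.substr(
        start, end == std::string::npos ? std::string::npos : end - start);

    // Strip surrounding whitespace and optional double quotes.
    const char* junk = " \t\r\n\"";
    const std::string::size_type first = value.find_first_not_of(junk);
    if (first == std::string::npos) return std::string();
    const std::string::size_type last = value.find_last_not_of(junk);
    return value.substr(first, last - first + 1);
}

// Hypothetical helper: converts an already-trimmed Content-Length value.
// Returns false if the value is empty or contains a non-digit character.
// Overflow is not checked in this sketch.
bool parse_content_length(const std::string& value, std::size_t& out) {
    if (value.empty()) return false;
    std::size_t result = 0;
    for (char c : value) {
        if (c < '0' || c > '9') return false;
        result = result * 10 + static_cast<std::size_t>(c - '0');
    }
    out = result;
    return true;
}
```

With a map `m` filled as in the example above, the calls would be `extract_boundary(m["Content-Type"])` and `parse_content_length(m["Content-Length"], n)` for some `std::size_t n`.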

1 Comment

As he uses libcurl, you know that he gets a single full header in every callback, which makes it much simpler to parse...

I would put all the headers in a map, after which you can easily iterate through it. No Boost needed. Here is a basic working example with libcurl:

#include <iostream>
#include <string>
#include <map>
#include <curl/curl.h>

static size_t OnReceiveData (void * pData, size_t tSize, size_t tCount, void * pmUser)
{
    size_t length = tSize * tCount, index = 0;
    while (index < length)
    {
        unsigned char *temp = (unsigned char *)pData + index;
        if ((temp[0] == '\r') || (temp[0] == '\n'))
            break;
        index++;
    }

    std::string str((unsigned char*)pData, (unsigned char*)pData + index);
    std::map<std::string, std::string>* pmHeader = (std::map<std::string, std::string>*)pmUser;
    size_t pos = str.find(": ");
    if (pos != std::string::npos)
        pmHeader->insert(std::pair<std::string, std::string> (str.substr(0, pos), str.substr(pos + 2)));

    // Report every byte as handled; any other value makes libcurl abort the transfer.
    return length;
}

int main(int argc, char* argv[])
{
    CURL *curl = curl_easy_init();
    if (!curl)
        return 1;

    std::map<std::string, std::string> mHeader;

    curl_easy_setopt(curl, CURLOPT_URL, "http://www.example.com");
    curl_easy_setopt(curl, CURLOPT_HEADERFUNCTION, OnReceiveData);
    curl_easy_setopt(curl, CURLOPT_HEADERDATA, &mHeader);
    curl_easy_setopt(curl, CURLOPT_NOBODY, 1L);  // headers only; this option expects a long
    curl_easy_perform(curl);
    curl_easy_cleanup(curl);

    std::map<std::string, std::string>::const_iterator itt;
    for (itt = mHeader.begin(); itt != mHeader.end(); itt++)
    {
        if (itt->first == "Content-Type" || itt->first == "Content-Length")
            std::cout << itt->first << ": " << itt->second << std::endl;
    }
}
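
One caveat with splitting on ": " and comparing against "Content-Length" exactly: HTTP header names are case-insensitive and the whitespace after the colon is optional, so a server that sends `content-length:19912` would be missed. The sketch below shows one possible way to normalise a header line before storing it; `split_header_line` is a hypothetical helper name, not part of libcurl, and it could be called on the string built inside the callback above.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <utility>

// Hypothetical helper: splits one raw header line into a lower-cased name
// and a whitespace-trimmed value. Returns false for lines without a colon,
// such as the status line or the blank line that ends the headers.
bool split_header_line(const std::string& line,
                       std::pair<std::string, std::string>& out) {
    const std::string::size_type colon = line.find(':');
    if (colon == std::string::npos) return false;

    // Lower-case the name so lookups do not depend on the server's spelling.
    std::string name = line.substr(0, colon);
    std::transform(name.begin(), name.end(), name.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    // Trim spaces, tabs and the trailing CR/LF from the value.
    const char* ws = " \t\r\n";
    std::string value = line.substr(colon + 1);
    const std::string::size_type first = value.find_first_not_of(ws);
    if (first == std::string::npos) {
        value.clear();
    } else {
        const std::string::size_type last = value.find_last_not_of(ws);
        value = value.substr(first, last - first + 1);
    }

    out = std::make_pair(name, value);
    return true;
}
```

With this normalisation the map keys are always lower case, so the lookups at the end of main would compare against "content-type" and "content-length" instead.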

The cpp-netlib project (based on boost) contains a full MIME parser (written with boost.spirit).

I'm not really that happy with the interface of the parser, but it works well.
