I'm using a regex to separate the fields of an HTTP request:

GET /index.asp?param1=hello&param2=128 HTTP/1.1

This way:

smatch m;
try 
{ 
    regex re1("(GET|POST) (.+) HTTP"); 
    regex_search(query, m, re1); 
} 
catch (regex_error e) 
{ 
    printf("Regex 1 Error: %d\n", e.code()); 
}
string method = m[1]; 
string path = m[2];

try 
{ 
    regex re2("/(.+)?\\?(.+)?"); 
    if (regex_search(path, m, re2)) 
    { 
        document = m[1]; 
        querystring = m[2];
    }
} 
catch (regex_error e) 
{ 
    printf("Regex 2 Error: %d\n", e.code()); 
}

Unfortunately this code works in MSVC but not with GCC 4.8.2 (which I have on Ubuntu Server 14.04). Can you suggest a different method of splitting that string, perhaps using plain std::string operations?

I don't know how to split the URL into its different elements, since the query string separator '?' may or may not be present in the string.

  • You could consider using the boost regex library ( boost.org/doc/libs/1_57_0/libs/regex/doc/html/index.html ) Commented Feb 1, 2015 at 22:01
  • Please give some more input. Your code cannot work if (what you mentioned) the query string separator is missing. So what are the inputs on both platforms? @Christophe: always pointing to boost may not always be a good hint. Commented Feb 1, 2015 at 22:04
  • @St0fF Sorry, but his code can work: I cut and pasted it into MSVC2013 and got the expected results (method="GET", document="index.asp", querystring="param1=hello&param2=128" ) Commented Feb 1, 2015 at 22:17
  • @MarkMiles Could you please tell us what doesn't work ? I tested your code on ideone ( ideone.com/QZqQM1 ) and it also returned correct results (with gcc 4.9.2) Commented Feb 1, 2015 at 22:20
  • As I wrote in my post: "Unfortunately this code works in MSVC but not with GCC 4.8.2". It must work on Ubuntu 14.04. If I do #gcc --version it says it's 4.8.2 but if I do apt search gcc-4.9 I get gcc-4.9-base/trusty,now 4.9-20140406-0ubuntu1 armhf [installed] GCC, the GNU Compiler Collection (base package) so I don't know how to update my gcc. Commented Feb 1, 2015 at 22:33

3 Answers

7

You might use std::istringstream to parse this:

#include <iostream>
#include <map>
#include <sstream>
#include <string>

int main()
{
    std::string request = "GET /index.asp?param1=hello&param2=128 HTTP/1.1";

    // separate the 3 main parts

    std::istringstream iss(request);

    std::string method;
    std::string query;
    std::string protocol;

    if(!(iss >> method >> query >> protocol))
    {
        std::cout << "ERROR: parsing request\n";
        return 1;
    }

    // reset the std::istringstream with the query string

    iss.clear();
    iss.str(query);

    std::string url;

    if(!std::getline(iss, url, '?')) // remove the URL part
    {
        std::cout << "ERROR: parsing request url\n";
        return 1;
    }

    // store query key/value pairs in a map
    std::map<std::string, std::string> params;

    std::string keyval, key, val;

    while(std::getline(iss, keyval, '&')) // split each term
    {
        std::istringstream kv(keyval); // separate stream, so the outer iss is not shadowed

        // split key/value pairs
        if(std::getline(std::getline(kv, key, '='), val))
            params[key] = val;
    }

    std::cout << "protocol: " << protocol << '\n';
    std::cout << "method  : " << method << '\n';
    std::cout << "url     : " << url << '\n';

    for(auto const& param: params)
        std::cout << "param   : " << param.first << " = " << param.second << '\n';
}

Output:

protocol: HTTP/1.1
method  : GET
url     : /index.asp
param   : param1 = hello
param   : param2 = 128

1 Comment

I accept this as the answer because it also solves the problem of splitting the query string, thanks!
2

The reason it's not working with gcc 4.8.2 is that regex_search is not implemented in the libstdc++ that ships with it. If you look inside regex.h, here is what you get:

template<typename _Bi_iter, typename _Alloc,
    typename _Ch_type, typename _Rx_traits>
inline bool
regex_search(_Bi_iter __first, _Bi_iter __last,
        match_results<_Bi_iter, _Alloc>& __m,
        const basic_regex<_Ch_type, _Rx_traits>& __re,
        regex_constants::match_flag_type __flags
        = regex_constants::match_default)
{ return false; }

Use regex_match instead, which is implemented. You would have to modify your regex (e.g., add .* before and after), because regex_match must match the entire string.
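
The regex_match change could look roughly like the sketch below. It assumes the request line from the question, and the helper name parse_request_line is hypothetical. Only a trailing .* is added, because the request line already starts with the method. The sketch targets a conforming std::regex; how it behaves on GCC 4.8.2 itself has not been checked here.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: extracts the method and path from a request line.
// regex_match has to consume the whole string, so ".*" is appended to
// swallow the protocol version that regex_search would simply skip.
bool parse_request_line(const std::string& query,
                        std::string& method, std::string& path)
{
    static const std::regex re("(GET|POST) (.+) HTTP.*");
    std::smatch m;
    if (!std::regex_match(query, m, re))
        return false;          // not a GET/POST request line
    method = m[1].str();
    path = m[2].str();
    return true;
}
```

Checking the return value before reading the captures avoids using an empty match, which the original code does when the search fails.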

Alternatives:

  1. Upgrade to gcc 4.9
  2. Use boost::regex instead
  3. Switch to LLVM and libc++ (my preference).

1

If you want to avoid the use of regex, you can use standard string operations:

string query = "GET /index.asp?param1=hello&param2=128 HTTP/1.1";
string method, path, document, querystring;

try {
    if (query.substr(0, 5) == "GET /")   // First check the method at the beginning 
        method = "GET";
    else if (query.substr(0, 6) == "POST /")
        method = "POST";
    else  // std::runtime_error (<stdexcept>) is portable; std::exception taking a string is MSVC-only
        throw std::runtime_error("Regex 1 Error: no valid method or missing /");

    path = query.substr(method.length() + 2);  // take the rest, ignoring whitespace and slash
    size_t ph = path.find(" HTTP");     // find the end of the url
    if (ph == string::npos)            // if it's not found => error
        throw std::runtime_error("Regex 2 Error: no HTTP version found");  // needs <stdexcept>
    else path.resize(ph);             // otherwise get rid of the end of the string 

    size_t pq = path.find("?");      // look for the ? 
    if (pq == string::npos) {        // if it's absent, document is the whole string
        document = path;
        querystring = "";
    }
    else {                          // otherwise cut into 2 parts 
        document = path.substr(0, pq);
        querystring = path.substr(pq + 1);
    }

    cout << "method:     " << method << endl
        << "document:   " << document << endl
        << "querystring:" << querystring << endl;
}
catch (const std::exception &e) {
    cout << e.what();
}

Of course, this code is not as nice as your original regex-based one, so treat it as a workaround for when you cannot use an up-to-date version of the compiler.
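
The same find-based split can also be packaged as a function that reports failure through its return value, which makes the "no '?' present" case from the question explicit. This is only a sketch: RequestLine and split_request_line are hypothetical names, and it assumes a single space between the three parts of the request line.

```cpp
#include <string>

// Hypothetical result type for the pieces the question asks for.
struct RequestLine {
    std::string method;
    std::string document;
    std::string querystring;  // empty when the path contains no '?'
};

// Splits "METHOD /path?query HTTP/x.y" using only std::string operations.
// Returns false if the line does not have the expected shape; in that
// case the contents of out are unspecified.
bool split_request_line(const std::string& line, RequestLine& out)
{
    const std::size_t sp1 = line.find(' ');            // end of the method
    if (sp1 == std::string::npos)
        return false;
    const std::size_t sp2 = line.find(" HTTP", sp1 + 1); // end of the path
    if (sp2 == std::string::npos)
        return false;

    out.method = line.substr(0, sp1);
    if (out.method != "GET" && out.method != "POST")
        return false;

    const std::string path = line.substr(sp1 + 1, sp2 - sp1 - 1);
    if (path.empty() || path[0] != '/')
        return false;

    const std::size_t q = path.find('?');
    if (q == std::string::npos) {        // no query string at all
        out.document = path.substr(1);   // drop the leading '/'
        out.querystring.clear();
    } else {                             // cut into two parts at the '?'
        out.document = path.substr(1, q - 1);
        out.querystring = path.substr(q + 1);
    }
    return true;
}
```

As in the code above, the leading slash is dropped from the document, so "/index.asp?param1=hello" yields "index.asp" and "param1=hello".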
