2

What's the best approach to parsing std::string to some numeric type in C++, when the target type isn't known in advance?

I've looked at lexical_cast, but that takes the target type as a template parameter. I could write wrapper functions that abuse this by catching bad_lexical_cast and returning false, but that seems ugly.

My input values will typically be int or float and have extremely simple formatting, but something that's flexible would be great!

    I don't think catching the exception is abuse, btw, but I wouldn't write a wrapper per type that returns false, rather have a single function that tries all permitted formats. An exception is a perfectly reasonable way to indicate that a string doesn't match an expected format. It need never propagate outside the function that knows that it doesn't know the proper format, and is "taking risks" by trying a few different formats in turn. Commented Feb 20, 2012 at 10:44
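The single-function idea from the comment above can be sketched with the standard library alone. This is a minimal sketch, not the commenter's code: it swaps `boost::lexical_cast` for `std::stoll` and `std::stod` to stay self-contained, and the names `parsed_kind`, `parsed_number` and `parse_number` are hypothetical. The same shape should work with `lexical_cast` by catching `boost::bad_lexical_cast` instead.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical names for illustration; not part of Boost or the standard library.
enum class parsed_kind { none, integer, floating };

struct parsed_number {
    parsed_kind kind        = parsed_kind::none;
    long long   as_integer  = 0;
    double      as_floating = 0.0;
};

// Try each permitted format in turn. Exceptions thrown by the conversion
// functions are caught here and never propagate to the caller.
parsed_number parse_number(const std::string& text)
{
    parsed_number result;
    std::size_t used = 0;

    try {
        const long long i = std::stoll(text, &used);
        if (used == text.size()) {          // whole string consumed: an integer
            result.kind = parsed_kind::integer;
            result.as_integer = i;
            return result;
        }
    } catch (const std::invalid_argument&) {
        // not an integer; fall through to the next format
    } catch (const std::out_of_range&) {
        // too large for long long; it may still fit in a double
    }

    try {
        const double d = std::stod(text, &used);
        if (used == text.size()) {          // whole string consumed: a float
            result.kind = parsed_kind::floating;
            result.as_floating = d;
            return result;
        }
    } catch (const std::invalid_argument&) {
    } catch (const std::out_of_range&) {
    }

    return result;                          // kind stays parsed_kind::none
}
```

A caller would inspect `parse_number(s).kind` and read `as_integer` or `as_floating` accordingly; input with trailing characters, such as `"12abc"`, is reported as `parsed_kind::none`.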

2 Answers

3

You could use either the Boost Spirit numeric parsers or (ab)use Boost Lexical_Cast.

Boost Spirit gives you fine-grained control over the accepted format, see e.g.

Here is a quick demo that also shows how you could try several possible numeric input formats in turn and return the type that matched. That may be overkill, but it should demonstrate how to take Spirit further.

The demo also shows how to advance the input iterator so you can easily continue parsing where the numeric input ended.

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <iostream>   // std::cout, std::cerr
#include <string>     // std::string
namespace qi = boost::spirit::qi;

enum numeric_types
{
    fmt_none,
    fmt_float,
    fmt_double,
    fmt_uint,
    fmt_int,
    // fmt_hex, etc. 
};

template <typename It>
    bool is_numeric(It& f, It l, numeric_types& detected)
{
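    // Record the matched format through a Phoenix reference to `detected`;
    // qi::_val refers to the attribute of an enclosing qi::rule, and there is none here.
    namespace phx = boost::phoenix;
    return qi::phrase_parse(f,l,
            qi::uint_   [ phx::ref(detected) = fmt_uint   ]
          | qi::int_    [ phx::ref(detected) = fmt_int    ]
          | qi::float_  [ phx::ref(detected) = fmt_float  ]
          | qi::double_ [ phx::ref(detected) = fmt_double ]
           ,qi::space);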
}

template <typename It>
    bool is_numeric(It& f, It l)
{
    numeric_types detected = fmt_none;
    return is_numeric(f, l, detected);
}

int main()
{
    const std::string input = "124, -25, 582";
    std::string::const_iterator it = input.begin();

    bool ok = is_numeric(it, input.end());

    if (ok)   
    {
        std::cout << "parse success\n";
        if (it!=input.end()) 
            std::cerr << "trailing unparsed: '" << std::string(it,input.end()) << "'\n";
    }
    else 
        std::cerr << "parse failed: '" << std::string(it,input.end()) << "'\n";

    return ok? 0 : 255;
}

2 Comments

One note: using Qi you can even go one step further. Using a boost::variant<unsigned,int,float,double> you could not only detect the type, but also store the parsed result in one go. I don't quite remember how to hook it up, but alternatives and variants can definitely be combined :)
@MatthieuM. I do, but that was not the question, as I read it. In fact it would probably be better to just use qi::double_ as it should match the other options, normally. And store the parser in a static. And wrap it in qi::wrap... But I'm getting ahead
1

When you actually parse the data to convert it, you need to know the type in which to put the results; C++ is a statically typed language, and there's no way around that. If you have a string, and want to know what type it is, using regular expressions is a simple solution:

"\\s*[+-]?(?:"
    "\\d+\\.\\d*(?:[Ee][+-]?\\d+)?"
    "|\\.\\d+(?:[Ee][+-]?\\d+)?"
    "|\\d+[Ee][+-]?\\d+"
")"

should match any possible floating point value, and:

"\\s*[+-]?(?:"
    "[1-9][0-9]*"
    "|0[0-7]*"
    "|0x[0-9a-fA-F]+"
")"

matches an integer in decimal, octal, or hexadecimal notation. (Supposing the default configuration of Boost or the C++11 regular expressions.)
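As a rough illustration of how the two patterns above could be used to choose a target type, here is a minimal sketch with C++11 `std::regex`, whose default syntax is ECMAScript. The names `numeric_kind` and `classify` are hypothetical, and the sketch assumes the whole string must be a single number.

```cpp
#include <regex>
#include <string>

// Hypothetical names for illustration.
enum class numeric_kind { none, integer, floating };

numeric_kind classify(const std::string& text)
{
    // The same two patterns as above, compiled once on first use.
    static const std::regex floating_re(
        "\\s*[+-]?(?:"
            "\\d+\\.\\d*(?:[Ee][+-]?\\d+)?"
            "|\\.\\d+(?:[Ee][+-]?\\d+)?"
            "|\\d+[Ee][+-]?\\d+"
        ")");
    static const std::regex integer_re(
        "\\s*[+-]?(?:"
            "[1-9][0-9]*"
            "|0[0-7]*"
            "|0x[0-9a-fA-F]+"
        ")");

    // regex_match only succeeds if the entire string matches,
    // so input with trailing characters is rejected.
    if (std::regex_match(text, integer_re))  return numeric_kind::integer;
    if (std::regex_match(text, floating_re)) return numeric_kind::floating;
    return numeric_kind::none;
}
```

With this sketch, `classify("0x1F")` yields `numeric_kind::integer`, `classify("3.14")` yields `numeric_kind::floating`, and `classify("08")` yields `numeric_kind::none`, because `8` is not an octal digit and the floating point pattern requires a decimal point or an exponent.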

2 Comments

And be careful: those regexes also match strings that are not valid values (due to overflow and so on), so the parser you end up using needs to validate anyway.
@SteveJessop Yes. Once you've established what you want to convert to, you have to check that the conversion also worked. The regular expressions only allow choosing the target type, and syntax checking; they can't help with semantic verification.
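The semantic check discussed in these comments could look roughly like the following for the integer case. This is a minimal sketch, assuming `long` is the chosen target type and using `std::strtol` with `errno` to detect overflow; the helper name `to_long_checked` is hypothetical.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>

// Hypothetical helper: returns true and stores the value only if `text`
// is consumed completely and the result fits in a long.
bool to_long_checked(const std::string& text, long& out)
{
    errno = 0;
    char* end = nullptr;
    // Base 0 makes strtol accept decimal, octal (leading 0) and hex (leading 0x).
    const long value = std::strtol(text.c_str(), &end, 0);

    if (end == text.c_str()) return false;  // no digits were converted
    if (*end != '\0')        return false;  // trailing characters remain
    if (errno == ERANGE)     return false;  // value does not fit in a long

    out = value;
    return true;
}
```

A string such as `"99999999999999999999999"` matches the integer pattern syntactically but is rejected here, which is the overflow case the comments warn about.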
