2

I have a string like this:

"\r color=\"red\" name=\"Jon\" \t\n depth=\"8.26\" "

And I want to parse this string and create a std::list of this object:

class data
{
    std::string name;
    std::string value;
};

Where for example:

name = color
value = red

What is the fastest way? I can use boost.

EDIT:

This is what I've tried:

vector<string> tokens;
split(tokens, str, is_any_of(" \t\f\v\n\r"));

if(tokens.size() > 1)
{
    list<data> attr;
    for_each(tokens.begin(), tokens.end(), [&attr](const string& token)
        {
            if(token.empty() || !contains(token, "="))
                return;

            vector<string> tokens;
            split(tokens, token, is_any_of("="));
            erase_all(tokens[1], "\"");
            attr.push_back(data(tokens[0], tokens[1]));
        }
    );
}

But it does not work when a quoted value contains spaces, e.g. color="red 1".

  • Fastest to write, fastest to compile, or fastest at runtime? Commented Jul 2, 2012 at 20:01
  • Fastest to gain self awareness? Commented Jul 2, 2012 at 20:03
  • I'm not inclined to write the actual code for a homework answer, but if it were me, I'd use Boost.Xpressive or Boost.Spirit.Qi. Commented Jul 2, 2012 at 20:08
  • Simply put, it entails work. What have you tried? Since you can use boost, use Boost.Tokenizer. You may have to preprocess your input string to clean off the escape sequences though. Commented Jul 2, 2012 at 20:16
  • @Nick: Answering that would be writing the actual code for you. ;-] Commented Jul 2, 2012 at 20:22

2 Answers

1

Assuming that there is always at least one white-space character before each name, I think the following algorithm is fast enough:

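list<data> l;
size_t fn, fv, lv = 0;

// fv: opening quote of the value, lv: closing quote of the value
while((fv = str.find('"', lv + 1)) != string::npos &&
    (lv = str.find('"', fv + 1)) != string::npos)
{
    // fn: last white-space before the name; the '=' sits at fv - 1
    fn = str.find_last_of(" \t\n\v\f\r", fv);
    // No ++ inside the argument list: mixing ++fn with a read of fn
    // in the same call has no guaranteed evaluation order.
    l.push_back(data(str.substr(fn + 1, fv - fn - 2),
                     str.substr(fv + 1, lv - fv - 1)));
}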

Where str is your std::string and data has a constructor of this type:

data(string name, string value)
    : name(name), value(value)
{   }

As you can see, there is no need for Boost or regex here; the member functions of std::string are enough.
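
For anyone who wants to try the loop above in isolation, here is a minimal self-contained sketch of the same find-based idea. The function name parse_attributes is hypothetical, data is assumed to be a struct with public members and the two-argument constructor shown above, and the input is assumed to be well formed (name="value" pairs with no escaped quotes inside the values).

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <utility>

// Assumed layout: public members plus the two-argument constructor.
struct data
{
    std::string name;
    std::string value;

    data(std::string n, std::string v)
        : name(std::move(n)), value(std::move(v))
    {   }
};

// Hypothetical helper: collects every name="value" pair found in str.
// Assumes the '=' sits directly before the opening quote.
std::list<data> parse_attributes(const std::string& str)
{
    std::list<data> result;
    std::size_t pos = 0;

    while (true)
    {
        // Opening and closing quote of the next value.
        const std::size_t open = str.find('"', pos);
        if (open == std::string::npos) break;
        const std::size_t close = str.find('"', open + 1);
        if (close == std::string::npos) break;

        // The name starts right after the last white-space before the quote.
        const std::size_t ws = str.find_last_of(" \t\n\v\f\r", open);
        const std::size_t begin = (ws == std::string::npos) ? 0 : ws + 1;

        if (open > begin)  // skip a quote that has nothing in front of it
            result.push_back(data(str.substr(begin, open - begin - 1),
                                  str.substr(open + 1, close - open - 1)));

        pos = close + 1;  // continue after the closing quote
    }
    return result;
}
```

Because the name is located by searching backwards from the opening quote, spaces inside a quoted value (such as color="red 1") do not affect the result.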


1 Comment

Pedantic, but as of C++11, regex is part of the standard library. :-]
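
To illustrate that point, a rough sketch of the same parsing with std::regex might look like the following. The function name parse_with_regex and the pattern are assumptions made for this example, data is assumed to be a plain struct with public members, and escaped quotes inside values are not handled.

```cpp
#include <list>
#include <regex>
#include <string>

// Assumed layout: a plain aggregate with public members.
struct data
{
    std::string name;
    std::string value;
};

// Hypothetical helper: extracts name="value" pairs with std::regex.
std::list<data> parse_with_regex(const std::string& str)
{
    // Group 1: the name (no white-space, '=' or quote characters).
    // Group 2: the value (everything up to the next double quote).
    static const std::regex pattern("([^ \t\n\v\f\r=\"]+)=\"([^\"]*)\"");

    std::list<data> result;
    for (std::sregex_iterator it(str.begin(), str.end(), pattern), end;
         it != end; ++it)
    {
        result.push_back(data{(*it)[1].str(), (*it)[2].str()});
    }
    return result;
}
```

This is the shortest of the approaches to write, but std::regex is often slower at runtime than hand-written find calls, so it is worth measuring if speed matters.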
0

After your edit: to handle spaces inside quotes, you could do the following.

(Replace every space that is not inside " " quotes with a \n.)

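void PrepareForTokanization(std::string &str)
{
    // An odd count means we are currently inside a quoted value.
    int quoteCount = 0;
    const std::size_t strLen = str.length();
    for(std::size_t i=0; i<strLen; ++i){
        // Count only quotes that are not escaped with a backslash.
        if (str[i] == '"' && (i==0 || (str[i-1] != '\\')))
            quoteCount++;
        // Spaces outside quotes become '\n' so split can break on them.
        if(str[i] == ' ' && quoteCount%2 == 0)
            str[i] = '\n';
    }
}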

Then, before you call split, prepare the string and remove the space character from the delimiter set passed to is_any_of:

PrepareForTokanization(str);
split(tokens, str, is_any_of("\t\f\v\n\r"));
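
To try the idea without Boost, here is a minimal standard-library sketch. The names prepare_for_tokenization and split_tokens are hypothetical; split_tokens is only a stand-in for boost::split with is_any_of, and unlike boost::split in its default mode it drops empty tokens.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Replaces every space outside double quotes with '\n'.
// A quote preceded by a backslash is treated as escaped and ignored.
void prepare_for_tokenization(std::string& str)
{
    bool inQuotes = false;
    for (std::size_t i = 0; i < str.size(); ++i)
    {
        if (str[i] == '"' && (i == 0 || str[i - 1] != '\\'))
            inQuotes = !inQuotes;
        else if (str[i] == ' ' && !inQuotes)
            str[i] = '\n';
    }
}

// Hypothetical stand-in for boost::split: splits on any character in
// delims and skips empty tokens.
std::vector<std::string> split_tokens(const std::string& str,
                                      const std::string& delims)
{
    std::vector<std::string> tokens;
    std::size_t start = str.find_first_not_of(delims);
    while (start != std::string::npos)
    {
        const std::size_t end = str.find_first_of(delims, start);
        tokens.push_back(str.substr(start, end - start));
        start = str.find_first_not_of(delims, end);
    }
    return tokens;
}
```

After calling prepare_for_tokenization(str), split_tokens(str, "\t\f\v\n\r") yields whole tokens such as color="red 1", which can then be split at the '=' as in the question's code. Note that a tab or newline inside a quoted value would still split the token.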
