
I have a Visual Studio 2008 C++ project where I need to parse a string into a structure of C-style character arrays. What is the most elegant and efficient way of doing this?

Here is my current (functioning) solution:

struct Foo {
    char a[ MAX_A ];
    char b[ MAX_B ];
    char c[ MAX_C ];
    char d[ MAX_D ];
};

void Func( const Foo& foo );

std::string input = "abcd@efgh@ijkl@mnop";
std::vector< std::string > parsed;
boost::split( parsed, input, boost::is_any_of( "@" ) );

Foo foo = { 0 };
// boost::split yields four tokens here, so the valid indices are 0..3
parsed[ 0 ].copy( foo.a, MAX_A );
parsed[ 1 ].copy( foo.b, MAX_B );
parsed[ 2 ].copy( foo.c, MAX_C );
parsed[ 3 ].copy( foo.d, MAX_D );

Func( foo );
  • I don't see anything wrong with this solution except that if you will be frequently adding to your structure then it may be a maintenance pain as the assignment to foo.a, foo.b etc. doesn't follow the DRY principle. Do you think you will be frequently adding to this structure? Commented Mar 16, 2012 at 20:11
  • I don't think I'll be adding to it often. I'm mostly concerned with the number of times I copy those strings. Is there a way to minimize that? Commented Mar 16, 2012 at 20:25
  • I don't think so. If you had a single string to copy from, you could simply have copied it to the structure's beginning address. That would have worked in your case, since the structure contains only chars and they are single-byte aligned, although I wouldn't recommend it because I think it's ugly. But you can't do it anyway, since you are copying from a vector and have to reference each string one by one. I think what you have is OK. Commented Mar 16, 2012 at 20:30
  • Have you considered placing the whole thing into a separate function, e.g. Foo makeFooFromString(std::string &input)? Commented Mar 16, 2012 at 21:24
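
The last comment's suggestion, combined with the concern about the number of copies, could look roughly like the sketch below. It skips the intermediate vector and copies each token straight from the input into its field. The field sizes, the helper name copyField, and the choice to truncate to N - 1 characters so that every field stays null-terminated are assumptions, not something stated in the question. It uses only C++03 features, so it should be usable from VS2008.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical field sizes; the question does not give the real values.
enum { MAX_A = 8, MAX_B = 8, MAX_C = 8, MAX_D = 8 };

struct Foo {
    char a[MAX_A];
    char b[MAX_B];
    char c[MAX_C];
    char d[MAX_D];
};

// Copies the token starting at `pos` into `buf`, truncating to N - 1
// characters so the zero-initialised buffer stays null-terminated.
// Returns the position just past the '@' delimiter, or std::string::npos
// when no further delimiter exists.
template <std::size_t N>
std::string::size_type copyField(const std::string& input,
                                 std::string::size_type pos,
                                 char (&buf)[N])
{
    if (pos == std::string::npos)
        return pos;  // input exhausted: leave the field empty

    const std::string::size_type end = input.find('@', pos);
    const std::string::size_type len =
        (end == std::string::npos ? input.size() : end) - pos;

    // Single copy: input buffer -> destination field.
    input.copy(buf, std::min<std::string::size_type>(len, N - 1), pos);
    return end == std::string::npos ? end : end + 1;
}

Foo makeFooFromString(const std::string& input)
{
    Foo foo = { {0} };
    std::string::size_type pos = 0;
    pos = copyField(input, pos, foo.a);
    pos = copyField(input, pos, foo.b);
    pos = copyField(input, pos, foo.c);
    pos = copyField(input, pos, foo.d);
    return foo;
}
```

With this in place the call site reduces to Func( makeFooFromString( input ) ). Fields for which the input has no token are left zero-filled.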

3 Answers


Here is my (now tested) idea:

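#include <algorithm>  // std::min
#include <cstddef>    // std::ptrdiff_t
#include <cstring>    // strchr, strncpy
#include <iostream>   // std::cout (used by the test code)
#include <string>
#include <vector>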

#define MAX_A 40
#define MAX_B 3
#define MAX_C 40
#define MAX_D 4

struct Foo {
    char a[ MAX_A ];
    char b[ MAX_B ];
    char c[ MAX_C ];
    char d[ MAX_D ];
};

template <std::ptrdiff_t N>
const char* extractToken(const char* inIt, char (&buf)[N])
{
    if (!inIt || !*inIt)
        return NULL;

    const char* end = strchr(inIt, '@');
    if (end)
    {
        strncpy(buf, inIt, std::min(N, end-inIt));
        return end + 1;
    } 
    strncpy(buf, inIt, N);
    return NULL;
}

int main(int argc, const char *argv[])
{
    std::string input = "abcd@efgh@ijkl@mnop";

    Foo foo = { 0 };

    const char* cursor = input.c_str();
    cursor = extractToken(cursor, foo.a);
    cursor = extractToken(cursor, foo.b);
    cursor = extractToken(cursor, foo.c);
    cursor = extractToken(cursor, foo.d);
}

[Edit] Tests

Adding a little test code

template <std::ptrdiff_t N>
std::string display(const char (&buf)[N])
{
    std::string result;
    for(size_t i=0; i<N && buf[i]; ++i)
       result += buf[i];
    return result; 
}

int main(int argc, const char *argv[])
{
    std::string input = "abcd@efgh@ijkl@mnop";

    Foo foo = { 0 };

    const char* cursor = input.c_str();
    cursor = extractToken(cursor, foo.a);
    cursor = extractToken(cursor, foo.b);
    cursor = extractToken(cursor, foo.c);
    cursor = extractToken(cursor, foo.d);

    std::cout << "foo.a: '" << display(foo.a) << "'\n";
    std::cout << "foo.b: '" << display(foo.b) << "'\n";
    std::cout << "foo.c: '" << display(foo.c) << "'\n";
    std::cout << "foo.d: '" << display(foo.d) << "'\n";
}

Outputs

foo.a: 'abcd'
foo.b: 'efg'
foo.c: 'ijkl'
foo.d: 'mnop'

See it Live on http://ideone.com/KdAhO
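
Note that foo.b prints as 'efg' even though MAX_B is 3: strncpy fills all three bytes and leaves no room for a terminator, which is why display() bounds its loop by N. As a small sketch of the same bounded read, the helper below (fieldToString is a hypothetical name, not part of the code above) builds the string in one step instead of appending character by character.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Reads a fixed-width field that may or may not be null-terminated.
// `fieldToString` is a hypothetical helper name used for illustration.
template <std::size_t N>
std::string fieldToString(const char (&buf)[N])
{
    // Stop at the first '\0' or at the end of the array, whichever comes first.
    return std::string(buf, std::find(buf, buf + N, '\0'));
}
```

Passing such a field to anything that expects a C string (strlen, printf with %s) would read past the end of the array, so a bounded read like this is needed whenever a token can be as long as its buffer.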



What about redesigning Foo?

struct Foo {
  std::array<std::string, 4> abcd;
  std::string a() const { return abcd[0]; }
  std::string b() const { return abcd[1]; }
  std::string c() const { return abcd[2]; }
  std::string d() const { return abcd[3]; }
};


boost::algorithm::split_iterator<std::string::iterator> end,
    it = boost::make_split_iterator(input, boost::algorithm::first_finder("@"));
std::transform(it, end, foo.abcd.begin(),
               boost::copy_range<std::string, decltype(*it)>);

1 Comment

What about inventing a whole other question and then answering that...?

Using a regex would look like this (in C++11; you can translate it to Boost or TR1 for VS2008):

// Assuming MAX_A...MAX_D are all 10 in our regex

std::cmatch res;
if(std::regex_match(input.data(),input.data()+input.size(),
                    res,
                    std::regex("([^@]{0,10})@([^@]{0,10})@([^@]{0,10})@([^@]{0,10})")))
{
    Foo foo = {};
    std::copy(res[1].first,res[1].second,foo.a);
    std::copy(res[2].first,res[2].second,foo.b);
    std::copy(res[3].first,res[3].second,foo.c);
    std::copy(res[4].first,res[4].second,foo.d);
}

You should probably create the pattern using a format string and the actual MAX_* variables rather than hard coding the values in the regex like I did here, and you might also want to compile the regex once and save it instead of recreating it every time.

But otherwise, this method avoids making any extra copies of the string data. The char pointers held in each submatch of res point directly into the input string's buffer, so the only copy is the one from the input string to the final foo object.
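
As a rough sketch of the two suggestions above (building the pattern from the MAX_* values and compiling it only once), something like the following could work in C++11. The names buildPattern and parseFoo and the field sizes are hypothetical; std::ostringstream stands in for the "format string", and a function-local static holds the compiled regex.

```cpp
#include <algorithm>
#include <cstddef>
#include <regex>
#include <sstream>
#include <string>

// Hypothetical field sizes; the answer assumes 10 for all four.
enum { MAX_A = 10, MAX_B = 10, MAX_C = 10, MAX_D = 10 };

struct Foo {
    char a[MAX_A];
    char b[MAX_B];
    char c[MAX_C];
    char d[MAX_D];
};

// Builds "([^@]{0,A})@([^@]{0,B})@([^@]{0,C})@([^@]{0,D})" from the sizes.
std::string buildPattern()
{
    const int sizes[] = { MAX_A, MAX_B, MAX_C, MAX_D };
    std::ostringstream out;
    for (std::size_t i = 0; i < 4; ++i) {
        if (i != 0)
            out << '@';  // literal delimiter between capture groups
        out << "([^@]{0," << sizes[i] << "})";
    }
    return out.str();
}

// Returns true and fills `foo` when `input` consists of exactly four
// '@'-separated fields that each fit in their buffer; false otherwise.
bool parseFoo(const std::string& input, Foo& foo)
{
    // Function-local static: the regex is compiled on first use only.
    static const std::regex pattern(buildPattern());

    std::cmatch res;
    const char* first = input.data();
    if (!std::regex_match(first, first + input.size(), res, pattern))
        return false;

    Foo parsed = {};
    // Each submatch points into `input`, so this is the only copy made.
    std::copy(res[1].first, res[1].second, parsed.a);
    std::copy(res[2].first, res[2].second, parsed.b);
    std::copy(res[3].first, res[3].second, parsed.c);
    std::copy(res[4].first, res[4].second, parsed.d);
    foo = parsed;
    return true;
}
```

Unlike the truncating approaches, regex_match rejects the whole input when any field is too long or the field count is wrong, which may or may not be the behaviour you want. A field of exactly MAX_* characters fills its buffer with no terminator; use MAX_* - 1 as the upper bound if the fields must be null-terminated.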
