
First, here are two example strings:

string_position = ("\"%s\";\"%s\";\"%s\";\"\";\"%s\"\r\n\"%s\";\"%s\";\"%s\";\"%s - %s\";\"%s\";\"%.0f\";\"FR\";\"%.2f\";\"%.2f\";\"%.2f\";\"%s\";\"%s\";\"%s\";\"%s\";\"\";\"\";\"\";\"\";\"\";\"\";\"\";\"\";\"\";\"\";\"\";\"\";\"\";\"\";\"B\"\r\n",POSNR_NR_ID,POSNR_NR_ID,POSNR,POSNR_NR_ID,ARTNR_NR_ID,POSNR_NR_ID,CP90NAME,TEXT1,TEXT2,ARTNR_NR_ID,CNT,WIDTH,HEIGHT,DEPTH,INFO1,INFO2,INFO3,INFO4)

string_position = ("STK_PD_BEZ|%s|STK_ID|%s|STK_EBENE|0|ID|%s\r\nSTK_ID|%s|ORDERPOS|%s|STK_EBENE|1|STK_PD_BEZ|%s|STK_FLAENGE|%.2f|STK_FBREITE|%.2f|STK_FDICKE|%.2f|ID|%s|PARENTID|%s\r\n",POSNR,ORDERID,POSNR_NR_ID,ORDERID,POSSTR,CP90NAME,WIDTH,DEPTH,HEIGHT,ARTNR_NR_ID,POSNR_NR_ID)

I want to parse those strings, but I don't know where to start. As a result, I want two arrays for each string, for example (string 2):

array_a[0] = STK_PD_BEZ|%s;
array_b[0] = POSNR;

array_a[1] = STK_ID|%s;
array_b[1] = ORDERID;

etc...

I hope you understand my problem. I have to find the matching "variable" for each %s (or other placeholder such as %.2f). The algorithm has to work with any string that looks like the ones I've posted.

Thank you for any help.

  • You could improve the question a little by not giving both samples the same name and by explaining the differences and similarities between the 2 strings. I'm lost as to what to do with example 1. Commented Oct 21, 2010 at 7:50
  • So do you need everything between the %s placeholders, or just the stuff right before each one? Commented Oct 21, 2010 at 7:52
  • @rerun and Henk: Take a look at the second STRING_POSITION. That's what I tried to explain with my 2 arrays. The "STK_PD_BEZ|%s" belongs (if you scroll right) to "POSNR". And that's what the parser has to find out: which %s belongs to which variable. Commented Oct 21, 2010 at 7:55
  • You can try to experiment with RegExr: gskinner.com/RegExr to study and test new expressions Commented Oct 21, 2010 at 7:55
  • Still don't understand what the 1st string is for. Just to confuse us? Commented Oct 21, 2010 at 8:21

1 Answer


Just a quick implementation; I hope it will be useful. I didn't go with regex for this particular task. I think a simple parser is enough here.

        // const string test = "STK_PD_BEZ|%s|STK_ID|%s|STK_EBENE|0|ID|%s\r\nSTK_ID|%s|ORDERPOS|%s|STK_EBENE|1|STK_PD_BEZ|%s|STK_FLAENGE|%.2f|STK_FBREITE|%.2f|STK_FDICKE|%.2f|ID|%s|PARENTID|%s\r\n,POSNR,ORDERID,POSNR_NR_ID,ORDERID,POSSTR,CP90NAME,WIDTH,DEPTH,HEIGHT,ARTNR_NR_ID,POSNR_NR_ID";

        const string test = "\"%s\";\"%s\";\"%s\";\"\";\"%s\"\r\n\"%s\";\"%s\";\"%s\";\"%s - %s\";\"%s\";\"%.0f\";\"FR\";\"%.2f\";\"%.2f\";\"%.2f\";\"%s\";\"%s\";\"%s\";\"%s\";\"\";\"\";\"\";\"\";\"\";\"\";\"\";\"\";\"\";\"\";\"\";\"\";\"\";\"\";\"B\"\r\n,POSNR_NR_ID,POSNR_NR_ID,POSNR,POSNR_NR_ID,ARTNR_NR_ID,POSNR_NR_ID,CP90NAME,TEXT1,TEXT2,ARTNR_NR_ID,CNT,WIDTH,HEIGHT,DEPTH,INFO1,INFO2,INFO3,INFO4";

        // Needs: using System; using System.Collections.Generic;

        // args[0]    - the format string
        // args[1..n] - the arguments for the format string
        // Note: this assumes the format string itself contains no commas.
        string[] args = test.Split(',');

        // Split the format string on the delimiters. Extend the set as needed.
        string[] parts = args[0].Split("|\r\n;-".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);

        // (format fragment, argument) pairs
        var parsed = new List<Tuple<string, string>>();

        // Literal parts collected since the last placeholder
        var format = new List<string>();

        // Start at 1 because args[0] is the format string
        int currentValue = 1;

        // Building: each part containing '%' closes a fragment and consumes the next argument
        foreach (var part in parts)
        {
            if (part.Contains("%"))
            {
                format.Add(part);
                parsed.Add(Tuple.Create(string.Join("|", format), args[currentValue++]));

                format.Clear();
            }
            else
            {
                format.Add(part);
            }
        }

        // Printing
        foreach (var pair in parsed)
        {
            Console.WriteLine("{0} = {1}", pair.Item1, pair.Item2);
        }

        Console.ReadLine();

Output (for the commented-out, pipe-delimited test string):

STK_PD_BEZ|%s = POSNR

STK_ID|%s = ORDERID

STK_EBENE|0|ID|%s = POSNR_NR_ID

STK_ID|%s = ORDERID

ORDERPOS|%s = POSSTR

STK_EBENE|1|STK_PD_BEZ|%s = CP90NAME

STK_FLAENGE|%.2f = WIDTH

STK_FBREITE|%.2f = DEPTH

STK_FDICKE|%.2f = HEIGHT

ID|%s = ARTNR_NR_ID

PARENTID|%s = POSNR_NR_ID

Output 2 (for the quoted, semicolon-delimited test string):

"%s" = POSNR_NR_ID

"%s" = POSNR_NR_ID

"%s" = POSNR

""|"%s" = POSNR_NR_ID

"%s" = ARTNR_NR_ID

"%s" = POSNR_NR_ID

"%s" = CP90NAME

"%s = TEXT1

%s" = TEXT2

"%s" = ARTNR_NR_ID

"%.0f" = CNT

"FR"|"%.2f" = WIDTH

"%.2f" = HEIGHT

"%.2f" = DEPTH

"%s" = INFO1

"%s" = INFO2

"%s" = INFO3

"%s" = INFO4
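
One caveat with the code above: test.Split(',') only works while the format string itself contains no comma. If the input arrives in the original ("...",A,B) form from the question, a possible way around this is to find the end of the string literal first and split only the remainder. The following is a minimal sketch under that assumption; the CallSplitter class and its Split method are hypothetical names introduced here, not part of any library.

```csharp
using System;

public static class CallSplitter
{
    // Splits text of the assumed shape ("format",ARG1,ARG2,...) into the raw
    // format literal (escape sequences left untouched) and the argument names.
    public static void Split(string call, out string format, out string[] args)
    {
        int open = call.IndexOf('"');
        if (open < 0) throw new FormatException("No string literal found.");

        // Scan for the first quote that is not preceded by a backslash escape.
        int close = -1;
        for (int i = open + 1; i < call.Length; i++)
        {
            if (call[i] == '\\') { i++; continue; } // skip the escaped character
            if (call[i] == '"') { close = i; break; }
        }
        if (close < 0) throw new FormatException("Unterminated string literal.");

        format = call.Substring(open + 1, close - open - 1);

        // Everything after the literal is the comma-separated argument list.
        string rest = call.Substring(close + 1).Trim().TrimEnd(')', ';').TrimStart(',');
        args = rest.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
        for (int i = 0; i < args.Length; i++) args[i] = args[i].Trim();
    }
}
```

The format literal is returned with its escapes (\", \r\n) still in source form, so it would need unescaping before being split on real line breaks.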


UPDATE:

Without a formal specification, the parser's code will be empirical rather than formally valid. So first of all I would recommend writing a specification for your input; then you can easily build a parser that accepts all valid strings. For example, you can start with syntax diagrams.
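
To illustrate what a specification buys you, here is a minimal sketch that assumes a small printf-like grammar: "%%" is a literal percent sign, and every other placeholder is '%' followed by optional flags, width, ".precision" and one conversion letter. Unlike the split-based code above, it uses a regular expression, but only to recognize placeholders. Each placeholder is paired with the argument at the same position, and the literal text in front of it (minus leading delimiters) becomes part of the label. The FormatPairing class, the Pair method and the chosen set of conversion letters are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FormatPairing
{
    // Assumed grammar: "%%" or '%' [flags] [width] [.precision] conversion-letter.
    private static readonly Regex Placeholder =
        new Regex(@"%%|%[-+ #0]*\d*(?:\.\d+)?[sdiufeEgGxXc]");

    // Returns (label, argument) pairs in placeholder order.
    public static List<KeyValuePair<string, string>> Pair(
        string format, IList<string> args, char[] delimiters)
    {
        var result = new List<KeyValuePair<string, string>>();
        int last = 0; // end of the previous placeholder
        int next = 0; // index of the next unused argument

        foreach (Match m in Placeholder.Matches(format))
        {
            if (m.Value == "%%") continue; // escaped percent consumes no argument
            if (next >= args.Count)
                throw new FormatException("More placeholders than arguments.");

            // Label = literal text since the previous placeholder + this placeholder.
            string label = format
                .Substring(last, m.Index + m.Length - last)
                .TrimStart(delimiters);
            last = m.Index + m.Length;

            result.Add(new KeyValuePair<string, string>(label, args[next++]));
        }

        if (next != args.Count)
            throw new FormatException("More arguments than placeholders.");
        return result;
    }
}
```

Called with the format "STK_PD_BEZ|%s|STK_ID|%s|STK_FLAENGE|%.2f\r\n", the arguments POSNR, ORDERID, WIDTH and the delimiters "|\r\n;", this yields the labels STK_PD_BEZ|%s, STK_ID|%s and STK_FLAENGE|%.2f. Because placeholders are matched directly, a field such as "%s - %s" produces two pairs without '-' having to be treated as a delimiter, and a mismatch between placeholder count and argument count is reported instead of silently ignored.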


2 Comments

This works for my second string, but for my first string the output is empty...
You did not provide any specification for the input strings or the output results. What can be treated as delimiters? Do you need to preserve formatting (i.e. spaces, quotes, etc.)? How do we detect that the format description has ended? All I have is two samples, so I just wrote a very straightforward example of how you can start. I'll update my code.
