I have a string like:

BLOCK
    LIST1 Lorem ipsum dolor sit amet.
    LIST1 Lorem ipsum dolor sit amet.
    LIST1 Lorem ipsum dolor sit amet.
        LIST2 Lorem ipsum dolor sit amet.
        LIST2 Lorem ipsum dolor sit amet.
    LIST1 Lorem ipsum dolor sit amet.
BLOCK
    LIST1 Lorem ipsum dolor sit amet.
        LIST2 Lorem ipsum dolor sit amet.
            LIST3 Lorem ipsum dolor sit amet.
        LIST2 Lorem ipsum dolor sit amet.
    LIST1 Lorem ipsum dolor sit amet.
    LIST1 Lorem ipsum dolor sit amet.
    LIST1 Lorem ipsum dolor sit amet.

and want to transform it into

1. Lorem ipsum dolor sit amet.
    1. Lorem ipsum dolor sit amet.
    2. Lorem ipsum dolor sit amet.
    3. Lorem ipsum dolor sit amet.
        A. Lorem ipsum dolor sit amet.
        B. Lorem ipsum dolor sit amet.
    4. Lorem ipsum dolor sit amet.
2. Lorem ipsum dolor sit amet.
    1. Lorem ipsum dolor sit amet.
        A. Lorem ipsum dolor sit amet.
            a. Lorem ipsum dolor sit amet.
        B. Lorem ipsum dolor sit amet.
    2. Lorem ipsum dolor sit amet.
    3. Lorem ipsum dolor sit amet.
    4. Lorem ipsum dolor sit amet.

In another question, Numbering list elements with Regex in C#, dtb used a single numeric counter for every level. I have a char array containing letters (A, B, C, D, ...) and want to use it for the labels of the deeper levels.

  • Can the items have line breaks? Commented Sep 15, 2011 at 17:31
  • Nope reading line by line is not possible. My data is not well structured. Commented Sep 15, 2011 at 17:35

2 Answers

This is similar to the answer in Numbering list elements with Regex in C#, with the label format chosen per level:

using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

var input = "BLOCK\r\n    LIST1 Lorem ipsum dolor sit amet ...";

var levels = new List<string> { "BLOCK", "LIST1", "LIST2", "LIST3" };
var counter = levels.ToDictionary(level => level, level => 0);

// Replace each key word with incremented counter,
// while resetting deeper levels to 0.
var result = Regex.Replace(input, string.Join("|", levels), m =>
{
    for (int i = levels.IndexOf(m.Value) + 1; i < levels.Count; i++)
    {
        counter[levels[i]] = 0;
    }
    return GetLevelToken(m.Value, ++counter[m.Value]);
});

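// Formats the counter for a given level: numbers for BLOCK and LIST1,
// upper-case letters for LIST2, lower-case letters for LIST3.
private static string GetLevelToken(string token, int index)
{
    switch (token)
    {
        case "BLOCK":
        case "LIST1":
            return index.ToString() + ".";
        case "LIST2":
            return ((char)('A' + index - 1)).ToString() + ".";
        case "LIST3":
            return ((char)('a' + index - 1)).ToString() + ".";
    }
    return "";
}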

It may be easier to do this the old-fashioned way:

  1. Split the string into an array of lines.
  2. Iterate through the array, replacing each "BLOCK" with the appropriate block number, each "LIST1" with the appropriate item number, and each "LIST2" with the appropriate letter, while appending the results to a new string or an array of strings.

That keeps it straightforward and needs only a single loop.
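The two steps above could be sketched roughly as follows. This is only an illustration, not tested against the asker's real data: the class name `ListNumberer`, the helper `Label`, and the choice of upper-case letters for LIST2 and lower-case letters for LIST3 are assumptions taken from the sample output in the question, and the letter labels only cover the first 26 items of a level.

```csharp
using System;

public static class ListNumberer
{
    // Keywords ordered from the outermost to the innermost level.
    private static readonly string[] Levels = { "BLOCK", "LIST1", "LIST2", "LIST3" };

    // Hypothetical label styles: numbers for the first two levels, then letters.
    // Letters are only meaningful for index 1..26.
    private static string Label(int level, int index)
    {
        switch (level)
        {
            case 2: return (char)('A' + index - 1) + ".";
            case 3: return (char)('a' + index - 1) + ".";
            default: return index + ".";
        }
    }

    public static string Number(string input)
    {
        var counters = new int[Levels.Length];
        var lines = input.Split(new[] { "\r\n", "\n" }, StringSplitOptions.None);

        for (int n = 0; n < lines.Length; n++)
        {
            string trimmed = lines[n].TrimStart();
            for (int level = 0; level < Levels.Length; level++)
            {
                if (!trimmed.StartsWith(Levels[level], StringComparison.Ordinal)) continue;

                counters[level]++;
                // A new item restarts the numbering of every deeper level.
                for (int deeper = level + 1; deeper < counters.Length; deeper++)
                {
                    counters[deeper] = 0;
                }

                // Keep the original indentation and the text after the keyword.
                string indent = lines[n].Substring(0, lines[n].Length - trimmed.Length);
                string rest = trimmed.Substring(Levels[level].Length);
                lines[n] = indent + Label(level, counters[level]) + rest;
                break;
            }
        }
        return string.Join("\n", lines);
    }
}
```

Calling `ListNumberer.Number(input)` returns the relabelled text with `\n` line endings. Lines that do not start with one of the keywords are passed through unchanged.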

1 Comment

And if you can have really big strings where performance is a concern, you don't even need the array. Just read the lines one at a time keeping track of the current indentation levels.
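A streaming variant of this idea might look like the sketch below. It rests on assumptions that are not stated in the thread: every nesting level is indented by exactly four spaces, every non-empty line starts with a keyword that can be dropped, and the names `StreamingNumberer`, `NumberLines`, and `Label` are made up for illustration.

```csharp
using System.Collections.Generic;
using System.IO;

public static class StreamingNumberer
{
    // Assumption: each nesting level is indented by four spaces, as in the sample.
    private const int IndentWidth = 4;

    // Hypothetical label styles by depth: numbers, numbers, A-Z, then a-z.
    private static string Label(int depth, int index)
    {
        if (depth == 2) return (char)('A' + index - 1) + ".";
        if (depth >= 3) return (char)('a' + index - 1) + ".";
        return index + ".";
    }

    // Reads one line at a time, so the whole input never has to be in memory.
    public static IEnumerable<string> NumberLines(TextReader reader)
    {
        var counters = new List<int>(); // counters[d] = last number used at depth d
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            string trimmed = line.TrimStart(' ');
            if (trimmed.Length == 0)
            {
                yield return line; // pass blank lines through untouched
                continue;
            }

            int depth = (line.Length - trimmed.Length) / IndentWidth;
            while (counters.Count <= depth) counters.Add(0);

            // Moving to this depth restarts the numbering of everything deeper.
            for (int d = depth + 1; d < counters.Count; d++) counters[d] = 0;
            counters[depth]++;

            // Drop the leading keyword (BLOCK, LIST1, ...) and keep the rest.
            int space = trimmed.IndexOf(' ');
            string rest = space < 0 ? "" : trimmed.Substring(space);
            string indent = line.Substring(0, line.Length - trimmed.Length);

            yield return indent + Label(depth, counters[depth]) + rest;
        }
    }
}
```

The method accepts any `TextReader`, for example a `StringReader` over an in-memory string or a `StreamReader` over a file. Because it relies on indentation rather than keywords, it would not suit input whose indentation is inconsistent.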
