0

How can I retrieve both strings between STRING and END in this sentence?

"This is STRING a222 END, and this is STRING b2838 END."

Strings that I want to get:

a222 
b2838

Following is my code. I only manage to get the first string, which is a222.


string myString = "This is STRING a222 END, and this is STRING b2838 END.";

int first = myString.IndexOf("STRING") + "STRING".Length;
int second= myString.LastIndexOf("END");

string result = myString.Substring(first, second - first);

I swear I saw this post earlier today? Anyway, have you tried using Regex? Commented Aug 19, 2020 at 13:37

5 Answers

2

Here is a solution using regular expressions. Working code here.

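// (?<=STRING ) and (?= END) are lookarounds: they anchor the match
// without including the delimiters in the matched text.
var reg = new Regex("(?<=STRING ).*?(?= END)");
var matched = reg.Matches("This is STRING a222 END, and this is STRING b2838 END.");

foreach (Match m in matched)
{
   Console.WriteLine(m.Value);
}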

6 Comments

Regex is too painful for this kind of operation.
It's not. The time complexity of finding the matched strings is O(N), where N is the length of the string, and the regular expression functions in the library are already optimized. Can you please explain why you think so?
Compared to a simple "split", regex is expensive. Also, we need to call the Replace method again and again.
@PaulF, thanks for that, I updated the logic as mentioned. I didn't know about that approach, so thanks for letting us know.
@SowmyadharGourishetty: Regex lookahead/lookbehind can be very useful. There's a bit of info here with some links: stackoverflow.com/questions/2973436/…
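
As an alternative to the lookarounds discussed in these comments, the same extraction can be sketched with a capturing group. This is only an illustration: the class name `RegexBetween` and the method `Between` are hypothetical, and `Regex.Escape` is used on the assumption that the delimiters might contain regex metacharacters.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RegexBetween
{
    // Returns every piece of text found between a start and an end delimiter.
    public static List<string> Between(string input, string start, string end)
    {
        // Escape the delimiters so they are matched literally.
        // \s* on both sides keeps surrounding white space out of the capture.
        string pattern = Regex.Escape(start) + @"\s*(.*?)\s*" + Regex.Escape(end);

        var results = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern, RegexOptions.Singleline))
        {
            // Group 1 is the lazily matched text between the delimiters.
            results.Add(m.Groups[1].Value);
        }
        return results;
    }

    public static void Main()
    {
        const string s = "This is STRING a222 END, and this is STRING b2838 END.";
        foreach (string token in Between(s, "STRING", "END"))
        {
            Console.WriteLine(token);
        }
    }
}
```

The lazy quantifier `.*?` is what stops each match at the nearest END rather than the last one.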
1

You can pass a value for startIndex to string.IndexOf() and advance it while looping:

    IEnumerable<string> Find(string input, string startDelimiter, string endDelimiter)
    {
        int first = 0, second;

        do
        {
            // Find start delimiter
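            first = input.IndexOf(startDelimiter, startIndex: first);

            // Check for "not found" before adding the delimiter length,
            // otherwise the result can never be -1.
            if (first == -1)
                yield break;

            // Skip past the start delimiter itself
            first += startDelimiter.Length;

            // Find end delimiter
            second = input.IndexOf(endDelimiter, startIndex: first);

            if (second == -1)
                yield break;

            yield return input.Substring(first, second - first).Trim();

            // Continue searching right after the end delimiter
            first = second + endDelimiter.Length;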
        }
        while (first < input.Length);
    }


1

You can iterate over the indexes of each STRING:

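string myString = "This is STRING a222 END, and this is STRING b2838 END.";
const string start = "STRING";
const string end = "END";
// Jump to the starting index of each `STRING` (IndexOf returns -1 when there are no more)
for (int i = myString.IndexOf(start); i >= 0; i = myString.IndexOf(start, i + 1))
{
    // Index of the first character after this STRING
    int contentStart = i + start.Length;
    // Index of the next END after it
    int endIndex = myString.IndexOf(end, contentStart);
    if (endIndex < 0) break;
    // Print the substring between STRING and END of each occurrence
    Console.WriteLine(myString.Substring(contentStart, endIndex - contentStart).Trim());
}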

.NET FIDDLE


In your case, STRING..END occurs multiple times, but you took the index of only the first STRING and the last index of END, so the result is the substring that runs from the first STRING to the last END.

i.e.

a222 END, and this is STRING b2838 
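
The difference can be seen in a small sketch that keeps the question's approach but swaps `LastIndexOf` for an `IndexOf` that starts searching after the first STRING. The class and method names (`FirstVersusLastDemo`, `FirstToLastEnd`, `FirstToNextEnd`) are made up for illustration, and the sketch assumes both delimiters are present in the input.

```csharp
using System;

public static class FirstVersusLastDemo
{
    // Approach from the question: first STRING paired with the LAST END.
    public static string FirstToLastEnd(string s)
    {
        int first = s.IndexOf("STRING") + "STRING".Length;
        int last = s.LastIndexOf("END");
        return s.Substring(first, last - first);
    }

    // Adjusted approach: first STRING paired with the NEXT END after it.
    public static string FirstToNextEnd(string s)
    {
        int first = s.IndexOf("STRING") + "STRING".Length;
        int next = s.IndexOf("END", first);
        return s.Substring(first, next - first);
    }

    public static void Main()
    {
        const string s = "This is STRING a222 END, and this is STRING b2838 END.";
        // Brackets make the surrounding spaces visible.
        Console.WriteLine("[" + FirstToLastEnd(s) + "]");
        Console.WriteLine("[" + FirstToNextEnd(s) + "]");
    }
}
```

Getting the second value then only requires repeating the search from the position just after the END that was found, which is what the loop in this answer does.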

4 Comments

Is End maybe a typo? I'm not sure if it's case sensitive, but if so, END is perhaps what the user is after? Also, could you explain what the OP did wrong in their attempt and what is not working as intended, so they can better understand?
@Çöđěxěŕ, thanks for your input. I fixed it. Have a look at my answer
Thanks for the update and great answer with explanation!
You'll have that, it is better to leave a comment as to why, but I've known many don't.
1

You've already got some good answers, but I'll add another that uses ReadOnlyMemory from .NET Core. It provides a solution that doesn't allocate new strings, which can be nice. C# iterators are a common way to transform one sequence (of chars, in this case) into another. This method transforms the input string into a sequence of ReadOnlyMemory values, each containing one of the tokens you're after.

    public static IEnumerable<ReadOnlyMemory<char>> Tokenize(string source, string beginPattern, string endPattern)
    {
        if (string.IsNullOrEmpty(source) ||
            string.IsNullOrEmpty(beginPattern) ||
            string.IsNullOrEmpty(endPattern))
            yield break;

        var sourceText = source.AsMemory();

        int start = 0;

        while (start < source.Length)
        {
            start = source.IndexOf(beginPattern, start);

            if (-1 != start)
            {
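                // Move past the begin pattern first, so the end pattern
                // is never matched inside the begin pattern itself.
                start += beginPattern.Length;
                int end = source.IndexOf(endPattern, start);

                if (-1 != end)
                {
                    yield return sourceText.Slice(start, (end - start));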
                }
                else
                    break;

                start = end + endPattern.Length;
            }
            else
            {
                break;
            }
        }
    }

Then you'd just call it like so to iterate over the tokens...

    static void Main(string[] args)
    {
        const string Source = "This is STRING a222 END, and this is STRING b2838 END.";

        foreach (var token in Tokenize(Source, "STRING", "END"))
        {
            Console.WriteLine(token);
        }
    }
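
Because each slice covers exactly the text between the two patterns, the tokens still include the surrounding spaces (for example " a222 "). A minimal sketch of trimming a token and materializing it as a string only when needed is shown below. `TokenTrimDemo` and `ToTrimmedString` are hypothetical names, and the slice offsets are hard-coded for this one sample string.

```csharp
using System;

public static class TokenTrimDemo
{
    // Returns the token without leading or trailing white space as a new string.
    // Trimming the span allocates nothing; only ToString() creates a string.
    public static string ToTrimmedString(ReadOnlyMemory<char> token)
    {
        return token.Span.Trim().ToString();
    }

    public static void Main()
    {
        const string source = "This is STRING a222 END";
        // Slice(14, 6) covers " a222 ", the text between STRING and END.
        ReadOnlyMemory<char> token = source.AsMemory().Slice(14, 6);
        Console.WriteLine("[" + ToTrimmedString(token) + "]");
    }
}
```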


0
string myString = "This is STRING a222 END, and this is STRING b2838 END.";
// Fix the issue based on @PaulF's comment.
if (myString.StartsWith("STRING"))
     myString = $"DUMP {myString}";

var arr = myString.Split(new string[] { "STRING", "END" }, StringSplitOptions.RemoveEmptyEntries);

for (int i = 0; i < arr.Length; i++)
{
      if(i%2 > 0)
      {
          // This is your string
          Console.WriteLine(arr[i].Trim());
      }
}

11 Comments

That only works if STRING and END are paired and some text precedes the first STRING.
Hi PaulF, I believe he mentioned in the question "how can I retrieve both string between STRING & END in this sentence".
@BijuKalanjoor could you update your post to include what the OP did wrong in their attempt and how this addresses their issue?
@BijuKalanjoor: if all the OP wanted to do was extract the values from that particular string, then I would suggest counting the characters and doing two substring operations. I am assuming that the OP actually wants a generic solution that will work for any string passed to it. I have pointed out two ways your solution will fail to get the results asked for. It may be that this answer suits the OP, though, so if it is marked as the correct answer then I guess it is what is required.
@PaulF, according to the sentence "how can I retrieve both string between STRING & END in this sentence", the OP wants to get a string in between two tokens. That's why I suggested this method rather than regex. If the OP really wants the behavior you mentioned, I completely agree with you.