I want to match *|END_OF_PARAM|ABC_XYZ_123.txt in a given string. The string may start with anything, but it must contain |END_OF_PARAM| followed by a filename (made up of letters, digits, _ and -) and end in ".txt".

eg:

string input = "| AB|3|20200914-01|5| | |END_OF_PARAM|ABC-XYZ-20200914-PIA-03_05_20200914132900.txt";
string pattern = @"*/([A-Za-z0-9\-]+)\.txt$";   // What exactly should go here?
Match match = Regex.Match(input, pattern, RegexOptions.IgnoreCase);

if (match.Success)
{
    Console.WriteLine("Match!");
}

The output I need is ABC-XYZ-20200914-PIA-03_05_20200914132900.txt

P.S.: Some lines end with |END_OF_PARAM| and have no filename after it; such lines should be ignored.

I don't know much about regex. I tried to learn it to get this task done, but it is taking longer than expected. Let me know if any additional data is needed. Thank you.

  • You could use a capturing group ^.*(\|END_OF_PARAM\|[\w-]+\.txt)$ regex101.com/r/F2Kl3G/1 Commented Sep 14, 2020 at 10:16
  • Are you reading the content line by line, or are you trying to extract parts of strings from a long multiline text? Commented Sep 14, 2020 at 10:22
  • If it is a multiline block of text, you may use Regex.Matches(text, @"(?<=\|END_OF_PARAM\|)[A-Za-z0-9_-]+\.txt(?=\r?$)").Cast<Match>().Select(x => x.Value) Commented Sep 14, 2020 at 10:31
  • @WiktorStribiżew I'm reading the content line by line. Thanks for letting me know how to use Regex for multiline text. Commented Sep 14, 2020 at 11:15
  • Ok, so you may use The4thbird's answer. Commented Sep 14, 2020 at 11:16
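
For the multiline case mentioned in the comments, a minimal sketch follows. It assumes RegexOptions.Multiline is passed so that $ can match at the end of every line (the comment's snippet does not show that option; without it, $ only matches at the very end of the text). The class name, method name and sample text are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class MultilineDemo
{
    // Returns every file name that directly follows |END_OF_PARAM|
    // and sits at the end of a line in a multiline block of text.
    public static string[] FileNames(string text)
    {
        return Regex.Matches(
                text,
                @"(?<=\|END_OF_PARAM\|)[A-Za-z0-9_-]+\.txt(?=\r?$)",
                RegexOptions.Multiline) // assumption: lets $ match at each line end
            .Cast<Match>()
            .Select(m => m.Value)
            .ToArray();
    }

    public static void Main()
    {
        // Hypothetical sample text; the middle line has no file name.
        string text = "a|END_OF_PARAM|one.txt\r\n" +
                      "b|END_OF_PARAM|\r\n" +
                      "c|END_OF_PARAM|two_2.txt";

        foreach (string name in FileNames(text))
        {
            Console.WriteLine(name);
        }
    }
}
```

The lookbehind keeps |END_OF_PARAM| out of the match, so no capture group is needed here, and the optional \r in the lookahead covers Windows line endings.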

1 Answer

The character class is missing an underscore, so it cannot match the whole filename. If you want the match to include the |END_OF_PARAM| part, add it to the pattern.

To separate the filename from the total match, you could capture it in a group and read that group's value.

^.*\|END_OF_PARAM\|([A-Za-z0-9_-]+\.txt)$

Explanation

  • ^ Start of string
  • .* Match any char except a newline 0+ times
  • \|END_OF_PARAM\| Match END_OF_PARAM between a pipe at the left and right
  • ( Capture group 1
    • [A-Za-z0-9_-]+\.txt Match 1+ times any of the listed chars followed by .txt
  • ) Close group 1
  • $ End of string
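
To make the breakdown above concrete, here is a small sketch that prints the whole match next to capture group 1. The pattern is the one from this answer; the class name, method name and the shortened sample line are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // Same pattern as above; group 1 holds only the file name.
    private static readonly Regex Pattern =
        new Regex(@"^.*\|END_OF_PARAM\|([A-Za-z0-9_-]+\.txt)$");

    // Returns (whole match, group 1), or null when the line does not match.
    public static (string Whole, string FileName)? Split(string line)
    {
        Match m = Pattern.Match(line);
        if (!m.Success)
        {
            return null;
        }
        return (m.Value, m.Groups[1].Value);
    }

    public static void Main()
    {
        // Hypothetical, shortened sample line.
        var parts = Split("| AB|3|END_OF_PARAM|ABC_XYZ_123.txt");
        if (parts.HasValue)
        {
            Console.WriteLine(parts.Value.Whole);    // | AB|3|END_OF_PARAM|ABC_XYZ_123.txt
            Console.WriteLine(parts.Value.FileName); // ABC_XYZ_123.txt
        }
    }
}
```

Because of ^.* at the front, the whole match is the entire line, which is why the file name has to be read from the group rather than from the match itself.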

string input = "| AB|3|20200914-01|5| | |END_OF_PARAM|ABC-XYZ-20200914-PIA-03_05_20200914132900.txt";
string pattern = @"^.*\|END_OF_PARAM\|([A-Za-z0-9_-]+\.txt)$";
Match match = Regex.Match(input, pattern);

if (match.Success)
{
    Console.WriteLine(match.Groups[1]);
}

Output

ABC-XYZ-20200914-PIA-03_05_20200914132900.txt
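
Since the content is read line by line and some lines end with |END_OF_PARAM| without a filename, the same pattern can be used as a filter. This is only a sketch: the class and method names are hypothetical, and the second sample line is invented to show a line that gets skipped.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ParamLineFilter
{
    private static readonly Regex Pattern =
        new Regex(@"^.*\|END_OF_PARAM\|([A-Za-z0-9_-]+\.txt)$");

    // Yields the file name from each line that has one; other lines are skipped.
    public static IEnumerable<string> FileNames(IEnumerable<string> lines)
    {
        foreach (string line in lines)
        {
            Match m = Pattern.Match(line);
            if (m.Success)
            {
                yield return m.Groups[1].Value;
            }
        }
    }

    public static void Main()
    {
        // The second line is a made-up example with nothing after the marker.
        string[] lines =
        {
            "| AB|3|20200914-01|5| | |END_OF_PARAM|ABC-XYZ-20200914-PIA-03_05_20200914132900.txt",
            "| CD|4|20200914-02|6| | |END_OF_PARAM|",
        };

        foreach (string name in FileNames(lines))
        {
            Console.WriteLine(name);
        }
    }
}
```

No extra check is needed for the lines to ignore: the pattern requires at least one filename character plus .txt after the marker, so those lines simply fail to match.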

3 Comments

Yes, it's correct. I use regexr.com; if you open the Detail tab, it shows the correct match.
Could you update your answer, or add another comment, with a brief explanation of your pattern, so that others who find this post with a similar question can also benefit from it?
@m_beta I have added a breakdown of the pattern.
