
I have a string that starts with a section number, say 1.2.3, followed by a title and then some other text. I want to extract the section number together with its title. I have created a regex to match the section number and the title text.

Below is my code:

var word = "1.2.3 area consent testing, sklfjsdlkf jdifgjds visjeflk area consent testing lsdajfgo idsjgosa jfikdjfl343 fjdsl45jl sfgjsoiaetj l area consent testing";
var lowerWord = "area consent testing".ToLower();
var textLower = @word.ToLower().ToString();
Dictionary<int, string> matchRegex = new Dictionary<int, string>();
matchRegex.Add(1, @"(^\d.+(?:\.\d+)*[ \t](" + lowerWord + "))"); 


foreach (var check in matchRegex)
{
    string AllowedChars = check.Value;
    Regex regex = new Regex(AllowedChars);
    var match = regex.Match(textLower);
    if (match.Success)
    {
        var sectionVal = match.Value;
    }
}

My problem is that I want only the value 1.2.3 area consent testing in my sectionVal variable, but the match returns the whole line, i.e.:

sectionVal = "1.2.3 area consent testing, sklfjsdlkf jdifgjds visjeflk area consent testing lsdajfgo idsjgosa jfikdjfl343 fjdsl45jl sfgjsoiaetj l area consent testing";
  • Shouldn't \d.+ be \d+\.? - escape the first . Commented May 22, 2018 at 14:24
  • $@"^[0-9]+(?:\.[0-9]+)*\s{Regex.Escape(lowerWord)}" Commented May 22, 2018 at 14:30

1 Answer


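The start of your regex contains an unescaped . followed by +. An unescaped . matches any character, so \d.+ matches one digit and then greedily consumes the rest of the line, backtracking only as far as the last occurrence of the phrase. That is why the whole line is returned. Escape the dot and put the + on the digits instead. Try this: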

@"^(\d+(\.\d+)*[ \t](" + lowerWord + "))"
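To see the difference side by side, here is a minimal sketch that runs the original pattern and the corrected one against a shortened version of the question's input. The class and method names (SectionDemo, MatchOriginal, ExtractSection) are made up for this illustration and are not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

static class SectionDemo
{
    // Original pattern from the question: "\d.+" is one digit followed by one or
    // more of ANY character. The greedy ".+" runs to the end of the line and
    // backtracks only until the LAST occurrence of the phrase fits.
    public static string MatchOriginal(string text, string phrase) =>
        Regex.Match(text, @"(^\d.+(?:\.\d+)*[ \t](" + phrase + "))").Value;

    // Corrected pattern from the answer: digits, zero or more ".digits" groups,
    // one space or tab, then the phrase. Nothing can run past the first phrase.
    public static string ExtractSection(string text, string phrase) =>
        Regex.Match(text, @"^(\d+(\.\d+)*[ \t](" + phrase + "))").Value;

    static void Main()
    {
        var phrase = "area consent testing";
        var text = "1.2.3 area consent testing, sklfjsdlkf area consent testing lsdajfgo area consent testing";

        // The original pattern returns the entire input.
        Console.WriteLine(MatchOriginal(text, phrase) == text); // True

        // The corrected pattern stops after the first occurrence of the phrase.
        Console.WriteLine(ExtractSection(text, phrase)); // 1.2.3 area consent testing
    }
}
```

If no section number is present at the start of the input, Match(...).Value is an empty string, so checking match.Success as in the question is still worthwhile.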

2 Comments

@Titianm, you are the saviour!
[ \t] can probably be replaced with \s, and it is safer to use + Regex.Escape(lowerWord) + instead of + lowerWord +, so that any regex metacharacters in the phrase are matched literally.
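The escaping suggestion matters as soon as the phrase contains characters that are special in a regex. The sketch below assumes a made-up phrase with parentheses; SectionPattern and Build are hypothetical names used only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class SectionPattern
{
    // Hypothetical helper: section number, one whitespace character, then the
    // phrase matched literally thanks to Regex.Escape.
    public static Regex Build(string phrase) =>
        new Regex(@"^\d+(?:\.\d+)*\s" + Regex.Escape(phrase));

    static void Main()
    {
        var phrase = "scope (phase 1)";
        var text = "2.1 scope (phase 1), remaining text";

        // Unescaped, "(phase 1)" is parsed as a capture group, so the pattern
        // expects "scope phase 1" and the literal parentheses never match.
        Console.WriteLine(Regex.IsMatch(text, @"^\d+(?:\.\d+)*\s" + phrase)); // False

        // Escaped, the parentheses are treated as ordinary characters.
        Console.WriteLine(Build(phrase).Match(text).Value); // 2.1 scope (phase 1)
    }
}
```

Note that \s also matches line breaks and other whitespace, which is broader than [ \t]; keep [ \t] if only a space or tab should separate the number from the title.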
