I am trying to write a regular expression that extracts the text between <p> and </p> tags. There will be up to three such lines in a row. I thought this might be possible using the (?= lookahead feature.

The pattern I am currently using to match a single line is as follows.

<p>([^']*?)<[/]p

Is it possible to have one regular expression that can get the text between multiple rows of tags? Each line would need to be in its own group.

An example would be

 <p>The</p>
 <p>Grey</p>
 <p>Fox</p>
  • Don't forget to have a look at the most voted answer ever: stackoverflow.com/questions/1732348/… Commented Dec 18, 2009 at 5:49
  • Thanks for the link. I have seen that, and I think I will be safe since this is the only thing I am parsing from the HTML. Commented Dec 18, 2009 at 5:55

1 Answer

First, this would be easy using the Html Agility Pack and you'd get a more robust solution.

But you can do it with regex in certain situations if you're 100% in control of the format and the input is coming from a trusted source:

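// Group 1 sits inside a repeated group, so .NET records one Capture per <p>...</p> repetition.
Match match = Regex.Match(html, @"(?:<p>(.*?)</p>\s*)+", RegexOptions.Singleline);
if (match.Success)
{
    // Groups[1].Value would hold only the last repetition; Captures holds all of them.
    foreach (Capture line in match.Groups[1].Captures)
        Console.WriteLine(line.Value);
}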

Output:

The
Grey
Fox
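
If the paragraphs do not have to be adjacent, a simpler variant is to call Regex.Matches with a pattern for a single <p> element and read group 1 of each match. This is only a sketch and not the code above: `ParagraphExtractor` and `Extract` are made-up names for illustration, and the same caveat about trusted, fixed-format input applies.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ParagraphExtractor
{
    // Hypothetical helper, not part of the original answer: returns the inner
    // text of every <p>...</p> element found in the input, in document order.
    public static List<string> Extract(string html)
    {
        var lines = new List<string>();
        // Singleline lets '.' match newlines, so a paragraph may span lines.
        foreach (Match m in Regex.Matches(html, @"<p>(.*?)</p>", RegexOptions.Singleline))
            lines.Add(m.Groups[1].Value);
        return lines;
    }

    public static void Main()
    {
        string html = "<p>The</p>\n<p>Grey</p>\n<p>Fox</p>";
        foreach (string line in Extract(html))
            Console.WriteLine(line); // prints The, Grey, Fox on separate lines
    }
}
```

Unlike the single-match pattern, this picks up every <p> element in the input rather than only the first consecutive run, so it fits only when that is what you want.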
1 Comment

Accident. Left over from testing. (I did actually test it, honest!)
