I have HTML that contains text like this:

.......
<a class="product_name" href="index.php?productID=29785">Funny</a>
........
<a class="product_name" href="index.php?productID=29787">Very Funny</a>
......

I'd like to extract the href attribute value and the link text of each link, so that I get:

"index.php?productID=29785", "Funny"
"index.php?productID=29787", "Very Funny"

For this I use:

MatchCollection mc = Regex.Matches(pageData, 
   "<a class=\"product_name\" href=\"(.+)\">(.+)</a>");

But when I debug the code, I see that mc.Count is 0.

I suspect I didn't escape the quotes properly, but I'm not sure.

  • 5
    Parsing html with regex is infamously not a good idea Commented Nov 5, 2011 at 21:03
  • 1
    I get count=2 here, btw, with capture-groups that work as expected. The regex shown works for the html shown. If it isn't working, then either a: you aren't presenting the scenario identically, or b: the html is more complicated, making it insanely hard for all the many reasons that you shouldn't parse html with regex Commented Nov 5, 2011 at 21:05
  • 1
    Agreed. It works here as well (regexhero.net/tester) Commented Nov 5, 2011 at 21:09
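To illustrate the two comments above, here is a minimal sketch rather than a definitive diagnosis. It runs the pattern from the question against the HTML as posted, and then against the same HTML with the line breaks removed, which is only an assumption about what the real page might look like. The `Tight` pattern and the `LinkRegexDemo` and `CountMatches` names are hypothetical and were not part of the original post.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LinkRegexDemo
{
    // The pattern from the question, unchanged.
    public const string Original =
        "<a class=\"product_name\" href=\"(.+)\">(.+)</a>";

    // Hypothetical tighter variant: the href stops at the next quote,
    // the link text stops at the next '<'.
    public const string Tight =
        "<a class=\"product_name\" href=\"([^\"]+)\">([^<]+)</a>";

    public static int CountMatches(string html, string pattern)
    {
        return Regex.Matches(html, pattern).Count;
    }

    public static void Main()
    {
        string multiLine =
            ".......\n" +
            "<a class=\"product_name\" href=\"index.php?productID=29785\">Funny</a>\n" +
            "........\n" +
            "<a class=\"product_name\" href=\"index.php?productID=29787\">Very Funny</a>\n" +
            "......";

        // Assumed variation: the same markup delivered on a single line.
        string oneLine = multiLine.Replace("\n", "");

        Console.WriteLine(CountMatches(multiLine, Original)); // 2
        Console.WriteLine(CountMatches(oneLine, Original));   // 1 (greedy .+ spans both links)
        Console.WriteLine(CountMatches(oneLine, Tight));      // 2

        foreach (Match m in Regex.Matches(oneLine, Tight))
        {
            // Groups[1] is the href value, Groups[2] is the link text.
            Console.WriteLine("\"{0}\", \"{1}\"", m.Groups[1].Value, m.Groups[2].Value);
        }
    }
}
```

So the quotes in the question are escaped correctly. If `mc.Count` is 0 on the real page, the markup most likely differs from the excerpt (attribute order, extra attributes, single quotes, or line breaks inside the tag), which is the kind of fragility the first comment warns about.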

2 Answers

5

Don't parse HTML with regex. See here for a compelling reason why.

Use the HTML Agility Pack instead.
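As a rough sketch of what that could look like for the markup in the question, assuming the HtmlAgilityPack NuGet package is referenced (the `ProductLinks` class and `Extract` method names are made up for illustration):

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumes the "HtmlAgilityPack" NuGet package is installed

public static class ProductLinks
{
    // Returns (href, text) pairs for every <a class="product_name"> element.
    public static List<KeyValuePair<string, string>> Extract(string html)
    {
        var result = new List<KeyValuePair<string, string>>();

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection nodes =
            doc.DocumentNode.SelectNodes("//a[@class='product_name']");
        if (nodes == null)
        {
            return result;
        }

        foreach (HtmlNode a in nodes)
        {
            string href = a.GetAttributeValue("href", "");
            string text = a.InnerText.Trim();
            result.Add(new KeyValuePair<string, string>(href, text));
        }
        return result;
    }

    public static void Main()
    {
        string pageData =
            "<a class=\"product_name\" href=\"index.php?productID=29785\">Funny</a>" +
            "<a class=\"product_name\" href=\"index.php?productID=29787\">Very Funny</a>";

        foreach (var link in Extract(pageData))
        {
            Console.WriteLine("\"{0}\", \"{1}\"", link.Key, link.Value);
        }
    }
}
```

The XPath here matches only elements whose class attribute is exactly `product_name`. If the real page adds other classes to the same element, the expression would need to be loosened, for example with `contains(@class, 'product_name')`.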


3 Comments

Regex is not good for parsing html, but I wouldn't include a new dependency to my project for such a simple task.
@L.B - What would you suggest then? Writing your own parser/tokenizer?
@L.B - You seem to be contradicting yourself. "Regex is not good for parsing html" ... "I would use Regex".
-1

Review the following threads to find possible solution(s):

http://www.dotnetperls.com/scraping-html

Regex to Parse Hyperlinks and Descriptions

Parse HTML links using C#
