52

I'm trying to parse the following HTML file and would like to get the value of key. This is being done in Silverlight for Windows Phone.

<HTML>
<link ref="shortcut icon" href="favicon.ico">
<BODY>
<script Language="JavaScript">
location.href="login.html?key=UEFu1EIsgGTgAV7guTRhsgrTQU28TImSZkYhPMLj7BChpBkvlCO11aJU2Alj4jc5"
</script>
<CENTER><a href="login.html?key=UEFu1EIsgGTgAV7guTRhsgrTQU28TImSZkYhPMLj7BChpBkvlCO11aJU2Alj4jc5">Welcome</a></CENTER></BODY></HTML>

Any ideas on where to go from here?

Thanks

  • I just added a question to the Software Recommendations Stack Exchange site for this – C# library for parsing HTML? - Software Recommendations Stack Exchange. Commented Aug 15, 2014 at 23:30
  • The question this duplicates has been closed... So this one should probably be reopened. Commented Jun 26, 2019 at 19:47
  • @Andrew the other question wasn't on-topic either. By inference it would make sense to close this one. Commented Jun 26, 2019 at 21:51
  • @Andrew The dup question isn't much better than this one but it already has a long list of answers with a high number of votes. Commented Jun 26, 2019 at 21:52

2 Answers

77

Take a look at the HTML Agility Pack. It's a pretty decent HTML parser.

http://html-agility-pack.net/?z=codeplex

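Here's some code to get you started (it still needs more error checking):

using HtmlAgilityPack;

HtmlDocument document = new HtmlDocument();
string htmlString = "<html>blabla</html>";
document.LoadHtml(htmlString);
// SelectNodes returns null (not an empty collection) when nothing matches.
HtmlNodeCollection collection = document.DocumentNode.SelectNodes("//a[@href]");
if (collection != null)
{
    foreach (HtmlNode link in collection)
    {
        // For the page in the question this is "login.html?key=..."
        string target = link.Attributes["href"].Value;
    }
}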
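
The loop above only gets you the whole href (login.html?key=...), so the key itself still has to be pulled out of the query string. A minimal sketch of that step, using plain string handling only; HrefQuery and GetQueryValue are names made up for this example, not part of the Agility Pack:

```csharp
using System;

static class HrefQuery
{
    // Returns the value of the named query-string parameter in a URL such as
    // "login.html?key=abc", or null when the parameter is not present.
    public static string GetQueryValue(string href, string name)
    {
        int queryStart = href.IndexOf('?');
        if (queryStart < 0) return null;

        // Keep only the part after '?' and drop a trailing "#fragment", if any.
        string query = href.Substring(queryStart + 1);
        int fragment = query.IndexOf('#');
        if (fragment >= 0) query = query.Substring(0, fragment);

        foreach (string pair in query.Split('&'))
        {
            int eq = pair.IndexOf('=');
            if (eq < 0) continue; // parameter without a value, skip it

            if (pair.Substring(0, eq) == name)
            {
                // Undo percent-encoding (%2F and the like) in the value.
                return Uri.UnescapeDataString(pair.Substring(eq + 1));
            }
        }
        return null;
    }
}
```

Inside the foreach you would then call something like string key = HrefQuery.GetQueryValue(target, "key"); and check the result for null before using it.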

4 Comments

+1 I've used this tool before and it's great.
We do a lot of scraping using Agility pack and it rocks. Definitely try this.
I don't think you can use the Agility Pack on Windows Phone.
The Agility Pack works on Windows Phone. I'm developing an app with it now and it works great.
-3

You can use a regular expression (the Regex class) for this. The expression could be something like: login\.html\?key=[^"]*
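
A minimal sketch of how that could look, assuming the whole page is already in a string. It uses a variant of the pattern above with the dot escaped and a capture group added, so that only the key comes back rather than the full login.html?key=... match. KeyRegex and ExtractKey are hypothetical names for this example:

```csharp
using System.Text.RegularExpressions;

static class KeyRegex
{
    // Returns the text following "login.html?key=" up to the next double quote,
    // or null when the pattern does not occur in the input.
    public static string ExtractKey(string html)
    {
        // In a verbatim string "" stands for a single double-quote character.
        Match match = Regex.Match(html, @"login\.html\?key=([^""]*)");
        return match.Success ? match.Groups[1].Value : null;
    }
}
```

Regex.Match stops at the first occurrence, which is enough here because the script block and the link in the question carry the same key. If the markup changes (single quotes, extra parameters after the key), the pattern has to change with it.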

6 Comments

I won't downvote because I'm nice, but RegEx isn't a surefire way to do this anymore; HTMLAgilityPack is pretty much the gold standard these days.
-1 (unfortunately I'm being fair, which has nothing to do with nice, and this info should also help you avoid trying to parse HTML with RegEx) stackoverflow.com/questions/1732348/…
Regex may work, but I strongly suggest avoiding it in the future.
Though it's generally not right to parse HTML with regex, for the given scenario (where you only need to extract a single small piece) it might be a simple, lightweight and straightforward solution. It depends on how quickly and how deeply you expect the HTML to change.
Yes, I agree that regex isn't for parsing HTML, but for a simple solution it can be OK. If all you need is to take one value from a file, and for that you add an assembly to your program (making your app bigger), I'm not sure that's wise. For me, at least, there is no single truth and everything depends on the context.