
I have something like this:

<h1> I am Text </h1>

in ASP.NET MVC.

If I write it out with <%: mystring %>, the page shows the tags that the user typed in.

How can I decode it so that it shows " I am Text "?

I just want to show the content from CKEditor on the page without HTML tags. If I use a regex, all tags are hidden, even the ones the user typed in as part of the information.

1
  • @Steven, could you please fix the wording and grammar of your question? It's not clear what you want to achieve. If you wrote HTML tags in your question, you have to use &lt; and &gt; instead of angle brackets. Commented Nov 1, 2010 at 5:19

4 Answers 4

4

Use a library such as HTML Agility Pack. You could use regular expressions, but I wouldn't recommend it.

EDIT: Here's sample code that does HTML-to-text conversion: http://htmlagilitypack.codeplex.com/SourceControl/changeset/view/66017#1336937
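For a simple fragment like the one in the question, a minimal sketch could look like the following. It assumes the HtmlAgilityPack NuGet package is referenced; the class and method names (`HtmlToText`, `Strip`) are made up for illustration and are not part of the library.

```csharp
using HtmlAgilityPack; // third-party package, not part of the .NET base class library

public static class HtmlToText
{
    // Hypothetical helper: returns the text content of a small HTML fragment.
    public static string Strip(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html ?? string.Empty);

        // InnerText concatenates the text nodes under the root node;
        // DeEntitize turns entities such as &amp; back into characters.
        return HtmlEntity.DeEntitize(doc.DocumentNode.InnerText).Trim();
    }
}
```

Calling `HtmlToText.Strip("<h1> I am Text </h1>")` should give back just the text. This is only meant for simple fragments: on a whole page, InnerText also includes the contents of script and style elements, so those nodes would have to be removed first.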


7 Comments

@VinayC: Could you show how to strip tags using HTML Agility Pack?
@zerkms, if I remember correctly, I used the InnerText property to get plain text. I will try to find a sample.
@VinayC: will it work if you just copy-paste the current page as a string and apply that method to it?
@zerkms, when I used it, I was parsing HTML fragments that were quite simple, so I have no clue from my own usage. But from googling, it appears that applying the InnerText property to a whole copy-pasted document may not work. Someone has even resorted to writing his own parser implementation: social.msdn.microsoft.com/Forums/en/regexp/thread/…. I believe that a foolproof solution will involve iterating over the various nodes and making decisions based on node types. The most perplexing fact is that I couldn't find a single sample :-(
@VinayC: "I couldn't find a single sample" --- I bet that's because there is no simple sample to do that ;-)
3
String result = Regex.Replace(htmlDocument, @"<[^>]*>", String.Empty);


0

Yes, I will burn in hell for using regexes to work with HTML

var result = Regex.Replace(inputString, @"<[^>]*>", string.Empty);

but this solution looks much simpler than building an FSM.

1 Comment

We can make your regex non-greedy by using *?
0

You can use Regex to remove those HTML tags. I hope you are aware of the Regex class in C#.

Use this regex: <[^>]*?>

Replace all the matches with an empty string ""

4 Comments

What is the reason for the ? sign after *?
@zerkms, well... ? after * makes the regex non-greedy. It's a vast topic and difficult to explain here. You can find more information about it here: regular-expressions.info/repeat.html
In this case the ? quantifier is useless. Otherwise, think of any input string that gives different results with those 2 regexes.
@zerkms, you are right. The reason for using ? after * was the situation where nested HTML tags are present, for example "<h1> I <b>am</b> Text </h1>". Anyway, your regex is more correct.
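
To check the point made in these comments, here is a small sketch that runs both patterns over the nested example. The class and method names are made up for illustration. Because [^>] can never match a closing bracket, the greedy and the lazy version have to stop at the same first ">", so both should strip exactly the same tags.

```csharp
using System.Text.RegularExpressions;

public static class TagStripDemo
{
    // Greedy version: <[^>]*>
    public static string StripGreedy(string html) =>
        Regex.Replace(html, @"<[^>]*>", string.Empty);

    // Lazy version: <[^>]*?>
    // The negated class already excludes '>', so laziness changes nothing here.
    public static string StripLazy(string html) =>
        Regex.Replace(html, @"<[^>]*?>", string.Empty);
}
```

Calling both methods with "<h1> I <b>am</b> Text </h1>" should return the same string, with only the text left. A pattern like <.*> would behave differently, since the dot can match ">" and a greedy match would then swallow everything between the first "<" and the last ">".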
