I want to remove the embedded <style> block from an HTML string using C#. Here is my HTML:

    <span style="font-family: tahoma; color: #9bbb59;">This is a simple text.</span><br />
<table>
    <thead>
    </thead>
    <tbody>
        <tr>
            <td>&nbsp;R1C1</td>
            <td>R1C2</td>
        </tr>
        <tr>
            <td>R2C1</td>
            <td>R2C2</td>
        </tr>
    </tbody>
</table>
<style type="text/css" id="telerik-reTable-1">
    .telerik-reTable-1   {
    border-width: 0px;
    border-style: none;
    border-collapse: collapse;
    font-family: Tahoma;
    }
    .telerik-reTable-1 td.telerik-reTableFooterEvenCol-1  {
    padding: 0in 5.4pt 0in 5.4pt;
    text-align: left;
    border-top: solid gray 1.0pt;
    }
</style>

After removing the embedded CSS, I want it to look like this:

 <span style="font-family: tahoma; color: #9bbb59;">This is a simple text.</span><br />
<table>
    <thead>
    </thead>
    <tbody>
        <tr>
            <td>&nbsp;R1C1</td>
            <td>R1C2</td>
        </tr>
        <tr>
            <td>R2C1</td>
            <td>R2C2</td>
        </tr>
    </tbody>
</table>

I tried the pattern @"<\s*style[^(style>)]*style>", but it does not work.

Note: I don't think I can use HtmlDocument to remove the child node, because it does not maintain the parent-child node relationship. So I want to use a regular expression to remove the CSS.

3 Answers

You should not use regex to parse HTML documents. See this question to understand why:

RegEx match open tags except XHTML self-contained tags

Use an HTML parser instead, such as Html Agility Pack. Here is how you can do it:

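        // Requires the HtmlAgilityPack NuGet package: using HtmlAgilityPack;
        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(htmlInput);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var nodes = doc.DocumentNode.SelectNodes("//style");

        if (nodes != null)
        {
            foreach (var node in nodes)
                node.ParentNode.RemoveChild(node);
        }

        string htmlOutput = doc.DocumentNode.OuterHtml;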
1 Comment

@corei11 Find the span element whose style attribute equals this value, and set its style attribute to "".

Use System.Xml.Xsl.XslCompiledTransform (the older XslTransform class is obsolete) with a style sheet like this:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="style" />
</xsl:stylesheet>
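
For completeness, here is a minimal sketch of how that style sheet might be applied from C#. The class and method names (StyleStripper, RemoveStyleElements) are hypothetical, chosen for illustration. It assumes the input is well-formed XML with a single root element; the fragment in the question would not parse as-is, because it has several top-level elements and uses &nbsp;, which is not a predefined XML entity.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class StyleStripper
{
    // The identity-transform style sheet from the answer, embedded as a string.
    private const string Stylesheet = @"<xsl:stylesheet version=""1.0"" xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
  <xsl:template match=""@*|node()"">
    <xsl:copy>
      <xsl:apply-templates select=""@*|node()""/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match=""style"" />
</xsl:stylesheet>";

    // Assumption: xhtml is well-formed XML with exactly one root element.
    public static string RemoveStyleElements(string xhtml)
    {
        var transform = new XslCompiledTransform();
        using (var xsltReader = XmlReader.Create(new StringReader(Stylesheet)))
        {
            transform.Load(xsltReader);
        }

        // Start from the style sheet's output settings, but skip the XML declaration.
        XmlWriterSettings settings = transform.OutputSettings.Clone();
        settings.OmitXmlDeclaration = true;

        var output = new StringWriter();
        using (var input = XmlReader.Create(new StringReader(xhtml)))
        using (var writer = XmlWriter.Create(output, settings))
        {
            transform.Transform(input, writer);
        }
        return output.ToString();
    }
}
```

Note that match="style" selects only elements named style, so inline style attributes such as the one on the span are still copied by the identity template.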


Use this pattern to match the <style> block:

<style[^<]*</style\s*>

Explanation:

  • <style matches the literal text <style.
  • [^<]* matches zero or more characters that are not <, so it consumes everything up to the next <.
  • </ matches the literal </.
  • style\s*> matches the word style, followed by zero or more whitespace characters and then >.
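
A minimal sketch of applying the pattern with Regex.Replace follows. The class and method names (StyleRegex, RemoveStyleBlocks) are hypothetical, and RegexOptions.IgnoreCase is an addition not in the pattern above, included on the assumption that the markup might contain <STYLE>. Because [^<] also matches newlines, a multi-line block is handled without RegexOptions.Singleline.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StyleRegex
{
    // Pattern from the answer; IgnoreCase is an added assumption.
    private static readonly Regex StyleBlock =
        new Regex(@"<style[^<]*</style\s*>", RegexOptions.IgnoreCase);

    // Removes every <style>...</style> block whose content contains no '<'.
    public static string RemoveStyleBlocks(string html)
    {
        return StyleBlock.Replace(html, string.Empty);
    }
}
```

Inline style attributes such as the one on the span are untouched, because the pattern requires the literal <style at the start. The pattern will not match a block whose CSS contains a < character (for example, content wrapped in <!-- -->), which is one reason a parser is the more robust choice.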
