I have the following snippet. The search criterion is: find every img tag that has id="someImage".

<img id="someImage" src="C:\logo.png" height="64" width="104" alt="myImage" />

I want to replace

src="C:\logo.png" with src="someothervalue"

so the final output would be

<img id="someImage" src="someothervalue" height="64" width="104" alt="myImage" />

How can I achieve this using regex?

Thank you.

2 Answers

You can work with groups in a regex. You create groups by wrapping parts of the regular expression in parentheses. The Match object you get back exposes them through its Groups collection:

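using System.Text.RegularExpressions;

string input = "<html><img id=\"someImage\" src=\"C:\\logo.png\" height=\"64\" width=\"104\" alt=\"myImage\" /></html>";
// Group 1 captures everything from "<img" up to (but not including) the src attribute.
// Note: this pattern only matches when id appears before src inside the tag.
var regex = new Regex("(<img(.+?)id=\"someImage\"(.+?))src=\"([^\"]+)\"");

// Keep group 1 and append the new src attribute in place of the old one.
string output = regex.Replace(
    input,
    match => match.Groups[1].Value + "src=\"someothervalue\""
);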

In the example above there will be 5 groups:

  • Groups[0] is the whole match: <img id="someImage" src="C:\logo.png"
  • Groups[1] is everything before the src attribute, including the trailing space: <img id="someImage"
  • Groups[2] and Groups[3] are the (.+?) parts; for this input each one is a single space.
  • Groups[4] is the original value of the src attribute: C:\logo.png

In the example, I replace the whole match with the value of Groups[1] followed by a new src attribute.
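To check the group breakdown above for yourself, here is a minimal sketch that prints each group of the first match. The class and method names (GroupInspector, GroupValues) are made up for illustration; the pattern and input are the ones from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

static class GroupInspector
{
    // Hypothetical helper: returns the value of every group in the first match,
    // indexed the same way as match.Groups.
    public static string[] GroupValues(string input)
    {
        var regex = new Regex("(<img(.+?)id=\"someImage\"(.+?))src=\"([^\"]+)\"");
        Match match = regex.Match(input);

        var values = new string[match.Groups.Count];
        for (int i = 0; i < values.Length; i++)
        {
            values[i] = match.Groups[i].Value;
        }
        return values;
    }

    static void Main()
    {
        string input = "<html><img id=\"someImage\" src=\"C:\\logo.png\" height=\"64\" width=\"104\" alt=\"myImage\" /></html>";
        string[] values = GroupValues(input);
        for (int i = 0; i < values.Length; i++)
        {
            // Square brackets make whitespace-only groups visible in the output.
            Console.WriteLine($"Groups[{i}] = [{values[i]}]");
        }
    }
}
```

Since only Groups[1] is used in the replacement, the inner (.+?) groups could probably be written as non-capturing groups, (?:.+?), which would leave just three groups to keep track of.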

Footnote: While regular expressions can sometimes be adequate for manipulating an HTML document, they are often not the best tool. If you know in advance that you are working with XHTML, you can use XmlDocument + XPath. If it is HTML, you can use HtmlAgilityPack.
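As a rough sketch of the XmlDocument + XPath route, the following assumes the input is well-formed XML with no default namespace (a real XHTML document with an xmlns declaration would need an XmlNamespaceManager), and that the id value contains no apostrophe. The names XhtmlSrcRewriter and ReplaceSrc are hypothetical.

```csharp
using System;
using System.Xml;

static class XhtmlSrcRewriter
{
    // Hypothetical helper: sets src on every <img> whose id attribute equals the given id.
    public static string ReplaceSrc(string xhtml, string id, string newSrc)
    {
        var doc = new XmlDocument();
        doc.PreserveWhitespace = true;   // keep the original whitespace between nodes
        doc.LoadXml(xhtml);              // throws XmlException if the markup is not well-formed

        // XPath: any img element in the document whose id attribute matches.
        // Assumption: id contains no apostrophe, otherwise the expression breaks.
        XmlNodeList nodes = doc.SelectNodes("//img[@id='" + id + "']");
        foreach (XmlElement img in nodes)
        {
            img.SetAttribute("src", newSrc);
        }
        return doc.OuterXml;
    }

    static void Main()
    {
        string input = "<html><img id=\"someImage\" src=\"C:\\logo.png\" height=\"64\" width=\"104\" alt=\"myImage\" /></html>";
        Console.WriteLine(ReplaceSrc(input, "someImage", "someothervalue"));
    }
}
```

Unlike the regex, this does not depend on the order of the id and src attributes inside the tag.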


1 Comment

Thanks for mentioning HtmlAgilityPack and for the answer, that worked for me.

It is not a good idea to use regex for XML. Depending on the language, you should use an XML reader, extract the <img> node, and then check its id. One useful language for querying XML data, supported by many XML libraries, is XPath.

In C#, you can look at the XmlDocument class (and related classes).

Another option is XmlReader.

XmlReader offers only forward-only, sequential access, while XmlDocument loads the whole tree into memory, so XmlDocument is easier to use (especially if your XML content is not too big).
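To illustrate what sequential access looks like, here is a minimal XmlReader sketch that scans forward until it finds the matching img element and returns its current src. It assumes well-formed XML input; the names ImgSrcReader and FindSrc are hypothetical. XmlReader is read-only, so actually rewriting the attribute this way would mean copying the stream to an XmlWriter, which is part of why XmlDocument is the easier choice here.

```csharp
using System;
using System.IO;
using System.Xml;

static class ImgSrcReader
{
    // Hypothetical helper: returns the src of the first <img> with the given id,
    // or null when no such element is found.
    public static string FindSrc(string xml, string id)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            // Read() advances one node at a time; there is no way to go back.
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element
                    && reader.Name == "img"
                    && reader.GetAttribute("id") == id)
                {
                    return reader.GetAttribute("src");
                }
            }
        }
        return null;
    }

    static void Main()
    {
        string input = "<html><img id=\"someImage\" src=\"C:\\logo.png\" height=\"64\" width=\"104\" alt=\"myImage\" /></html>";
        Console.WriteLine(FindSrc(input, "someImage"));
    }
}
```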

1 Comment

Really? It looks like HTML to me. HTML is a special kind of XML.
