0

I want to replace the text in a certain range of my HTML file (say, from position 1000 to 200000) with text from another HTML file. Can someone recommend the best way to do this?

3
  • Could you be a little more specific about the notion of position in an HTML file? Maybe provide an example of how the files look before and after. Commented Nov 7, 2010 at 8:24
  • Well, character position, like IndexOf. Replace from this line to this line, or from this string to this string. Hope it's clear now. Commented Nov 7, 2010 at 8:26
  • Sounds risky. What if someone changes the HTML slightly? Your code might crash with unexpected problems. What is the big picture here? Commented Nov 7, 2010 at 11:45

3 Answers

4

Pieter's way will work, but it does involve loading the whole file into memory. That may well be okay, but if you've got particularly large files you may want to consider an alternative:

  • Open a TextReader on the original file
  • Open a TextWriter for the target file
  • Copy blocks of text by calling Read/Write repeatedly, with a buffer of say 8K characters until you've read the initial amount (1000 characters in your example)
  • Write the replacement text out to the target writer by again opening a reader and copying blocks
  • Skip the text you want to ignore in the original file, by repeatedly reading into a buffer and just ignoring it (incrementing a counter so you know how much you've skipped, of course)
  • Copy the rest of the text from the original file in the same way.

Basically it's just lots of copying operations, including one "copy" which doesn't go anywhere (for skipping the text in the original file).
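The steps above could be sketched roughly as follows. This is only an illustration under a few assumptions: the names `RangeReplacer`, `Copy`, and `ReplaceRange` are made up for the example, positions are counted in characters (UTF-16 code units, the same unit `string` indexing uses), and the target file is a different path from the source file.

```csharp
using System;
using System.IO;

static class RangeReplacer
{
    // Copies up to `count` characters from reader to writer.
    // Passing a null writer discards the characters, i.e. the "copy"
    // that doesn't go anywhere. Returns how many characters were consumed.
    static long Copy(TextReader reader, TextWriter writer, long count, char[] buffer)
    {
        long done = 0;
        while (done < count)
        {
            int want = (int)Math.Min(buffer.Length, count - done);
            int read = reader.Read(buffer, 0, want);
            if (read <= 0) break; // end of input reached
            writer?.Write(buffer, 0, read);
            done += read;
        }
        return done;
    }

    // Writes sourcePath to targetPath with the characters in
    // [startIndex, endIndex) replaced by the contents of replacementPath.
    // targetPath must not be the same file as sourcePath.
    public static void ReplaceRange(string sourcePath, string replacementPath,
                                    string targetPath, long startIndex, long endIndex)
    {
        var buffer = new char[8192]; // 8K characters, as suggested above

        using (var source = new StreamReader(sourcePath))
        using (var replacement = new StreamReader(replacementPath))
        using (var target = new StreamWriter(targetPath))
        {
            Copy(source, target, startIndex, buffer);            // leading part
            Copy(replacement, target, long.MaxValue, buffer);    // replacement text
            Copy(source, null, endIndex - startIndex, buffer);   // skip the old range
            Copy(source, target, long.MaxValue, buffer);         // trailing part
        }
    }
}
```

A call such as `RangeReplacer.ReplaceRange("in.html", "new.html", "out.html", 1000, 200000)` would then match the numbers in the question; the file names here are placeholders. Only one 8K buffer is held in memory at a time, regardless of file size.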


1

Try this:

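using System.IO;
using System.Text;

// Load both files into memory (fine unless the files are very large).
string input = File.ReadAllText("<< input HTML file >>");
string replacement = File.ReadAllText("<< replacement HTML file >>");

// Character range to replace: from startIndex up to, but not including, endIndex.
int startIndex = 1000;
int endIndex = 200000;

// Pre-size the builder to the exact length of the result.
var sb = new StringBuilder(
    input.Length - (endIndex - startIndex) + replacement.Length
);

sb.Append(input.Substring(0, startIndex)); // text before the range
sb.Append(replacement);                    // new text
sb.Append(input.Substring(endIndex));      // text after the range

string output = sb.ToString();

// Write the result to disk; without this step no file is changed.
File.WriteAllText("<< output HTML file >>", output);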

1 Comment

This did not work for me; my file was not replaced. I then used File.WriteAllText("html file", "replace string").
-1

The replacement code Pieter posted does the job, and creating the StringBuilder with the known resulting length is a clever way to improve performance.

That should do what you asked, but when working with structured data like HTML, it is sometimes preferable to load it as a document tree, much like XML (I have used the HtmlAgilityPack for that). Then you could use XPath to find the node you want to replace and work with it. It might be slower, but as I said, you can then work with the structure.

2 Comments

Use a StringBuilder instead. With this, you can pre-reserve memory via the constructor. In your example, you first construct a string from the first Substring + myReplacement, and then a second string from that string and the second Substring. StringBuilder is a lot more efficient.
I was actually going to try out this method but I knew there would be a better way of doing it so I asked here. Pieter, I will try out your code and mark your answer as correct if it works. I totally forgot about StringBuilder though.
