2

I have a browser control embedded in a C# Windows app. I want to grab the rendered HTML (which may have been modified by JavaScript), not the original source.

Any suggestions?

3 Answers

7

You can get the HTML, and indeed set it, with WebBrowser.DocumentText.

Sheng is correct: DocumentText returns the document as it was streamed, before any scripts run. His code doesn't compile as written, but the idea is right. I found that you need:

    mshtml.HTMLDocument doc = webBrowser1.Document.DomDocument as mshtml.HTMLDocument;
    string html = doc.documentElement.outerHTML;

2 Comments

Also make sure to retrieve the document text only after the DocumentCompleted event fires (otherwise there's a race condition).
To use mshtml, you need to add Microsoft.mshtml to your project references.
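
To make the two comments above concrete, here is a minimal sketch of reading the live DOM from inside a DocumentCompleted handler. It assumes a WinForms project with a reference to Microsoft.mshtml; the form name RenderedHtmlForm and the URL are placeholders, not anything from the question. Scripts that keep changing the page after load (timers, AJAX) can still alter the DOM later, so this only captures the state at the moment the event fires.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical form used only for illustration.
public class RenderedHtmlForm : Form
{
    private readonly WebBrowser webBrowser1 = new WebBrowser { Dock = DockStyle.Fill };

    public RenderedHtmlForm()
    {
        Controls.Add(webBrowser1);
        webBrowser1.DocumentCompleted += WebBrowser1_DocumentCompleted;
        webBrowser1.Navigate("https://example.com/"); // placeholder URL
    }

    private void WebBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // DocumentCompleted also fires for each frame; only act on the top-level document.
        if (e.Url != webBrowser1.Url) return;

        // Read the current DOM through the COM document, as in the answer above.
        var doc = webBrowser1.Document.DomDocument as mshtml.HTMLDocument;
        if (doc == null || doc.documentElement == null) return;

        string html = doc.documentElement.outerHTML;
        Console.WriteLine(html);
    }

    [STAThread]
    private static void Main()
    {
        Application.EnableVisualStyles();
        Application.Run(new RenderedHtmlForm());
    }
}
```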
6

DocumentText internally uses the document's IPersistStream interface, which returns the original HTML. Use webBrowser1.Document.DocumentElement.OuterHTML instead.
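
For anyone who would rather not reference mshtml, the same idea can be sketched with the managed wrappers alone. As far as I know the managed HtmlDocument has no DocumentElement property, so this version looks up the root html element by tag name and reads HtmlElement.OuterHtml. The class and method names (RenderedHtml, GetRenderedHtml) are made up for the example.

```csharp
using System.Windows.Forms;

// Hypothetical helper used only for illustration.
public static class RenderedHtml
{
    // Returns the current DOM serialized as HTML, or null if no document is loaded.
    public static string GetRenderedHtml(WebBrowser browser)
    {
        HtmlDocument doc = browser.Document;
        if (doc == null) return null;

        // The root <html> element holds the whole live document.
        HtmlElementCollection roots = doc.GetElementsByTagName("html");
        return roots.Count > 0 ? roots[0].OuterHtml : null;
    }
}
```

Call it once the page has finished loading, for example `string html = RenderedHtml.GetRenderedHtml(webBrowser1);`.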

1 Comment

+1 Thanks for this. I've corrected my answer. I would have deleted it, but you can't delete an accepted answer!
1

Add a handler for the Navigated event of your WebBrowser. Only then will your document be filled.

    private void webBrowser1_Navigated(object sender, WebBrowserNavigatedEventArgs e)
    {
        Console.WriteLine(webBrowser1.DocumentText);
    }
