iTextSharp HTMLWorker.Parse error

Question

I have a problem with HTMLWorker.Parse from iTextSharp in a Windows Forms program. Every time I execute the code and it reaches HTMLWorker.Parse, it throws an ObjectDisposedException. The exception says that it cannot access a closed file, but I have checked many times and cannot find which file is closed. Here is the code:

class HtmlToPdfConverter
{
    private iTextSharp.text.Document doc = new iTextSharp.text.Document();

    public HtmlToPdfConverter()
    {
        this.doc.SetPageSize(PageSize.A4);
    }

    public string Run(string html, string pdfName)
    {
        try
        {
            using (doc)
            {
                StyleSheet styles = new StyleSheet();
                using (PdfWriter writer = PdfWriter.GetInstance(this.doc, new FileStream(@"Z:\programs\" + pdfName + ".pdf", FileMode.Create)))
                {
                    this.doc.Open();
                    this.doc.OpenDocument();
                    this.doc.NewPage();
                    if (this.doc.IsOpen() == true)
                    {
                        StringReader reader = new StringReader(html);
                        //XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, reader);
                        this.doc.Add(new Paragraph(" "));
                        HTMLWorker worker = new HTMLWorker(this.doc);
                        worker.Open();
                        worker.StartDocument();
                        worker.NewPage();
                        worker.Parse(reader);
                        worker.SetStyleSheet(styles);

                        List<IElement> ie = iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList(reader, null);

                        foreach (IElement element in ie)
                        {
                            this.doc.Add((IElement)element);
                        }

                        worker.EndDocument();
                        worker.Close();
                    }
                }
            }
            return string.Empty;
        }
        catch (Exception ex)
        {
            return ex.Message;
        }

    }
}

This is the exception:

System.ObjectDisposedException was caught
  Message=Cannot access a closed file.
  Source=mscorlib
  ObjectName=""
  StackTrace:
       at System.IO.__Error.FileNotOpen()
       at System.IO.FileStream.Write(Byte[] array, Int32 offset, Int32 count)
       at iTextSharp.text.pdf.OutputStreamCounter.Write(Byte[] buffer, Int32 offset, Int32 count)
       at iTextSharp.text.pdf.PdfIndirectObject.WriteTo(Stream os)
       at iTextSharp.text.pdf.PdfWriter.PdfBody.Add(PdfObject objecta, Int32 refNumber, Boolean inObjStm)
       at iTextSharp.text.pdf.PdfWriter.PdfBody.Add(PdfObject objecta, Int32 refNumber)
       at iTextSharp.text.pdf.PdfWriter.PdfBody.Add(PdfObject objecta, PdfIndirectReference refa)
       at iTextSharp.text.pdf.PdfWriter.AddToBody(PdfObject objecta, PdfIndirectReference refa)
       at iTextSharp.text.pdf.Type1Font.WriteFont(PdfWriter writer, PdfIndirectReference piref, Object[] parms)
       at iTextSharp.text.pdf.FontDetails.WriteFont(PdfWriter writer)
       at iTextSharp.text.pdf.PdfWriter.AddSharedObjectsToBody()
       at iTextSharp.text.pdf.PdfWriter.Close()
       at iTextSharp.text.DocWriter.Dispose()
       at WebPageExtraction.HtmlToPdfConverter.Run(String html, String pdfName)
  InnerException:

Better to switch to iText 7 and refer to stackoverflow.com/a/57251780/14784590
– Reejesh PK, commented Jul 4, 2023 at 7:04

score 5 · Accepted Answer · 2012-09-03 11:58:19Z

You are calling the close methods after the object has already been disposed.

The using block disposes the object automatically, so just remove these two lines:

doc.CloseDocument();
doc.Close();

If you don't trust the internal dispose code to properly close the document and want to do that yourself anyway, do it inside the using block:

using (doc)
{
    StyleSheet styles = new StyleSheet();
    using (PdfWriter writer = PdfWriter.GetInstance(this.doc, new FileStream(@"Z:\programs\" + pdfName + ".pdf", FileMode.Create)))
    {
        //.....
    }
    doc.CloseDocument();
    doc.Close();
}

Edit: after trying your code myself, I noticed some more problems and found the real reason for the error you got:

1. You are closing and disposing the class-level doc object and never creating a new instance.
2. You don't dispose of all objects, which might lead to a memory leak or a locked file.
3. The error occurs because, by default, the PdfWriter closes the Stream it writes to, and when the writer is later disposed it tries to write to that already closed stream. To solve this, close the stream yourself and tell the writer not to do it.
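The failure mode in the stack trace (a write arriving after the FileStream has been closed) can be reproduced without iTextSharp. The following is only a minimal sketch using System.IO; the class name ClosedStreamDemo, the method LateWriteThrows, and the temp file name are hypothetical and chosen for illustration. The second Write plays the role of the late write that PdfWriter.Close() attempts in the trace.

```csharp
using System;
using System.IO;

static class ClosedStreamDemo
{
    // Returns true if writing to an already disposed FileStream
    // throws ObjectDisposedException. (Hypothetical helper for illustration.)
    public static bool LateWriteThrows(string path)
    {
        byte[] payload = { 1, 2, 3 };
        FileStream stream = new FileStream(path, FileMode.Create);
        try
        {
            stream.Write(payload, 0, payload.Length); // succeeds: stream is open
            stream.Dispose();                         // something closes the stream early

            try
            {
                // A late write, comparable to the writer flushing on Dispose.
                stream.Write(payload, 0, payload.Length);
                return false;
            }
            catch (ObjectDisposedException)
            {
                return true;
            }
        }
        finally
        {
            stream.Dispose();  // disposing twice is safe
            File.Delete(path); // clean up the temp file
        }
    }

    static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "closed-stream-demo.bin");
        Console.WriteLine(LateWriteThrows(path)); // prints "True"
    }
}
```

The point of the sketch is ownership: exactly one piece of code should close the stream, and it should do so only after every writer has finished with it.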

Complete fixed code:

Document doc = new Document();
StyleSheet styles = new StyleSheet();
string filePath = @"Z:\programs\" + pdfName + ".pdf";
using (FileStream pdfStream = new FileStream(filePath, FileMode.Create))
{
    using (PdfWriter writer = PdfWriter.GetInstance(doc, pdfStream))
    {
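        // Critical change: the writer must not close pdfStream itself;
        // the enclosing using block disposes the stream after the writer is done.
        writer.CloseStream = false;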
        doc.Open();
        doc.OpenDocument();
        doc.NewPage();
        if (doc.IsOpen() == true)
        {
            using (StringReader reader = new StringReader(html))
            {
                //XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, reader);
                doc.Add(new Paragraph(" "));
                using (HTMLWorker worker = new HTMLWorker(doc))
                {
                    worker.Open();
                    worker.StartDocument();
                    worker.NewPage();
                    worker.Parse(reader);
                    worker.SetStyleSheet(styles);
                    List<IElement> ie = iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList(reader, null);
                    foreach (IElement element in ie)
                    {
                        doc.Add((IElement)element);
                    }
                    worker.EndDocument();
                    worker.Close();
                }
            }
        }
        writer.Close();
    }
}

doc.CloseDocument();
doc.Close();
doc.Dispose();
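One further detail in the code above: reader is passed to worker.Parse and then again to HTMLWorker.ParseToList, and a StringReader that has been read to the end yields nothing on a second pass. A shorter variant that parses only once might look like the sketch below. It assumes iTextSharp 5.x is referenced; the names HtmlToPdfSketch, Convert, and pdfPath are hypothetical, and this is not necessarily the structure the answer's author intended.

```csharp
using System.Collections.Generic;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.html.simpleparser;
using iTextSharp.text.pdf;

static class HtmlToPdfSketch
{
    // Hypothetical helper: converts an HTML string to a PDF file at pdfPath.
    public static void Convert(string html, string pdfPath)
    {
        using (FileStream pdfStream = new FileStream(pdfPath, FileMode.Create))
        {
            // A fresh Document per call, instead of a shared field.
            Document doc = new Document(PageSize.A4);
            PdfWriter writer = PdfWriter.GetInstance(doc, pdfStream);

            // The using block above owns the stream, so the writer must not close it.
            writer.CloseStream = false;

            doc.Open();
            using (StringReader reader = new StringReader(html))
            {
                // Parse the HTML exactly once and add the resulting elements.
                List<IElement> elements = HTMLWorker.ParseToList(reader, new StyleSheet());
                foreach (IElement element in elements)
                {
                    doc.Add(element);
                }
            }

            // Finish the PDF while pdfStream is still open.
            doc.Close();
        }
    }
}
```

A call such as HtmlToPdfSketch.Convert("<p>Hello</p>", @"C:\Temp\myfile.pdf") would exercise it. Note that HTMLWorker is marked obsolete in later 5.x releases, and that closing a document to which no content was added raises an error in iTextSharp, so the HTML is assumed to produce at least one element.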

edited Sep 3, 2012 at 11:58

answered Sep 3, 2012 at 10:21

user447356

8 Comments

Emon Over a year ago

I added doc.Close() and doc.CloseDocument() as an extra, to see whether that would work. I have tried your solution, but it still doesn't work. Thank you for helping.

user447356 Over a year ago

Yes, found the real reason. See my edit. The critical change is adding writer.CloseStream = false;

Emon Over a year ago

Now it gives another exception: a WebException, which says that it cannot find the network path. This version also stops at worker.Parse. Do you know if there's something wrong with that method in iTextSharp? It doesn't give the other exception anymore. Thank you for helping me.

user447356 Over a year ago

Maybe pdfName is empty? Try hardcoding a path, e.g. @"Z:\programs\myfile.pdf", and see if it works.

user447356 Over a year ago

Sorry, no more ideas. Try a different path then, e.g. C:\Temp\myfile.pdf
