How to build XmlNodes from XmlReader

Question

I am parsing a large number of large files, and profiling shows that my bottleneck is:

XmlDocument doc = new XmlDocument();
doc.Load(filename);

This approach was very handy because I could extract nodes like this:

XmlNodeList nodeList = doc.SelectNodes("myXPath");

I am switching to XmlReader, but I am not very familiar with it. Once I find the element I need to extract, I am stuck on how to build an XmlNode from it:

XmlReader xmlReader = XmlReader.Create(fileName);

while (xmlReader.Read())
{
   //keep reading until we see my element
   if (xmlReader.Name.Equals("myElementName") && (xmlReader.NodeType == XmlNodeType.Element))
   {
       // How do I get the Xml element from the reader here?
   }
}

I'd like to be able to build a List<XmlNode> object. I am on .NET 2.0.

Any help appreciated!

executor · Accepted Answer · 2010-04-15 13:38:48Z

20

Why not just do the following?

XmlDocument doc = new XmlDocument();
XmlNode node = doc.ReadNode(reader);
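A minimal sketch of how that call could slot into the loop from the question to build a List<XmlNode>. The class and method names (NodeCollector, Collect) are made up for illustration, and reading from a TextReader is an assumption made so the example is self-contained. One point to watch: ReadNode consumes the whole element and leaves the reader on the node after it, so the loop must not call Read() again after a match, or adjacent matching elements get skipped.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper, not part of the original answer.
public static class NodeCollector
{
    // Returns every element named elementName as a fully populated XmlNode.
    public static List<XmlNode> Collect(TextReader input, string elementName)
    {
        List<XmlNode> nodes = new List<XmlNode>();
        XmlDocument doc = new XmlDocument(); // owner document for the created nodes

        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
                {
                    // ReadNode reads the element with its attributes and children
                    // and advances the reader past it.
                    nodes.Add(doc.ReadNode(reader));
                }
                else
                {
                    reader.Read();
                }
            }
        }
        return nodes;
    }
}
```

To read from a file instead, XmlReader.Create(fileName) can replace the TextReader overload.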

answered Apr 15, 2010 at 13:38

executor


2 Comments

John Saunders Over a year ago

Already answered. See stackoverflow.com/questions/1566192/….

Karl Cassar Over a year ago

This is the correct answer as the other one leaves blank nodes!

Fredrik Mörk · Accepted Answer · 2009-10-14 13:30:25Z

7

The XmlNode type does not have a public constructor, so you cannot create them on your own. You will need to have an XmlDocument that you can use to create them:

XmlDocument doc = new XmlDocument();
List<XmlNode> nodeList = new List<XmlNode>();
while (xmlReader.Read())
{
    // keep reading until we see my element
    if (xmlReader.Name.Equals("myElementName") && (xmlReader.NodeType == XmlNodeType.Element))
    {
        // Creates a new, empty element with the same name; the reader's
        // attributes and child content are not copied into it.
        XmlNode myNode = doc.CreateNode(XmlNodeType.Element, xmlReader.Name, "");
        nodeList.Add(myNode);
    }
}

answered Oct 14, 2009 at 13:30

Fredrik Mörk


4 Comments

JohnIdol Over a year ago

it seems to be creating empty nodes?

Fredrik Mörk Over a year ago

Yes, unless you add anything to the elements (by assigning something to the InnerText property for instance) they will be empty.

JohnIdol Over a year ago

oh yep - looks obvious now since I am just passing element name in, thanks

Karl Cassar Over a year ago

This results in only empty nodes. You can use doc.ReadNode(reader) to actually get the entire node as XmlNode

Abel · Accepted Answer · 2009-10-14 13:32:17Z

XmlReader and XmlDocument process XML in very different ways. XmlReader keeps nothing in memory and reads forward-only, whereas XmlDocument builds a full DOM tree in memory. XmlReader helps when performance is an issue, but it also requires you to write your application differently: instead of holding on to XmlNode objects, you keep nothing and process "on the go", i.e., when an element you need passes by, you act on it. This is close to the SAX approach, but without the callback model.

The answer to "how to get the XmlElement" is: you'll have to build them from scratch based on the info from the reader. This, unfortunately, defeats the performance gain. It is often better to avoid DOM approaches altogether once you switch to XmlReader, except in a few specific cases.

Also, the "very handy" way to extract nodes using XPath (SelectNodes is what you show above) cannot be used here: XPath requires a DOM tree. Consider this approach a filtering approach: you can add filters to the XmlReader and tell it to skip certain nodes or read until a certain node. This is extremely fast, but a different way of thinking.
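As a rough sketch of this streaming style, the helper below pulls the text of selected elements straight off the reader and skips unwanted subtrees without building any XmlNode. The names (StreamingValues, ReadValues, skipName) are hypothetical, and it assumes the wanted elements contain only text, since ReadElementContentAsString throws on child elements.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper illustrating forward-only processing.
public static class StreamingValues
{
    // Collects the text of each element named elementName, ignoring
    // everything inside elements named skipName.
    public static List<string> ReadValues(TextReader input, string elementName, string skipName)
    {
        List<string> values = new List<string>();
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == skipName)
                {
                    // Skip jumps over the element and all of its children.
                    reader.Skip();
                }
                else if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
                {
                    // Reads the text content and moves past the end tag.
                    values.Add(reader.ReadElementContentAsString());
                }
                else
                {
                    reader.Read();
                }
            }
        }
        return values;
    }
}
```

Both Skip and ReadElementContentAsString advance the reader themselves, which is why Read() is only called in the final branch.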

Gilad Green · Accepted Answer · 2016-08-08 13:05:51Z

4

Use XmlDocument.ReadNode for this. Put the XmlReader in a using statement, and compare against XmlReader.LocalName instead of Name so that a namespace prefix does not affect the match.
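A short sketch of what that might look like; the class and method names (LocalNameCollector, Collect) are illustrative assumptions. For an element written as x:item, Name is "x:item" while LocalName is "item", so matching on LocalName finds it regardless of the prefix.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper, assuming elements are matched by local name only.
public static class LocalNameCollector
{
    public static List<XmlNode> Collect(TextReader input, string localName)
    {
        List<XmlNode> nodes = new List<XmlNode>();
        XmlDocument doc = new XmlDocument();

        // The using statement disposes the reader even if parsing throws.
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == localName)
                {
                    nodes.Add(doc.ReadNode(reader)); // advances past the element
                }
                else
                {
                    reader.Read();
                }
            }
        }
        return nodes;
    }
}
```

If elements with the same local name exist in several namespaces, reader.NamespaceURI can be checked as well.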

edited Aug 8, 2016 at 13:05

Gilad Green


answered Oct 14, 2009 at 13:46

m3kh


Gilad Green · Accepted Answer · 2016-08-08 12:57:11Z

1

I've used the following workaround when I've had to insert data from an XmlReader into an XmlDocument:

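XmlReader rdr = cmd.ExecuteXmlReader();

XmlDocument doc = new XmlDocument();

// create a container node for our resultset
XmlElement root = doc.CreateElement("QueryRoot");
doc.AppendChild(root);

StringBuilder xmlBody = new StringBuilder();

// ReadOuterXml already advances past the element it returns, so call
// Read() only when the reader is not positioned on an element;
// otherwise every other sibling element would be skipped.
rdr.Read();
while (!rdr.EOF)
{
    if (rdr.NodeType == XmlNodeType.Element)
        xmlBody.Append(rdr.ReadOuterXml());
    else
        rdr.Read();
}
rdr.Close();

root.InnerXml = xmlBody.ToString();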

edited Aug 8, 2016 at 12:57

Gilad Green


answered Nov 3, 2009 at 17:45

Barney Light


Gilad Green · Accepted Answer · 2016-08-08 13:06:55Z

Here is my approach:

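using System.Collections.Generic;
using System.Linq;   // for tagNames.Contains
using System.Xml;

public static IEnumerable<XmlNode> StreamNodes(
    string path,
    string[] tagNames)
{
    var doc = new XmlDocument();
    using (XmlReader xr = XmlReader.Create(path))
    {
        xr.MoveToContent();
        while (true)
        {
            if (xr.NodeType == XmlNodeType.Element &&
                tagNames.Contains(xr.Name))
            {
                // ReadNode consumes the element and leaves the reader on
                // the node that follows it, so no Read() is needed here.
                var node = doc.ReadNode(xr);
                yield return node;
            }
            else
            {
                if (!xr.Read())
                {
                    break;
                }
            }
        }
    }
}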
// Used like this:
foreach (var el in StreamNodes("orders.xml", new string[]{"order"})) 
{
    ....
}

The nodes can then be imported into another document for further processing.
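A minimal sketch of that import step, assuming the streamed nodes should end up under a new root element; NodeImporter, BuildDocument and rootName are hypothetical names. A node belongs to the XmlDocument that created it, so it has to be copied with ImportNode before it can be appended to a different document.

```csharp
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper, not part of the original answer.
public static class NodeImporter
{
    // Copies each node into a new document under a root element named rootName.
    public static XmlDocument BuildDocument(IEnumerable<XmlNode> nodes, string rootName)
    {
        XmlDocument target = new XmlDocument();
        XmlElement root = target.CreateElement(rootName);
        target.AppendChild(root);

        foreach (XmlNode node in nodes)
        {
            // deep = true copies the node together with its attributes and children.
            root.AppendChild(target.ImportNode(node, true));
        }
        return target;
    }
}
```

The source nodes are left untouched, because ImportNode creates copies owned by the target document.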
