Parsing XML in C# for specific content

Question

I am trying to parse an XML response from a website in C#. The response comes in a format similar to the following:

<Company>
    <Owner>Bob</Owner>
    <Contact>
        <address> -1 Infinite Loop </address>
        <phone>
            <LandLine>(000) 555-5555</LandLine>
            <Fax> (000) 555-5556 </Fax>
        </phone>
        <email> [email protected] </email>
    </Contact>
</Company>

The only information I want is the LandLine and Fax numbers. However, my current approach seems really poor. Essentially, it is a bunch of nested while loops that check the element name and then read the content once the right element is found. I am using something like the listing below:

XmlReader xml = XmlReader.Create(websiteResultStream, xmlSettings);

while(xml.Read()){
    if(xml.NodeType == XmlNodeType.Element){
        if(xml.Name.ToString() == "Phone"){
            while(xml.Read()) {
                if(xml.NodeType == XmlNodeType.Element) {
                     if(xml.Name.ToString() == "LandLine"){
                          xml.MoveToContent();
                          xml.ReadContentAsString();
                     }
                     if(xml.Name.ToString() == "Fax"){
                          xml.MoveToContent();
                          xml.ReadContentAsString();
                     }
                }
            }
        }
    }
}

I am newer to XML/C#, but the above method just screams bad code! I want to ensure that the code stays robust if the structure changes (e.g. additional phone number types such as "mobile" are added), hence the additional while loops.

Note: the above C# code is not exact, and lacks some checks etc, but it demonstrates my current abysmal disgusting approach

What is the best/cleanest way to simply extract the content from those two Elements if they are present?

Dirk Vollmar · Accepted Answer · 2010-08-18 14:53:27Z

8

The most lightweight approach for read-only access to specific nodes in an XML document is to use an XPathDocument together with an XPath expression:

// Requires: using System.Xml.XPath;
XPathDocument xdoc = new XPathDocument(@"C:\sample\document.xml");
XPathNavigator node = xdoc.CreateNavigator()
    .SelectSingleNode("/Company/Contact/phone/LandLine");
if (node != null)
{
    // Value is the text content of the matched element.
    string landline = node.Value;
}
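Since the question reads from a response stream and wants both numbers, here is a minimal sketch of the same XPathDocument idea applied to a Stream. The class and method names (PhoneXPath, ValueOrNull, ReadLandLineAndFax) are made up for illustration, and trimming the values is an assumption about what the asker wants, since the sample Fax element has padding around it.

```csharp
using System.IO;
using System.Xml.XPath;

public static class PhoneXPath
{
    // Hypothetical helper: trimmed text of the first node matching xpath,
    // or null when nothing matches.
    public static string ValueOrNull(XPathNavigator nav, string xpath)
    {
        XPathNavigator node = nav.SelectSingleNode(xpath);
        return node == null ? null : node.Value.Trim();
    }

    // Hypothetical helper: loads the response once and returns
    // { landline, fax }; an entry is null if its element is absent.
    public static string[] ReadLandLineAndFax(Stream response)
    {
        XPathNavigator nav = new XPathDocument(response).CreateNavigator();
        return new[]
        {
            ValueOrNull(nav, "/Company/Contact/phone/LandLine"),
            ValueOrNull(nav, "/Company/Contact/phone/Fax")
        };
    }
}
```

Any other phone type can be looked up the same way by passing a different path to ValueOrNull; a path that does not exist in the document simply yields null instead of throwing.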

dtb · Accepted Answer · 2010-08-18 14:45:51Z

8

Use LINQ-to-XML:

// Requires: using System; using System.Xml.Linq;
var doc = XDocument.Parse(@"<Company>
    <Owner>Bob</Owner>
    <Contact>
        <address> -1 Infinite Loop </address>
        <phone>
            <LandLine>(000) 555-5555</LandLine>
            <Fax> (000) 555-5556 </Fax>
        </phone>
        <email> [email protected] </email>
    </Contact>
</Company>");

var phone = doc.Root.Element("Contact").Element("phone");

Console.WriteLine((string)phone.Element("LandLine"));
Console.WriteLine((string)phone.Element("Fax"));

Output:

(000) 555-5555
 (000) 555-5556

2 Comments

CaffGeek Over a year ago

Note that if Contact is missing, you'll get an exception on the var phone = ... line. I like to do things like var contactNode = doc.Root.Element("Contact") ?? new XElement("Contact"); so I always have a node returned, and then when I do var phone = contactNode.Element("phone") ?? new XElement("phone"); I won't get null reference errors. In the end, I just end up with blank values for the variables. Alternatively, use an XSD to validate the document prior to parsing, to ensure the nodes you want exist.

Dirk Vollmar Over a year ago

Note that the XDocument class also comes with the overhead of building up a DOM tree in memory; usually not what you need for read-only random access to nodes in the document, especially when you deal with large documents.
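Following up on the missing-node point in the comments above, one possible way to avoid the exception without creating placeholder elements is the null-conditional operator. This is only a sketch: it assumes a compiler that supports C# 6 or later, and PhoneLinq and GetPhoneValue are hypothetical names.

```csharp
using System.Xml.Linq;

public static class PhoneLinq
{
    // Hypothetical helper: trimmed text of /Company/Contact/phone/<name>,
    // or null when any element along that path is missing.
    public static string GetPhoneValue(XDocument doc, string name)
    {
        // Each ?. stops the chain and yields null instead of throwing.
        XElement element = doc.Root?.Element("Contact")?.Element("phone")?.Element(name);
        return element?.Value.Trim();
    }
}
```

With the sample document, GetPhoneValue(doc, "Fax") gives the fax number without the surrounding spaces, and asking for a name that is not present (for example "mobile") gives null rather than an exception.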

Jon Hanna · Accepted Answer · 2010-08-18 14:58:40Z

I don't think you're too far off. There are more convenient methods (lots of different approaches). Assuming you want to take the same basic approach as you do here (and it is an efficient if verbose one), I'd do:

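// Requires: using System.Xml;
bool inPhone = false;
string landLine = null;
string fax = null;

// The using block disposes of the reader, even when we return early.
using(XmlReader xml = XmlReader.Create(websiteResultStream, xmlSettings))
while(xml.Read())
{
  switch(xml.NodeType)
  {
    case XmlNodeType.Element:
      switch(xml.LocalName)
      {
        case "phone":
          inPhone = true;
          break;
        case "LandLine":
          if(inPhone)
          {
            // ReadElementContentAsString also moves the reader past the end tag,
            // so the next Read() skips whatever node follows. That is harmless
            // when whitespace separates the elements, as in the sample.
            landLine = xml.ReadElementContentAsString();
            if(fax != null)
            {
              DoWhatWeWantToDoWithTheseValues(landLine, fax);
              return;
            }
          }
          break;
        case "Fax":
          if(inPhone)
          {
            fax = xml.ReadElementContentAsString();
            if(landLine != null)
            {
              DoWhatWeWantToDoWithTheseValues(landLine, fax);
              return;
            }
          }
          break;
      }
      break;
    case XmlNodeType.EndElement:
      // Leaving the phone element: stop accepting LandLine/Fax.
      if(xml.LocalName == "phone")
        inPhone = false;
      break;
  }
}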

Note that this tracks whether the reader is "inside" a phone element, whereas your version would also pick up a LandLine inside any later element, which you seem to be trying to avoid.

Note also that we clean up the XmlReader with a using block, and that we return as soon as we have all the information we want.

Icemanind · Accepted Answer · 2010-08-18 14:46:10Z

1

The best way to do that is to use XPath. See this article for reference: http://support.microsoft.com/kb/308333

and this article for how to do it: http://www.codeproject.com/KB/cpp/myXPath.aspx
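As a rough illustration of the XPath route that also copes with extra phone types such as "mobile", a wildcard step can select every child of the phone element. This is a sketch under assumptions: PhoneNumbers and ReadAll are hypothetical names, and the path is taken from the sample document in the question.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.XPath;

public static class PhoneNumbers
{
    // Hypothetical helper: maps each child element name under
    // /Company/Contact/phone to its trimmed text content.
    public static Dictionary<string, string> ReadAll(TextReader xml)
    {
        var numbers = new Dictionary<string, string>();
        XPathNavigator nav = new XPathDocument(xml).CreateNavigator();

        // "*" matches every element child of phone, whatever it is called.
        XPathNodeIterator it = nav.Select("/Company/Contact/phone/*");
        while (it.MoveNext())
        {
            // If a name occurs more than once, the last occurrence wins.
            numbers[it.Current.LocalName] = it.Current.Value.Trim();
        }
        return numbers;
    }
}
```

The caller can then use TryGetValue with "LandLine" or "Fax" and ignore any keys it does not care about, so new phone types in the response do not require code changes.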
