0

I have the following XML document which I would like to parse into a DataSet.

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?> 
<Response Status="OK">
  <Item>
    <Field Name="ID">767147519</Field> 
    <Field Name="Name">Music</Field> 
    <Field Name="Path">Family\Music</Field> 
    <Field Name="Type">Playlist</Field> 
  </Item>
</Response>

I am wanting to get the attribute values for ID, Name, and Path.

The following is what I have attempted:

Dim loaded As XDocument = XDocument.Load(uriString)
Dim name = From c In loaded.Descendants("Item") Select c
For Each result In name
  Dim str1 = result.Attribute("ID").Value 'Returns Nothing and causes a validation error
  Dim str2 = result.Value ' Returns all the attribute values in one long string (ie "767147519MusicFamilyPlaylist")
Next

Any help would be greatly appreciated.

Thanks,

Matt

EDIT:

Following one of the answers below, I have been attempting to implement an anonymous type in my Linq, however I keep encountering the error

Object reference not set to an instance of an object.

My updated code is as follows:

Dim name = From c In loaded.Descendants("Item") Select c Select sID = c.Element("Field").Attribute("Name").Value, sName = c.Attribute("ID").Value.FirstOrDefault
Dim Id As String = String.Empty
For Each result In name
  Id = result.sID
Next

I think this error means that the attribute ("ID") cannot be located, so I have attempted several variations of this with similar results.

Is anyone able to identify where I am going wrong and point me in the right direction?

Thanks,

Matt

3
  • I have updated my question to show my attempt to implement an anonymous type. I am encountering an Object reference error (see updated question); any help in identifying the cause of this error would be greatly appreciated. Commented Jan 22, 2011 at 14:16
  • Also, added a new tag, the application is in VB.Net so examples in VB would be appreciated, but any help is great. Commented Jan 22, 2011 at 14:17
  • You say that you think this error means that the attribute "ID" cannot be located, but the problem is that there is no attribute "ID". There is an element called "Field" with an attribute called "Name". The Name attribute has a value of "ID". Your code is trying to do the wrong thing. You need to get the .Value of the Field element where .Attribute("Name").Value == "ID". Not the value of the "ID" attribute, because there's no such thing. Commented Jan 26, 2011 at 22:55

12 Answers

2

You can use XPath:

Dim data = From item In loaded.Descendants("Item")
           Select
             ID = item.XPathSelectElement("Field[@Name='ID']").Value,
             Name = item.XPathSelectElement("Field[@Name='Name']").Value,
             Path = item.XPathSelectElement("Field[@Name='Path']").Value,
             Type = item.XPathSelectElement("Field[@Name='Type']").Value

(Be sure to import the System.Xml.XPath namespace)

Or to add it directly to a DataTable:

Dim dt As New DataTable()
dt.Columns.Add("ID")
dt.Columns.Add("Name")
dt.Columns.Add("Path")
dt.Columns.Add("Type")
For Each item In loaded.Descendants("Item")
  dt.Rows.Add(
    item.XPathSelectElement("Field[@Name='ID']").Value,
    item.XPathSelectElement("Field[@Name='Name']").Value,
    item.XPathSelectElement("Field[@Name='Path']").Value,
    item.XPathSelectElement("Field[@Name='Type']").Value
  )
Next
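
One caveat with the snippets above: XPathSelectElement returns Nothing when no element matches, so calling .Value directly throws a NullReferenceException for any Item that lacks one of the fields. A minimal sketch of a more tolerant lookup follows; the module name FieldLookup and the helper name FieldValue are hypothetical and not part of the original answer.

```vbnet
Imports System.Xml.Linq
Imports System.Xml.XPath

Module FieldLookup
    ' Hypothetical helper: returns the text of the Field child whose Name
    ' attribute equals fieldName, or Nothing when the Item has no such Field.
    ' Assumes fieldName contains no apostrophe, since it is pasted into the XPath.
    Function FieldValue(ByVal item As XElement, ByVal fieldName As String) As String
        Dim field As XElement = item.XPathSelectElement("Field[@Name='" & fieldName & "']")
        Return If(field Is Nothing, Nothing, field.Value)
    End Function
End Module
```

With this in place, an expression such as item.XPathSelectElement("Field[@Name='Path']").Value could be written as FieldValue(item, "Path").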

1

Another solution, using anonymous types:

        var doc = XDocument.Load("c:\\test");

        var list = doc.Root
         .Elements("Item")
         .Select(item =>
          new
          {
              Id = item.Elements("Field").Where(e => e.Attribute("Name").Value == "ID").Select(e => e.Value).FirstOrDefault(),
              Path = item.Elements("Field").Where(e => e.Attribute("Name").Value == "Path").Select(e => e.Value).FirstOrDefault(),
              Name = item.Elements("Field").Where(e => e.Attribute("Name").Value == "Name").Select(e => e.Value).FirstOrDefault(),
          })
         .ToArray();

        foreach (var item in list)
        {
            var id = item.Id;
            var name = item.Name;
        }

The repetitive expressions inside the new operator can be shortened with the following helper lambda:

Func<XElement, string, string> getAttrValue = (node, attrName) =>
{
 return node.Elements("Field")
  .Where(e => e.Attribute("Name").Value == attrName)
  .Select(e => e.Value)
  .FirstOrDefault();
};

The new expression then looks like this:

 new 
 { 
  Id = getAttrValue(item, "ID"), 
  Path = getAttrValue(item, "Path"),
  Name = getAttrValue(item, "Name"),
 }


1

Here is my attempt at a solution to your problem. I noticed that you wish to use as much LINQ as possible, so I've structured my LINQ query accordingly. Please note that the result type (for "IDs") will be IEnumerable(Of String), i.e. you will need to run a For Each loop on it to get the individual IDs, even with a single item:

Dim loaded As XDocument = XDocument.Load(uriString)

Dim IDs = From items In loaded.Descendants("Item") _
         Let fields = items.Descendants("Field") _
         From field In fields _
         Where field.Attribute("Name").Value = "ID" _
         Select field.Value

On a side note: for future reference, if you run into the C# implicitly typed "var" in examples, the equivalent in VB is a plain Dim like in my query above (without the 'As Type' part, and with Option Infer On).

Hope this helps. Maverik


1

Use XPath and save everyone the headaches?

XmlDocument xml = new XmlDocument();
xml.Load(xmlSource);

string id = xml.SelectSingleNode("/Response/Item/Field[@Name='ID']").InnerText;
string name = xml.SelectSingleNode("/Response/Item/Field[@Name='Name']").InnerText;
string path = xml.SelectSingleNode("/Response/Item/Field[@Name='Path']").InnerText;


0

I am wanting to get the attribute values for ID, Name, and Path.

If you don't mind using something other than XDocument, I'd just use an XmlDocument:

        XmlDocument doc = new XmlDocument();
        doc.Load(new XmlTextReader("XData.xml"));
        XmlNodeList items = doc.GetElementsByTagName("Item");
        foreach (XmlElement item in items.Cast<XmlElement>())
        {
            XmlElement[] fields = item.GetElementsByTagName("Field").Cast<XmlElement>().ToArray();
            string id = (from s in fields where s.Attributes["Name"].InnerText == "ID" select s).First().InnerText;
            string name = (from s in fields where s.Attributes["Name"].InnerText == "Name" select s).First().InnerText;
            string path = (from s in fields where s.Attributes["Name"].InnerText == "Path" select s).First().InnerText;

            //Do stuff with data.
        }

Performance-wise this might be abysmal. You could also have a loop on the Fields and then use a switch on the Name-Attribute so you don't check the same field more than once. Why would you need any linq for this anyway?


        XmlDocument doc = new XmlDocument();
        doc.Load(new XmlTextReader("XData.xml"));
        XmlNodeList items = doc.GetElementsByTagName("Item");
        foreach (XmlElement item in items.Cast<XmlElement>())
        {
            foreach (XmlNode field in item.GetElementsByTagName("Field"))
            {
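                // Named fieldName so it does not clash with the "name" local declared in the "Name" case below.
                string fieldName = field.Attributes["Name"].InnerText;
                switch (fieldName)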
                {
                    case "ID":
                        string id = field.InnerText;
                        //Do stuff with data.
                        break;
                    case "Path":
                        string path = field.InnerText;
                        //Do stuff with data.
                        break;
                    case "Name":
                        string name = field.InnerText;
                        //Do stuff with data.
                        break;
                    default:
                        break;
                }
            }
        }
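
Since the asker requested VB.NET and LINQ, the same single-pass idea can be sketched with LINQ to XML by reading each Item's fields into a dictionary keyed by the Name attribute. This is only an illustration: the module name FieldDictionaryDemo and the inline copy of the XML are assumptions made to keep the sketch self-contained, and ToDictionary will throw if an Item contains two fields with the same Name.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Xml.Linq

Module FieldDictionaryDemo
    Sub Main()
        ' Inline copy of the question's document so the sketch is self-contained.
        Dim loaded As XDocument = XDocument.Parse(
            "<Response Status=""OK""><Item>" &
            "<Field Name=""ID"">767147519</Field>" &
            "<Field Name=""Name"">Music</Field>" &
            "<Field Name=""Path"">Family\Music</Field>" &
            "<Field Name=""Type"">Playlist</Field>" &
            "</Item></Response>")

        For Each item In loaded.Descendants("Item")
            ' One pass over the Field children: Name attribute -> element text.
            Dim fields As Dictionary(Of String, String) =
                item.Elements("Field").ToDictionary(
                    Function(f) f.Attribute("Name").Value,
                    Function(f) f.Value)

            ' TryGetValue avoids an exception when an Item lacks the field.
            Dim id As String = Nothing
            If fields.TryGetValue("ID", id) Then
                Console.WriteLine("ID = " & id)
            End If
        Next
    End Sub
End Module
```

Each field is read once per Item, and any later lookup by name is a dictionary access instead of another scan of the Field elements.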

2 Comments

You might want to use Linq because it is much more concise and readable ;-)
I would prefer to use Linq. I have been told that this is a nicer way of achieving what I am attempting to do, and I seem to be picking it up fairly easily, so I don't want to start learning another method. Thank you though.
0

Your linq query returns all the Item elements in the document:

Dim name = From c In loaded.Descendants("Item") Select c

The code that follows is trying to obtain an 'ID' attribute from the 'Item' element:

Dim str1 = result.Attribute("ID").Value

However, no element has an 'ID' attribute; 'ID' is the value of the 'Name' attribute on a 'Field' child element.

What you need is the following:

// find all the Item elements
var items = loaded.Descendants("Item");
foreach(var item in items)
{
  // find all the Field child elements
  var fields = item.Descendants("Field");

  // find the Field element whose Name attribute is "ID", and obtain the element value
  string id = fields.Where(field => (string)field.Attribute("Name") == "ID")
                    .Single()
                    .Value;

  // etc ...
}


0

A simple solution is:

        var result = doc.Root.Descendants(XName.Get("Item")).Select(x =>  x.Descendants(XName.Get("Field")));


        foreach (var v in result)
        {
            string id = v.Single(x => x.Attribute(XName.Get("Name")).Value == "ID").Value;

            string name = v.Single(x => x.Attribute(XName.Get("Name")).Value == "Name").Value;

            string path = v.Single(x => x.Attribute(XName.Get("Name")).Value == "Path").Value;

            string type = v.Single(x => x.Attribute(XName.Get("Name")).Value == "Type").Value;

        }

It can easily be converted to VB code.
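
As a rough guide to that conversion, here is a sketch of how the loop above might read in VB.NET. The module name SingleDemo and the method name ReadItems are hypothetical, and doc is assumed to be an already loaded XDocument. The main changes are that a C# lambda "x => expr" becomes "Function(x) expr" and the comparison operator == becomes =.

```vbnet
Imports System
Imports System.Linq
Imports System.Xml.Linq

Module SingleDemo
    ' doc is assumed to be the XDocument loaded from the question's XML.
    Sub ReadItems(ByVal doc As XDocument)
        ' One sequence of Field elements per Item.
        Dim result = doc.Root.Descendants("Item").Select(Function(x) x.Descendants("Field"))

        For Each v In result
            ' Single throws unless exactly one Field matches the predicate.
            Dim id As String = v.Single(Function(x) x.Attribute("Name").Value = "ID").Value
            Dim name As String = v.Single(Function(x) x.Attribute("Name").Value = "Name").Value
            Dim path As String = v.Single(Function(x) x.Attribute("Name").Value = "Path").Value
            Dim type As String = v.Single(Function(x) x.Attribute("Name").Value = "Type").Value

            Console.WriteLine(id & ", " & name & ", " & path & ", " & type)
        Next
    End Sub
End Module
```

XName.Get("Item") is replaced by the plain string "Item" here, which relies on the implicit conversion from String to XName.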

1 Comment

How is "string id = v.Single(x => x.Attribute(XName.Get("Name")).Value == "ID").Value;" converted to VB.NET? I haven't come across how to convert this type of C# coding.
0

Here is a generic solution that handles all fields with different field names in several items. It saves the result in one table containing all distinct field names as column names.

Module Module1

Function createRow(ByVal table As DataTable, ByVal item As XElement) As DataRow
    Dim row As DataRow = table.NewRow

    Dim fields = item.Descendants("Field")
    For Each field In fields
        row.SetField(field.Attribute("Name").Value, field.Value)
    Next

    Return row

End Function


Sub Main()
    Dim doc = XDocument.Load("XMLFile1.xml")

    Dim items = doc.Descendants("Item")

    Dim columnNames = From attr In items.Descendants("Field").Attributes("Name") Select attr.Value

    Dim columns = From name In columnNames.Distinct() Select New DataColumn(name)

    Dim dataSet As DataSet = New DataSet()
    Dim table As DataTable = New DataTable()
    dataSet.Tables.Add(table)

    table.Columns.AddRange(columns.ToArray())

    Dim rows = From item In items Select createRow(table, item)

    For Each row In rows
        table.Rows.Add(row)
    Next

    ' TODO Handle Table
End Sub

End Module

I tried to use as much Linq as possible, but Linq is a bit inflexible when it comes to handling nested elements recursively.

Here's the sample XML file I've used:

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<Response Status="OK">
  <Item>
    <Field Name="ID">767147519</Field>
    <Field Name="Name">Music</Field>
    <Field Name="Path">Family\Music</Field>
    <Field Name="Type">Playlist</Field>
  </Item>
  <Item>
    <Field Name="ID">123</Field>
    <Field Name="Name">ABC</Field>
    <Field Name="RandomFieldName">Other Value</Field>
    <Field Name="Type">FooBar</Field>
  </Item>
</Response>

And the result:

ID         Name     Path          Type        RandomFieldName

767147519  Music    Family\Music  Playlist

123        ABC                    FooBar      Other Value


0

After some further research and with the assistance of parts from the answers provided, I have come up with the following, which returns the information that I am after.

Dim Query = From items In loaded.Descendants("Item") _
            Let sID = (From q In items.Descendants("Field") _
                       Where q.Attribute("Name").Value = "ID") _
            Let sName = (From r In items.Descendants("Field") _
                         Where r.Attribute("Name").Value = "Name") _
            Let sPath = (From s In items.Descendants("Field") _
                         Where s.Attribute("Name").Value = "Path") _
            Where (CType(sPath.Value, String) Like "Family\*") _
            Select pId = sID.Value, pName = sName.Value, pPath = sPath.Value

If this can be improved in any way to enable better performance, please let me know.

Thank you all for your assistance. While no one answer was able to entirely solve the problem, I was able to learn a great deal about Linq through everyone's assistance.

Matt


0

I hope you expected something like this short answer and not another implementation:

Dim items = From c In loaded.Descendants("Item") Select c (...)

Ok so far nothing should run into any trouble. The variable name 'name' was a bit confusing, so I changed it to 'items'.

The second part contains the error:

Dim items = (...) Select sID = c.Element("Field").Attribute("Name").Value, sName = c.Attribute("ID").Value.FirstOrDefault

The following works because there is an attribute called Name, although the result is 'ID', which surely wasn't what was expected:

c.Element("Field").Attribute("Name").Value

Here comes the error:

c.Attribute("ID").Value.FirstOrDefault

c is the XElement '<Item> ... </Item>', and it does not have any attributes, so c.Attribute("ID") returns Nothing and reading .Value from it throws the exception.

I guess you wanted something like the following:

Dim loaded = XDocument.Load("XMLFile1.xml")
Dim items = From item In loaded.Descendants("Item") Select _
            sID = (From field In item.Descendants("Field") _
                   Where field.Attribute("Name").Value = "ID" _
                   Select field.Value).FirstOrDefault() _
            , _
            sName = (From field In item.Descendants("Field") _
                     Where field.Attribute("Name").Value = "Name" _
                     Select field.Value).FirstOrDefault()


0

There are a few errors in your code:

You should get the Descendants whose XName is Field instead of Item:

Dim name = From c In loaded.Descendants("Field") Select c

The attribute you are after is called Name, not ID

Dim str1 = result.Attribute("Name").Value

On the first iteration of your For Each loop, str1 will be "ID"; on the next one it will be "Name", and so on.

Total code:

Dim loaded As XDocument = XDocument.Load(uriString)
Dim name = From c In loaded.Descendants("Field") Select c
For Each result In name
  Dim str1 = result.Attribute("Name").Value 'Returns "ID"
  Dim str2 = result.Value ' Returns "767147519"
Next


-1

There's another way to fix this problem. Transform this XML into the format that the DataSet wants, and then load it using DataSet.ReadXml. This is something of a pain if you don't know XSLT. But it's really important to know XSLT if you work with XML.

The XSLT you'd need is pretty simple. Start with the XSLT identity transform. Then add a template that transforms the Response and Item elements into the format that the DataSet expects:

<xsl:template match="Response">
   <MyDataSetName>
      <xsl:apply-templates select="Item"/>
   </MyDataSetName>
</xsl:template>

<xsl:template match="Item">
   <MyDataTableName>
      <xsl:apply-templates select="Field[@Name='ID' or @Name='Name' or @Name='Path']"/>
   </MyDataTableName>
</xsl:template>

<xsl:template match="Field">
   <xsl:element name="{@Name}">
      <xsl:value-of select="."/>
   </xsl:element>
</xsl:template>

That will change your XML to a document that looks like this:

<MyDataSetName>
  <MyDataTableName>
    <ID>767147519</ID> 
    <Name>Music</Name> 
    <Path>Family\Music</Path> 
  </MyDataTableName>
</MyDataSetName>

...and you can just feed that to DataSet.ReadXml.

Edit:

I should point out, since it's not obvious unless you do this a lot, that one effect of this is that the amount of C# code that you need to create and populate the DataSet is minimal:

    private DataSet GetDataSet(string inputFilename, string transformFilename)
    {
        StringBuilder sb = new StringBuilder();
        using (XmlReader xr = XmlReader.Create(inputFilename))
        using (XmlWriter xw = XmlWriter.Create(new StringWriter(sb)))
        {
            XslCompiledTransform xslt = new XslCompiledTransform();
            xslt.Load(transformFilename);
            xslt.Transform(xr, xw);
        }
        using (StringReader sr = new StringReader(sb.ToString()))
        {
            DataSet ds = new DataSet();
            ds.ReadXml(sr);
            return ds;
        }
    }

It's also reusable. You can use this method to populate as many different DataSets from as many different possible input formats as you need; you just need to write a transform for each format.
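
Because the question asks for VB.NET, here is a sketch of how the same method might look in that language. It is a line-for-line translation of the C# above with the same behaviour intended; the module name TransformLoader is hypothetical.

```vbnet
Imports System.Data
Imports System.IO
Imports System.Text
Imports System.Xml
Imports System.Xml.Xsl

Module TransformLoader
    ' Runs the stylesheet over the input file and loads the transformed XML into a DataSet.
    Function GetDataSet(ByVal inputFilename As String, ByVal transformFilename As String) As DataSet
        Dim sb As New StringBuilder()

        ' Both reader and writer are disposed (and the writer flushed) at End Using,
        ' before the StringBuilder contents are read back.
        Using xr As XmlReader = XmlReader.Create(inputFilename),
              xw As XmlWriter = XmlWriter.Create(New StringWriter(sb))
            Dim xslt As New XslCompiledTransform()
            xslt.Load(transformFilename)
            xslt.Transform(xr, xw)
        End Using

        Using sr As New StringReader(sb.ToString())
            Dim ds As New DataSet()
            ds.ReadXml(sr)
            Return ds
        End Using
    End Function
End Module
```

A call such as GetDataSet("XData.xml", "transform.xslt") would then return the populated DataSet; both file names here are placeholders.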

6 Comments

Using XSLT to transform the document into something that can be read by a DataSet seems a bit OTT when you can achieve the same thing with a very simple bit of Linq, or XmlDocument!
If you're conversant with XSLT, as anyone who works with XML ought to be, the above is trivial. Also, the "very simple bit of Linq" provided in these answers just populates variables; it doesn't actually create DataRow objects, add them to the right DataTable, etc. I'm not saying my answer is therefore superior - I've omitted the code that executes the transform, for instance. But it does replace all of the ADO code with a single method call, and reduces the amount of C# code required to a minimal amount.
+1 I'd go for loading transformed Xml too. The Xslt could be even simpler.
Just <xsl:template match="Field[@Name]"> and you can reuse your xslt for any <Field /> your source Xml may contain.
@Colin This solution is not over-engineered. Handling your Xml this way you won't have to change any code if field names change in your Xml. And you don't have to write any code to add the table and rows to your DataSet.