1

I have an XML file like the following.

<div class="time">
   <span class="title">Bla: </span>
   <span class="value">Thu 20 Jan 11</span>
</div>

How can I get the value "Thu 20 Jan 11" with C#? Thanks in advance.

2
  • 1
    Are those the names of the XML nodes, <div> and <span>? Commented Jan 20, 2011 at 12:47
  • 1
    This sounds more like HTML to me... Commented Jan 20, 2011 at 12:55

8 Answers

1

It sounds like you need an HTML parser instead, IMHO. If so, take a look at Html Agility Pack.

1 Comment

The problem is that I am trying to parse an Atom RSS feed which has nested HTML code like the above :(
1

Given that you do have an XML file like you say, you could load the file into an XmlDocument and find what you want using XPath:

using System;
using System.Xml;

class Program
{
    static void Main(string[] args)
    {
        var xml = "<div class=\"time\">" +
                    "<span class=\"title\">Bla: </span>" +
                    "<span class=\"value\">Thu 20 Jan 11</span>" +
                    "</div>";
        var document = new XmlDocument();

        try
        {
            document.LoadXml(xml);
        }
        catch (XmlException)
        {
            // Handle and/or re-throw
            throw;
        }

        // SelectSingleNode returns null when nothing matches the XPath.
        var node = document.SelectSingleNode("//span[@class = 'value']");
        var date = node?.InnerText;

        Console.WriteLine(date);

        Console.ReadKey();
    }
}

Output: Thu 20 Jan 11

2 Comments

Or, if you need to be more specific and require the span to be inside a div with that specific class, use the XPath: //div[@class='time']/span[@class='value']
Hi, I've tried your code and it works perfectly as it is, but when I try to load an Atom RSS feed which has nested HTML, it throws "Data at the root level is invalid. Line 1, position 1."
0

I wrote a little snippet that does it for you:

// Requires: using System; using System.Linq; using System.Xml.Linq;
public void Test(String source)
{
    XElement elem = XElement.Parse(source);

    // Value of the last <span> descendant, or null if there is none.
    var query = (from x in elem.Descendants("span") select x.Value).LastOrDefault();

    Console.WriteLine(query);
}

2 Comments

When I ran your code, it gave me a "Data at the root level is invalid. Line 1, position 1." error.
Tested in LinqPad with query.Dump() and it works just fine. Don't forget to escape the '"' characters in the source string, like this: "<div class=\"time\">...
0

Using XPath queries may be an elegant solution too. See this knowledge base article for a brief how-to: http://support.microsoft.com/kb/308333

This of course requires the document to be strictly correct XML, which XHTML is. Unfortunately HTML input often contains syntax errors...

Cheers, Matthias

0

As said, you could parse it as HTML.

However, treating it as an XML document, you can read the value from the node by using the XPath: /div/span[@class="value"]

You can also use XDocument to select a node value from a known XPath or by searching through descendant nodes. With LINQ it becomes very easy to match on an attribute value. Link here
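The two XDocument approaches mentioned above could look roughly like the following. This is only a sketch: the class and method names (SpanValueReader, ByXPath, ByDescendants) are made up for illustration, and it assumes the input is the small well-formed fragment from the question, with no XML namespaces involved.

```csharp
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath; // provides the XPathSelectElement extension method

// Hypothetical helper, not part of any library.
public static class SpanValueReader
{
    // Select the node through a known XPath.
    public static string ByXPath(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        XElement span = doc.XPathSelectElement("/div/span[@class='value']");
        return span?.Value; // null when the XPath matches nothing
    }

    // Search the descendants and match on the attribute value with LINQ.
    public static string ByDescendants(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.Descendants("span")
                  .Where(e => (string)e.Attribute("class") == "value")
                  .Select(e => e.Value)
                  .FirstOrDefault(); // null when no span has class="value"
    }
}
```

For the fragment in the question, both methods should return "Thu 20 Jan 11". Both return null rather than throwing when no matching span exists.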

0

This is the same as sgrassie's answer, but using LINQ to XML. I prefer this code, but it is up to you.

// Requires: using System; using System.IO; using System.Linq; using System.Xml.Linq;
string xml = "<div class=\"time\"><span class=\"title\">Bla: </span><span class=\"value\">Thu 20 Jan 11</span></div>";
StringReader sr = new StringReader(xml);
XDocument xdoc = XDocument.Load(sr);
// FirstOrDefault returns null when no span has class="value".
var date = xdoc.Element("div").Elements("span").FirstOrDefault(m => (string)m.Attribute("class") == "value");
Console.WriteLine(date?.Value);
Console.ReadLine();

1 Comment

<feed xmlns="w3.org/2005/Atom"> <updated>2011-01-20T08:33:23Z</updated> <title type="html">grgrgr</title> <entry> <title type="html">Blog post : Estiatoria</title> <content type="xhtml"> <div xmlns="w3.org/1999/xhtml"> <div class="due"> <span class="title">Due:</span> <span class="value">20 Jan 11</span> </div> </div> </content> </entry> </feed>
0

Below is the code in VTD-XML:

    VTDGen vg = new VTDGen();
    System.Text.Encoding eg = System.Text.Encoding.GetEncoding("UTF-8");
    String XML = "<div class=\"time\">" +
                 "<span class=\"title\">Bla: </span>" +
                 "<span class=\"value\">Thu 20 Jan 11</span>" +
                 "</div>";
    vg.setDoc(eg.GetBytes(XML));
    vg.parse(true);
    VTDNav vn = vg.getNav();
    AutoPilot ap = new AutoPilot(vn);
    ap.selectXPath("/div/span[@class='value']/text()");
    int i = ap.evalXPath();
    if (i != -1)
        Console.WriteLine(vn.toString(i));

1 Comment

<feed xmlns="w3.org/2005/Atom"> <updated>2011-01-20T08:33:23Z</updated> <title type="html">grgrgr</title> <entry> <title type="html">Blog post : Estiatoria</title> <content type="xhtml"> <div xmlns="w3.org/1999/xhtml"> <div class="due"> <span class="title">Due:</span> <span class="value">20 Jan 11</span> </div> </div> </content> </entry> </feed>
0

OK guys, I am posting a fragment of the feed. The problem is that when I use the XPath //@* I get the whole list correctly. I also tried //@class and it returned all the class values, which is fine. But when I use //span[@class='value'] I get an empty list. I have tried several variations, and it seems that whenever I match an element by an attribute value, as in //title[@type='html'], I get an empty list.

<feed xmlns="w3.org/2005/Atom">
  <updated>2011-01-20T08:33:23Z</updated>
  <title type="html">grgrgr</title>
  <entry>
    <title type="html">Blog post : Estiatoria</title>
    <content type="xhtml">
      <div xmlns="w3.org/1999/xhtml">
        <div class="due">
          <span class="title">Due:</span>
          <span class="value">20 Jan 11</span>
        </div>
      </div>
    </content>
  </entry>
</feed>
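One likely explanation for the empty results is the default namespaces declared on <feed> and on the inner <div>. In XPath 1.0 an unprefixed element name such as span or title only matches elements in no namespace, while attributes such as class are not affected by a default namespace, which would explain why //@class works and //span[...] does not. A minimal sketch of a namespace-aware query follows. It assumes the real feed uses the standard Atom and XHTML namespace URIs (http://www.w3.org/2005/Atom and http://www.w3.org/1999/xhtml); the class name AtomValueReader, the method SelectText and the prefixes atom and x are made up for illustration.

```csharp
using System.Xml;

// Hypothetical helper, not part of any library.
public static class AtomValueReader
{
    // Assumed namespace URIs; adjust them if the feed declares different ones.
    private const string AtomNs = "http://www.w3.org/2005/Atom";
    private const string XhtmlNs = "http://www.w3.org/1999/xhtml";

    // Returns the inner text of the first node matching the XPath, or null.
    public static string SelectText(string feedXml, string xpath)
    {
        var document = new XmlDocument();
        document.LoadXml(feedXml);

        // Map arbitrary prefixes to the namespaces used in the feed.
        // The prefixes exist only in the XPath, not in the document.
        var ns = new XmlNamespaceManager(document.NameTable);
        ns.AddNamespace("atom", AtomNs);
        ns.AddNamespace("x", XhtmlNs);

        XmlNode node = document.SelectSingleNode(xpath, ns);
        return node?.InnerText;
    }
}
```

With a feed shaped like the fragment above, SelectText(feed, "//x:span[@class='value']") should return "20 Jan 11", and SelectText(feed, "//atom:title[@type='html']") should return the feed title. The unprefixed //span[@class='value'] still returns null against the same document, which matches the behaviour described.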
