0

I am trying to read data from an XML document into an array in C# using LINQ. Some of the fields require nested queries over the XML elements. The data looks like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>    
    <Catalog>
     <Book ISBN="1.1.1.1" Genre="Thriller">
      <Title  Side="1">
       <Pty R="1" ID="Seller_ID">
         <Sub Typ="26" ID="John">
         </Sub>
       </Pty>
       <Pty R="2" ID="ABC">
       </Pty>
        </Title>
    </Book>
    <Book ISBN="1.2.1.1" Genre="Thriller">
      <Title  Side="2">
       <Pty R="1" ID="Buyer_ID">
         <Sub Typ="26" ID="Brook">
         </Sub>
       </Pty>
       <Pty R="2" ID="XYZ">
       </Pty>
        </Title>
    </Book>
    </Catalog>

In the above XML document, Side="1" represents the sell side and Side="2" represents the buy side. I want to store the elements and attributes above in an array that has the following fields:

Array: ISBN, Genre, PublishDate, Buyer_Company, Seller_Company, Buyer_Broker, Seller_Broker

I was able to retrieve the plain elements and attributes, but I am not sure how to handle the fields that depend on other elements. Buyer_Company, Seller_Company, Buyer_Broker and Seller_Broker all depend on the Side, Pty and Sub elements. For example, Buyer_Company is the ID attribute of the Pty element with R=2 under a Title with Side=2. Similarly, Buyer_Broker is the ID attribute of the Sub element with Typ=26 (other XML data can have different Typ values), where that Sub is a child of the Pty element with R=1, which in turn sits under a Book whose Title has Side=2.

Code I used to retrieve independent elements is

var result = doc.Descendants("Book")
        .Select(b => new
        {
            ISBN= b.Attribute("ISBN").Value,
            Genre=b.Attribute("Genre").Value,
            PublishDate= b.Element("Title").Attribute("MMY").Value,        

        })
        .ToArray();

And I worked on querying within a single element as follows

  Company= (string)b.Descendants("Pty")
                             .Where(e => (int)e.Attribute("R") == 7)
                             .Select(e => e.Attribute("ID"))
                             .Single()

But this does not take into account the Side attribute of the Title element inside each Book.

Sample Data

First Book Element

ISBN:1.1.1.1
Genre:Thriller
Seller_Company:ABC
Seller_Broker:John
Buyer_Company:NULL
Buyer_Broker:NULL

Second Book Element

ISBN:1.2.1.1
Genre:Thriller
Seller_Company:NULL
Seller_Broker:NULL
Buyer_Company:XYZ
Buyer_Broker:Brook

Side=1 represents the seller side and Side=2 represents the buyer side, which is why the buyer fields are NULL in the first element of the resulting array and the seller fields are NULL in the second.

May I know a better way to solve this?

5 Answers

2

You can use the Parent property to get the parent element of Pty, then read its Side attribute and check it:

.Where(e => (int)e.Attribute("R") == 7 && 
            (int)e.Parent.Attribute("Side") == 2)
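The Parent check above can be sketched as a complete query that also yields null for the side that does not match. This is a minimal sketch, not the answerer's code: the helper name CompanyFor is hypothetical, the XML is a trimmed copy of the question's sample, and R=2 is assumed to mark the company as described in the question. It relies on the fact that casting a null XAttribute to string returns null, so FirstOrDefault can be used instead of Single.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Trimmed copy of the question's sample XML.
var doc = XDocument.Parse(@"<Catalog>
  <Book ISBN=""1.1.1.1"" Genre=""Thriller"">
    <Title Side=""1"">
      <Pty R=""1"" ID=""Seller_ID""><Sub Typ=""26"" ID=""John"" /></Pty>
      <Pty R=""2"" ID=""ABC"" />
    </Title>
  </Book>
  <Book ISBN=""1.2.1.1"" Genre=""Thriller"">
    <Title Side=""2"">
      <Pty R=""1"" ID=""Buyer_ID""><Sub Typ=""26"" ID=""Brook"" /></Pty>
      <Pty R=""2"" ID=""XYZ"" />
    </Title>
  </Book>
</Catalog>");

// Hypothetical helper: ID of the Pty with the given R whose parent Title
// has the given Side. Returns null when no Pty matches.
// Assumes every Pty has an R attribute and every Title has a Side attribute;
// the (int) casts throw if either attribute is missing.
string CompanyFor(XElement book, int side, int r) =>
    (string)book.Descendants("Pty")
        .Where(e => (int)e.Attribute("R") == r &&
                    (int)e.Parent.Attribute("Side") == side)
        .Select(e => e.Attribute("ID"))
        .FirstOrDefault();

var rows = doc.Descendants("Book")
    .Select(b => new
    {
        ISBN = (string)b.Attribute("ISBN"),
        Seller_Company = CompanyFor(b, 1, 2),
        Buyer_Company = CompanyFor(b, 2, 2)
    })
    .ToArray();

foreach (var row in rows)
    Console.WriteLine("{0}: seller={1}, buyer={2}",
        row.ISBN, row.Seller_Company ?? "NULL", row.Buyer_Company ?? "NULL");
```

With the sample data, the first row gets a seller company and a null buyer company, and the second row the reverse.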

1 Comment

Thanks, but when I store them in an array, for each of Side=1 and Side=2 either Buyer_Company or Seller_Company will match no elements, and in that case I just want to store a NULL value.
2

Edited to match the question:

Using XPath:

private static string GetCompanyValue(XElement bookElement, string side, string r)
{
  string format = "Title[@Side={0}]/Pty[@R={1}]";
  return GetValueByXPath(bookElement, string.Format(format, side, r));
}

private static string GetBrokerValue(XElement bookElement, string side)
{
  string format = "Title[@Side={0}]/Pty[@R=1]/Sub[@Typ=26]";
  return GetValueByXPath(bookElement, string.Format(format, side));
}

private static string GetValueByXPath(XElement bookElement, string expression)
{
  XElement element = bookElement.XPathSelectElement(expression);
  return element != null ? element.Attribute("ID").Value : null;
}

And the calling code looks as below.

var result = doc.Descendants("Book")                            
                .Select(book => new
                {
                   ISBN = book.Attribute("ISBN").Value,
                   Genre = book.Attribute("Genre").Value,
                   Buyer_Company = GetCompanyValue(book, "2", "2"),
                   Seller_Company = GetCompanyValue(book, "1", "2"),
                   Buyer_Broker = GetBrokerValue(book, "2"),
                   Seller_Broker = GetBrokerValue(book, "1")
                })
                .ToArray();

Add a using directive for the XPath extension methods: using System.Xml.XPath;
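For reference, the XPath approach can be run end to end against the question's sample data. The following is a minimal sketch under a few assumptions: the helper name IdAt is hypothetical, the XML is a trimmed copy of the sample, and attribute values are compared as quoted strings rather than numbers. A missing match produces null instead of an exception, because XPathSelectElement returns null when nothing is selected.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

// Trimmed copy of the question's sample XML.
var doc = XDocument.Parse(@"<Catalog>
  <Book ISBN=""1.1.1.1"" Genre=""Thriller"">
    <Title Side=""1"">
      <Pty R=""1"" ID=""Seller_ID""><Sub Typ=""26"" ID=""John"" /></Pty>
      <Pty R=""2"" ID=""ABC"" />
    </Title>
  </Book>
  <Book ISBN=""1.2.1.1"" Genre=""Thriller"">
    <Title Side=""2"">
      <Pty R=""1"" ID=""Buyer_ID""><Sub Typ=""26"" ID=""Brook"" /></Pty>
      <Pty R=""2"" ID=""XYZ"" />
    </Title>
  </Book>
</Catalog>");

// Hypothetical helper: evaluates an XPath relative to the Book element,
// restricted to the Title with the given Side, and returns the ID attribute
// of the selected element, or null when nothing is selected.
string IdAt(XElement book, string side, string rest) =>
    (string)book.XPathSelectElement($"Title[@Side='{side}']/{rest}")?.Attribute("ID");

var rows = doc.Descendants("Book")
    .Select(b => new
    {
        ISBN = (string)b.Attribute("ISBN"),
        Genre = (string)b.Attribute("Genre"),
        Seller_Company = IdAt(b, "1", "Pty[@R='2']"),
        Seller_Broker = IdAt(b, "1", "Pty[@R='1']/Sub[@Typ='26']"),
        Buyer_Company = IdAt(b, "2", "Pty[@R='2']"),
        Buyer_Broker = IdAt(b, "2", "Pty[@R='1']/Sub[@Typ='26']")
    })
    .ToArray();

foreach (var r in rows)
    Console.WriteLine("{0} {1} {2} {3} {4} {5}", r.ISBN, r.Genre,
        r.Seller_Company ?? "NULL", r.Seller_Broker ?? "NULL",
        r.Buyer_Company ?? "NULL", r.Buyer_Broker ?? "NULL");
```

Keeping the Side predicate in one place means each field differs only in the tail of the path, which makes it easy to add further Typ values later.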

4 Comments

Good idea on Xpath. The thought crossed my mind while I was stringing together those .Element()'s!
@Sarathy Well, using group by and First is returning only one array element whereas I want to parse the entire XML. Please take a look at the edited question for different ISBN
@Sarathy Please remember that I would like to parse the entire XML, not just the first Book of each ISBN.
Hi @Dev, I have edited the answer. Sorry for the late response.
1

Now that you've provided some examples, I think this will work for you.

const string xml =
    @"<?xml version=""1.0"" encoding=""UTF-8"" standalone=""yes""?>    
    <Catalog>
    <Book ISBN=""1.1.1.1"" Genre=""Thriller"">
    <Title  Side=""1"">
    <Pty R=""1"" ID=""Seller_ID"">
        <Sub Typ=""26"" ID=""John"">
        </Sub>
    </Pty>
    <Pty R=""2"" ID=""ABC"">
    </Pty>
        </Title>
    </Book>
    <Book ISBN=""1.2.1.1"" Genre=""Thriller"">
    <Title  Side=""2"">
    <Pty R=""1"" ID=""Buyer_ID"">
        <Sub Typ=""26"" ID=""Brook"">
        </Sub>
    </Pty>
    <Pty R=""2"" ID=""XYZ"">
    </Pty>
        </Title>
    </Book>
    </Catalog>";
var doc = XDocument.Parse(xml);

var results = new List<object>();
foreach (var book in doc.Descendants("Book")) {
    var title = book.Element("Title");
    string buyerCompany = null;
    string buyerBroker = null;
    string sellerCompany = null;
    string sellerBroker = null;
    if (title.Attribute("Side").Value == "1") {
        sellerCompany = title.Elements("Pty")
            .Where(pty => pty.Attribute("R").Value == "2")
            .Select(pty => pty.Attribute("ID").Value)
            .FirstOrDefault();
        sellerBroker = title.Elements("Pty")
            .Where(pty => pty.Attribute("R").Value == "1")
            // Null-conditional access (C# 6+) yields null when a Pty has no Sub child.
            .Select(pty => pty.Element("Sub")?.Attribute("ID")?.Value)
            .FirstOrDefault();
    } else if (title.Attribute("Side").Value == "2") {
        buyerCompany = title.Elements("Pty")
            .Where(pty => pty.Attribute("R").Value == "2")
            .Select(pty => pty.Attribute("ID").Value)
            .FirstOrDefault();
        buyerBroker = title.Elements("Pty")
            .Where(pty => pty.Attribute("R").Value == "1")
            // Null-conditional access (C# 6+) yields null when a Pty has no Sub child.
            .Select(pty => pty.Element("Sub")?.Attribute("ID")?.Value)
            .FirstOrDefault();
    }

    var result = new {
        ISBN = book.Attribute("ISBN").Value,
        Genre = book.Attribute("Genre").Value,
        Seller_Company = sellerCompany,
        Seller_Broker = sellerBroker,
        Buyer_Company = buyerCompany,
        Buyer_Broker = buyerBroker,
    };

    results.Add(result);
}


0

I think maybe you want to group by ISBN and then selectively get values from the children.

const string xml = 
    @"<?xml version=""1.0"" encoding=""UTF-8"" standalone=""yes""?>    
    <Catalog>
        <Book ISBN=""1.1.1.1"" Genre=""Thriller"">
            <Title  Side=""1"" MMY=""000"">
                <Pty R=""1"" ID=""Seller_ID"">
                    <Sub Typ=""26"" ID=""Seller_Broker"">
                    </Sub>
                </Pty>
                <Pty R=""2"" ID=""Seller_Company"">
                </Pty>
            </Title>
        </Book>
        <Book ISBN=""1.1.1.1"" Genre=""Thriller"">
            <Title  Side=""2"">
                <Pty R=""1"" ID=""Buyer_ID"">
                    <Sub Typ=""26"" ID=""Buyer_Broker"">
                    </Sub>
                </Pty>
                <Pty R=""2"" ID=""Buyer_Company"">
                </Pty>
            </Title>
        </Book>
    </Catalog>";
var doc = XDocument.Parse(xml);
var results = doc.Descendants("Book")
    .GroupBy(x => x.Attribute("ISBN").Value)
    .Select(x => new {
        ISBN = x.Key,
        Genre = x.First().Attribute("Genre").Value,
        PublishDate = x.First().Element("Title").Attribute("MMY").Value,
        BuyerId = x.Where(book => book.Element("Title").Attribute("Side").Value == "2")
            .First()
            .Element("Title")
            .Element("Pty")
            .Attribute("ID").Value
    })
    .ToArray();

Result:

{
    ISBN = "1.1.1.1",
    Genre = "Thriller",
    PublishDate = "000",
    BuyerId = "Buyer_ID"
}

7 Comments

Thanks but I want to capture both Buyer_Company, Seller_Company and Buyer_Broker, Seller_Broker for each <Book> element making one of them NULL
This sample shows you how to set a property using Pty elements across Book elements, filtered by the Side attribute. If you want to go deeper, you can just continue, i.e., add .Element("Sub") after .Element("Pty") to access the Sub node.
Also, it might be helpful if you could post an example of the properties and expected values from the sample XML you've provided. I only went to Buyer_ID because it wasn't clear to me what you were expecting.
I have edited my question and added my expected results.
And as you said, I can continue deeper, but that particular element may or may not exist in all the nodes.
0

Try this for complete parsing

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;

namespace ConsoleApplication1
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.xml";
        static void Main(string[] args)
        {
            XDocument doc = XDocument.Load(FILENAME);
            var result = doc.Descendants("Book")
                .Select(b => new
                {
                    ISBN = b.Attribute("ISBN").Value,
                    Genre = b.Attribute("Genre").Value,
                    Side = b.Element("Title").Attribute("Side").Value,
                    ptr = b.Element("Title").Elements("Pty").Select(x => new {
                        R = x.Attribute("R").Value,
                        PtyID = x.Attribute("ID").Value,
                        Typ = x.Elements("Sub").Select(y => y == null ? null : y.Attribute("Typ").Value).FirstOrDefault(),
                        SubIDTyp = x.Elements("Sub").Select(y => y == null ? null : y.Attribute("ID").Value).FirstOrDefault()
                    }).ToList()
                })
                .ToList();
        }
    }
}
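The nested projection above still has to be flattened into the seller and buyer fields the question asks for. One way to do that is sketched below, assuming (as described in the question) that the company is the Pty with R=2, the broker is the Sub with Typ=26 under the Pty with R=1, and Side=1 means seller. The variable names are illustrative, and the XML is parsed from a trimmed inline copy of the sample rather than loaded from a file.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Trimmed copy of the question's sample XML.
var doc = XDocument.Parse(@"<Catalog>
  <Book ISBN=""1.1.1.1"" Genre=""Thriller"">
    <Title Side=""1"">
      <Pty R=""1"" ID=""Seller_ID""><Sub Typ=""26"" ID=""John"" /></Pty>
      <Pty R=""2"" ID=""ABC"" />
    </Title>
  </Book>
  <Book ISBN=""1.2.1.1"" Genre=""Thriller"">
    <Title Side=""2"">
      <Pty R=""1"" ID=""Buyer_ID""><Sub Typ=""26"" ID=""Brook"" /></Pty>
      <Pty R=""2"" ID=""XYZ"" />
    </Title>
  </Book>
</Catalog>");

var flat = doc.Descendants("Book")
    .Select(b =>
    {
        var title = b.Element("Title");
        var side = (string)title?.Attribute("Side");

        // Company: ID of the Pty with R=2 (null if there is none).
        var company = (string)title?.Elements("Pty")
            .FirstOrDefault(p => (string)p.Attribute("R") == "2")
            ?.Attribute("ID");

        // Broker: ID of the Sub with Typ=26 under the Pty with R=1 (null if missing).
        var broker = (string)title?.Elements("Pty")
            .FirstOrDefault(p => (string)p.Attribute("R") == "1")
            ?.Elements("Sub")
            .FirstOrDefault(s => (string)s.Attribute("Typ") == "26")
            ?.Attribute("ID");

        // Assign the values to the seller or buyer fields depending on Side.
        return new
        {
            ISBN = (string)b.Attribute("ISBN"),
            Genre = (string)b.Attribute("Genre"),
            Seller_Company = side == "1" ? company : null,
            Seller_Broker = side == "1" ? broker : null,
            Buyer_Company = side == "2" ? company : null,
            Buyer_Broker = side == "2" ? broker : null
        };
    })
    .ToArray();

foreach (var f in flat)
    Console.WriteLine("{0} {1} {2} {3} {4} {5}", f.ISBN, f.Genre,
        f.Seller_Company ?? "NULL", f.Seller_Broker ?? "NULL",
        f.Buyer_Company ?? "NULL", f.Buyer_Broker ?? "NULL");
```

Because the company and broker are looked up once per Book and then routed by Side, a Book with a missing Pty or Sub simply produces null in the corresponding field.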