I have a set of files that are structured like XML (parent-child nodes) but are not conventional XML files. The structure looks like this:

<_ML_Message>
  <TransactionId Value="0x02" />
  <GroupNo Value = "2" />
  <AbortOnError Value = "255" />
  <MessageBody>
   <GetProcParameterRequest>
     <ServerId Value="0xFFFFFFFFFFFF" />
     <ParameterTreePath Qty = "1" >
        <_OctetString Value="0x0000800029FF" />
     </ParameterTreePath>
   </GetProcParameterRequest>
  </MessageBody>
  <CRC16 Value = "0" />
  <EndOfMlMessage />
</_ML_Message>

<_ML_Message>
  <TransactionId Value="0x03" />
  <GroupNo Value = "3" />
  <AbortOnError Value = "255" />
  <MessageBody>
    <CloseRequest>
    </CloseRequest>
  </MessageBody>
  <CRC16 Value = "0" />
  <EndOfMlMessage />
</_ML_Message>

Since I cannot use the standard C# XML libraries (for example, XmlDocument) on this file, I am trying to read it as a plain text file and parse it manually:

string baseDirectory = AppDomain.CurrentDomain.BaseDirectory;
string xml = File.ReadAllText(baseDirectory + "MyXMLFile.xml");
if (xml.StartsWith("TransactionId"))
{
  //Try to get the value
}

But parsing it this way is cumbersome. I was wondering whether there are alternative ways to parse this kind of file.

  • Why can't you use standard XML libraries? What stops you from inserting the <!DOCTYPE...> you need to make it a valid XML file? Or is it invalid in other ways (doesn't follow the XML specification for comments, CDATA, quotes, etc.)? Commented Nov 22, 2016 at 14:36
  • Regex is the way to go... But unless the restriction is serious, you should definitely use a parser library. Commented Nov 22, 2016 at 14:37
  • And if it's the multiple root nodes, just wrap it all in a <dummy></dummy>; the stuff you posted would then parse fine. With XElement.Parse() you don't need a doctype. Commented Nov 22, 2016 at 14:38
  • @GeorgeStocker, the file he shared has multiple root nodes, as AlexK has noted. Commented Nov 22, 2016 at 14:39
  • @adv12 Yes, I know; my comment was basically asking: if you add the things necessary to make it a valid XML file, are there other reasons why it still wouldn't work (for example, it doesn't follow the XML specification regarding comments/data/quotes)? Commented Nov 22, 2016 at 14:45

3 Answers

If I understood you correctly, a possible solution is to add a fake root element and parse the new document with XmlDocument.

<root>
    <_ML_Message>
     ...
     </_ML_Message>
     <_ML_Message>
     ...
    </_ML_Message>
</root>
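
A minimal sketch of this approach, assuming the file content has already been read into a string and does not start with an XML declaration. The class and method names (`WrappedXmlExample`, `ReadTransactionIds`) are hypothetical; only the `<root>` wrapper and the element names come from the snippets above.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class WrappedXmlExample
{
    // Wraps the root-less fragments in a fake <root> element and returns
    // the Value attribute of every TransactionId element, in file order.
    public static List<string> ReadTransactionIds(string fragments)
    {
        var doc = new XmlDocument();
        doc.LoadXml("<root>" + fragments + "</root>");

        var ids = new List<string>();
        // XPath: each TransactionId that is a direct child of a message.
        foreach (XmlElement node in doc.SelectNodes("/root/_ML_Message/TransactionId"))
        {
            ids.Add(node.GetAttribute("Value"));
        }
        return ids;
    }
}
```

It could be called as `WrappedXmlExample.ReadTransactionIds(File.ReadAllText("MyXMLFile.xml"))`.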

If you have a file which contains a series of valid XML elements but no root element, wrap the file with a root element. You can then use the normal XML libraries to parse it.

Alternatively, break the stream up on the message boundaries which appear to be blank lines and parse each chunk. Either of these will be less work than trying to parse the elements yourself.
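
The second option might look roughly like the sketch below. It assumes that messages are separated by at least one blank line and that no blank line occurs inside a message, which holds for the sample in the question but is not guaranteed. `MessageSplitter` and `ParseMessages` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class MessageSplitter
{
    // Splits the text on blank lines and parses each chunk as its own element.
    public static List<XElement> ParseMessages(string text)
    {
        var messages = new List<XElement>();
        // Two or more consecutive line breaks (with optional spaces or tabs
        // between them) mark a message boundary. Assumption: a message never
        // contains a blank line itself.
        foreach (string chunk in Regex.Split(text, @"(?:\r?\n[ \t]*){2,}"))
        {
            if (chunk.Trim().Length == 0)
            {
                continue; // skip empty pieces at the start or end of the file
            }
            messages.Add(XElement.Parse(chunk));
        }
        return messages;
    }
}
```

Each returned `XElement` is one `_ML_Message`, so a malformed message fails on its own chunk instead of invalidating the whole file.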


You can try this. Note that it reads only the first TransactionId; if you want all of them, you need to iterate over every match.

        // Requires: using System.IO; using System.Linq; using System.Xml.Linq;
        // Wrap the fragments in a dummy root so the text parses as one document.
        string xml = "<doc>" + File.ReadAllText("test.txt") + "</doc>";
        XElement el = XElement.Parse(xml);

        // Value of the first TransactionId in the file, or null if there is none.
        string transactionId = (string)el.Descendants("TransactionId")
                                         .FirstOrDefault()?.Attribute("Value");

3 Comments

Thanks for your answer! Could you also tell me how to get the values from the other child nodes? For example, the _OctetString Value inside ParameterTreePath.
Yes, here is an example:
        string transactionId;
        string rootStart = "<doc>";
        string rootEnd = "</doc>";
        string xml = rootStart + File.ReadAllText("test.txt") + rootEnd;
        XElement el = XElement.Parse(xml);
        var isExist = el.Descendants("TransactionId").Any();
        if (isExist)
        {
           transactionId = el.Descendants("TransactionId").FirstOrDefault().FirstAttribute.Value;
        }
        var octetString = el.Descendants("_OctetString").FirstOrDefault().FirstAttribute.Value;
@agenthost Or just add this line to the example above: var octetString = el.Descendants("_OctetString").FirstOrDefault().FirstAttribute.Value;
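
To read the values of every message rather than only the first match, one possible sketch is shown below. It assumes the same `<doc>` wrapper as the answer and a C# 7 or later compiler for the tuple syntax; `MessageValues` and `Read` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class MessageValues
{
    // Returns, for each _ML_Message, its TransactionId value and the
    // Value attributes of all _OctetString elements nested inside it.
    public static List<(string TransactionId, List<string> OctetStrings)> Read(string fragments)
    {
        XElement root = XElement.Parse("<doc>" + fragments + "</doc>");
        return root.Elements("_ML_Message")
            .Select(m => (
                // Null if the message has no TransactionId element.
                TransactionId: (string)m.Element("TransactionId")?.Attribute("Value"),
                // Empty list if the message contains no _OctetString.
                OctetStrings: m.Descendants("_OctetString")
                               .Select(o => (string)o.Attribute("Value"))
                               .ToList()))
            .ToList();
    }
}
```

Casting an `XAttribute` to `string` yields null when the attribute is missing, so a message without the expected element does not throw a `NullReferenceException` the way `FirstOrDefault().FirstAttribute.Value` would.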
