I've got a large XML file in which I need to set some values through MATLAB's xmlread. Unfortunately, many elements have identical tags and structure, so I can reach only the first instance. Specifically, I need to change the value of every <min> and <max> tag separately.

The structure of the XML file is as follows:

<?xml version="1.0" encoding="utf-8"?>
<GTOMonteCarlo version="3.0.0">
   <STELAVersion>3.0</STELAVersion>
      <STELAVersion>3.0</STELAVersion>
   <GTOInputParameters>
      <AbstractInputParameters>
         <MeanDeltaUniformParameters>
            <UniformParameters>
               <min>2.5</min>
               <max>2.7</max>
            </UniformParameters>
         </MeanDeltaUniformParameters>
         <MeanDeltaUniformParameters>
            <UniformParameters>
               <min>0.0217</min>
               <max>0.0317</max>
            </UniformParameters>
         </MeanDeltaUniformParameters>
         <MeanDeltaUniformParameters>
            <UniformParameters>
               <min>1.2</min>
               <max>1.8</max>
            </UniformParameters>
         </MeanDeltaUniformParameters>
      </AbstractInputParameters>
      <MeanDeltaUniformParameters>
         <UniformParameters>
            <min>0.0217</min>
            <max>0.0317</max>
         </UniformParameters>
      </MeanDeltaUniformParameters>
      <MinMaxUniformParameters>
         <UniformParameters>
            <min>0.8</min>
            <max>1.2</max>
         </UniformParameters>
      </MinMaxUniformParameters>
(...)
</GTOMonteCarlo>

The Matlab code that I'm using now works only for the first instance of the tag.

xDoc = xmlread(fullfile(filename));
set_value(xDoc, 'min', 2.5);
set_value(xDoc, 'max', 2.7);
(...)
xmlwrite(output_name,xDoc);
  • You need to traverse the DOM structure properly, moving up and down to child/parent nodes and from one sibling node to the next. Commented Jul 26, 2016 at 22:31

1 Answer

I'll expand a bit on what @nirvana-msu meant, without coding the solution for you as that would be very tedious.

>> xDoc = xmlread(fullfile(filename)); 
>> xRoot = xDoc.getDocumentElement() 
xRoot =
    [GTOMonteCarlo: null]

xRoot here is your starting point, i.e. the root node of the document.

The root node, like any node, has children. From your xml file you can see that the direct children of GTOMonteCarlo are STELAVersion, GTOInputParameters, etc. Almost. The whitespace between those tags is also stored as 'text' nodes, and those are valid child nodes too. Once you've accessed a node, play with the following commands until you figure out what's going on.

>> RootChildren = xRoot.getChildNodes()
RootChildren =
    [GTOMonteCarlo: null]    % not very informative, I agree.
                             % but trust me that this is now a list of nodes

>> RootChildren.getLength  % How many DOM child elements does this list contain?
ans =
     7        % 7! Great. Let's access them and inspect them

>> RootChildren.item(1)  % item(0) is a whitespace text node, so look at index 1
ans =
    [STELAVersion: null]   % hm ... that wasn't it. What's the next one?

>> RootChildren.item(2)  
ans =
 [#text: ]     % that's not it either. 

At this point you should realise that the node you need to descend into is at index 5. I.e. in our tree we have a whitespace text node at index 0, then one STELAVersion node, another text node, another STELAVersion node (which for some reason is incorrectly indented, but anyway), another text node, and THEN we get to the node you're interested in. So it's item 5 on the list.

>> RootChildren.item(5)  
ans =
    [GTOInputParameters: null]      % bingo! Let's get this guy's children

>> GTOChildren = RootChildren.item(5).getChildNodes() 
GTOChildren =
    [GTOInputParameters: null]

>> GTOChildren.getLength                               
ans =
     7

>> GTOChildren.item(1)  
ans =
    [AbstractInputParameters: null]

etc etc.
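To skip those whitespace text nodes automatically while exploring, one option is to filter on the node type. This is only a sketch on a made-up two-element document (the tag names a and b and the temporary file are placeholders, not part of your file):

```matlab
% Write a tiny stand-in document to a temporary file (hypothetical data).
xmlText = sprintf('<root>\n  <a>1</a>\n  <b>2</b>\n</root>\n');
tmpFile = [tempname '.xml'];
fid = fopen(tmpFile, 'w');
fprintf(fid, '%s', xmlText);
fclose(fid);

xDoc  = xmlread(tmpFile);
xRoot = xDoc.getDocumentElement();
kids  = xRoot.getChildNodes();

% Collect the names of element children only, ignoring #text nodes.
names = {};
for k = 0:kids.getLength()-1              % NodeList indices start at 0
    child = kids.item(k);
    if child.getNodeType() == child.ELEMENT_NODE
        names{end+1} = char(child.getNodeName()); %#ok<AGROW>
    end
end
disp(names)

delete(tmpFile);
```

On your file, the same loop over xRoot.getChildNodes() should list STELAVersion, GTOInputParameters and so on, without the #text entries in between.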

You're supposed to find a way to traverse the nodes and their children, until you get to the nodes that interest you, i.e. the 'min' and 'max' nodes.
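If walking the tree by hand gets tedious, the DOM also offers getElementsByTagName, which as far as I know returns every matching element in document order, however deeply it is nested. A minimal sketch, assuming each UniformParameters block holds exactly one min and one max (the small document below is a stand-in for the real file, and the variable names are mine):

```matlab
% Stand-in document with two min/max pairs (values taken from the question).
xmlText = sprintf([ ...
    '<GTOInputParameters>\n' ...
    '  <UniformParameters><min>2.5</min><max>2.7</max></UniformParameters>\n' ...
    '  <UniformParameters><min>0.0217</min><max>0.0317</max></UniformParameters>\n' ...
    '</GTOInputParameters>\n']);
tmpFile = [tempname '.xml'];
fid = fopen(tmpFile, 'w');
fprintf(fid, '%s', xmlText);
fclose(fid);

xDoc = xmlread(tmpFile);

% Every <min> and <max> element in the document, in document order.
minNodes = xDoc.getElementsByTagName('min');
maxNodes = xDoc.getElementsByTagName('max');

% Read all pairs into an n-by-2 numeric array: [min max] per row.
n = minNodes.getLength();
bounds = zeros(n, 2);
for k = 1:n
    % item() is 0-indexed; getTextContent returns a Java string.
    bounds(k, 1) = str2double(char(minNodes.item(k-1).getTextContent()));
    bounds(k, 2) = str2double(char(maxNodes.item(k-1).getTextContent()));
end
disp(bounds)

delete(tmpFile);
```

Row k of bounds then corresponds to the k-th UniformParameters block, which gives you a way to tell the otherwise identical tags apart by position.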

Unfortunately these xml functions are largely undocumented ... but if you press tab once or twice after you type a node variable and a dot (i.e. "RootChildren.[TAB][TAB]" ) you will get a popup list of all the functions available for that type of node (or list of nodes), and you can play around with them and see what they do; most have fairly self-explanatory names.

One of those, for instance, is .getTextContent. You will need that to read the value once you've got the 'min' node in your hands. It comes back as a Java string, so wrap it in char (and str2double if you want a number), i.e.

MinNode.getTextContent

If that's the end of your mission, great! But if in fact you're trying to create an xml file with updated values as I understood you're trying to do, then read on.

I've spotted a .setTextContent which presumably is what you need if you're after setting a value instead.

Fortunately, there is no need to rebuild the tree afterwards. The node objects MATLAB hands you are references to Java objects that live inside xDoc, not independent copies, so calling MinNode.setTextContent(...) changes the document itself. ParentNode.replaceChild(NewChildNode, OldChildNode) is only needed if you want to swap a whole node for a different one.

While things like replaceChild are undocumented in MATLAB, they are methods of the Java org.w3c.dom interfaces, which follow the same W3C DOM as the JavaScript DOM functions you can find here: http://www.w3schools.com/jsref/dom_obj_all.asp (for example, the replaceChild function is here: http://www.w3schools.com/jsref/met_node_replacechild.asp)

Once the values are updated, you can use xmlwrite with the same xDoc to write to a new xml file.

See the xmlwrite help to see how to use that, as well as another simple example of DOM manipulation.
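Putting the pieces together, here is a rough round trip: read, overwrite each min/max by position, write, and read back to check. It is a sketch under assumptions, not a drop-in solution: the document is a cut-down stand-in, newMin and newMax are invented values, and it relies on setTextContent changing the document in place (which matches what the asker reports in the comments).

```matlab
% Cut-down stand-in for the real file (hypothetical data).
xmlText = sprintf([ ...
    '<GTOInputParameters>\n' ...
    '  <UniformParameters><min>2.5</min><max>2.7</max></UniformParameters>\n' ...
    '  <UniformParameters><min>0.8</min><max>1.2</max></UniformParameters>\n' ...
    '</GTOInputParameters>\n']);
inFile  = [tempname '.xml'];
outFile = [tempname '.xml'];
fid = fopen(inFile, 'w');
fprintf(fid, '%s', xmlText);
fclose(fid);

xDoc     = xmlread(inFile);
minNodes = xDoc.getElementsByTagName('min');
maxNodes = xDoc.getElementsByTagName('max');

% Invented replacement values, one per <min>/<max> in document order.
newMin = [1.5 0.5];
newMax = [3.5 1.5];

for k = 1:minNodes.getLength()
    % '%.15g' keeps more digits than the default num2str formatting.
    minNodes.item(k-1).setTextContent(sprintf('%.15g', newMin(k)));
    maxNodes.item(k-1).setTextContent(sprintf('%.15g', newMax(k)));
end

% The nodes belong to xDoc, so xDoc can be written out directly.
xmlwrite(outFile, xDoc);

% Read the result back and pick out the second <min> as a sanity check.
check      = xmlread(outFile);
writtenMin = str2double(char(check.getElementsByTagName('min').item(1).getTextContent()));
writtenMax = str2double(char(check.getElementsByTagName('max').item(0).getTextContent()));

delete(inFile);
delete(outFile);
```

If only some of the tags should change, the position k is the thing to key on; you could also inspect minNodes.item(k-1).getParentNode() and its ancestors to decide which block you are in before overwriting it.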

Hope this gets you started. Good luck.

7 Comments

Small caveat: NodeLists are Java objects, therefore 0-indexed. I.e. the first element is actually .item(0) and not .item(1) as I implied in the code above. (item(0) was just another invisible whitespace text node before the 'STELAVersion' node.) Just mentioning it in case it catches you out.
Thanks! This is really helpful. I will try to use this method today :)
Good luck! If it does indeed solve your problem please come back and throw some good vibes my way, and mark it as accepted :)
Hi, it worked on my large file: I changed the content of the tags with .setTextContent and was able to save the new data via xmlwrite. Unfortunately, it's pretty laborious to find each of the tags that I need to change, so at the moment I'm trying to write a function that changes what I need.
Yes, I agree, working the DOM tree this way can get tedious. Your best bet is to make nice top-down functions that do what you want per step. Alternatively, I do wonder if it would be easier to solve your problem with a simple regex substitution! If everything you want to change is always contained within min/max tags in the exact same way, and your xml file is guaranteed to not contain 'other' min/max tags, then a simple regex substitution is going to be a lot simpler (i.e. a one-liner).