1

I have a nested XML file I would like to be able to search and store some of the elements into an array.

I can't figure out how to do it....

The XML file looks something like this

<?xml version="1.0" encoding="ISO-8859-1"?>
<modeling>
 <generator>
  <i name="program" type="string">vasp </i>
  <i name="version" type="string">5.4.1  </i>
  <i name="subversion" type="string">24Jun15 (build Dec 17 2015 12:51:06) complex                          parallel </i>
  <i name="platform" type="string">IFC91_ompi </i>
  <i name="date" type="string">2016 04 09 </i>
  <i name="time" type="string">21:14:15 </i>
 </generator>
 <incar>
  <i type="string" name="SYSTEM"> LGPS</i>
  <i type="int" name="ISTART">     1</i>
  <i type="string" name="PREC">low (precision level)</i>
  <i type="string" name="ALGO"> Very Fast   (Elect. algorithm for MD)</i>
  <i type="logical" name="ADDGRID"> T  </i>
  <i type="int" name="ISPIN">     1</i>
  <i type="int" name="MAXMIX">    40</i>
  <i type="int" name="INIWAV">     1</i>
  <i type="int" name="NELM">    20</i>
  <i type="int" name="IBRION">    -1</i>
  <i name="EDIFF">      0.00005000</i>
  <i name="EDIFFG">     -0.01000000</i>
  <i type="int" name="NSW">     0</i>
  <i type="int" name="ISIF">     2</i>
  <i type="int" name="ISYM">     0</i>
  <i type="int" name="NBLOCK">    20</i>
  <i name="ENMAX">    400.00000000</i>
  <i name="POTIM">      2.00000000</i>
  <i name="TEBEG">   1500.00000000</i>
  <i name="TEEND">   1500.00000000</i>
  <i name="SMASS">      1.00000000</i>
  <i type="string" name="LREAL"> Auto      (Projection operators: automa</i>
  <v name="ROPT">     -0.00100000     -0.00100000     -0.00100000     -0.00100000</v>
  <i type="int" name="ISMEAR">     0</i>
  <i type="int" name="NWRITE">     2</i>
  <i type="logical" name="LCORR"> F  </i>
  <i type="int" name="LMAXMIX">     4</i>
  <i type="logical" name="LORBIT"> F  </i>
  <i type="logical" name="LASPH"> T  </i>
  <i type="int" name="ICORELEVEL">     1</i>

....... It eventually gets to this array, which I would like to extract from the XML file to a numpy array using elementtree.

<calculation>
     <varray name="forces" >
       <v>      -1.72025612       1.25780435       0.86117220 </v>
       <v>       0.28139069      -0.40806318       0.01136567 </v>
       <v>      -1.44336852       0.59811466       0.94797354 </v>
       <v>      -0.65405317      -0.20586426      -0.52529678 </v>
       <v>      -0.03741255       1.58074662       0.26915233 </v>
       <v>       1.38593804      -1.81797833       1.04331444 </v>
       <v>       0.06177848      -0.40373533       1.54406535 </v>
       <v>      -0.79191096      -0.11144535      -0.39247818 </v>
       <v>      -0.02452204       0.31078139      -0.44651888 </v>
       <v>      -2.26770774      -0.09736110       0.16802829 </v>
       <v>       0.09880357      -0.68257923       0.62500619 </v>
       <v>       0.00228472       0.62722458      -0.63621109 </v>
       <v>       1.01631395       0.06241868      -1.39029537 </v>
       <v>      -0.48273274       1.54873382      -1.95990958 </v>
       <v>       2.12761082      -1.17499249      -0.46258693 </v>
       </varray>
       <varray name="stress">
       <v>       1.01631395       0.06241868      -1.39029537 </v>
       <v>      -0.48273274       1.54873382      -1.95990958 </v>
       <v>       2.12761082      -1.17499249      -0.46258693 </v>
       </varray>
</calculation>

Any suggestions would be helpful.

Thanks!

1 Answer 1

1
import xml.etree.ElementTree as ET
import numpy as np

tree = ET.parse('input.xml')
root = tree.getroot()

a = np.array(
    [
        v.text.split()
        for v in root.findall(".//calculation/varray/v")
    ],
    dtype='float'
)

print a
Sign up to request clarification or add additional context in comments.

2 Comments

That's great - thanks. But actually there is more than one element that has the varray tag, which I hadn't noted in the original question. This pulls them both out. I've edited the XML snippet in the question to reflect this - so if you know how to pull only one of them out that would be awesome!
Yes, change the XPath expression passed to root.findall(). For example, root.findall('.//calculation/varray[@name="stress"]/v').

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.