I am parsing a large XML file that essentially contains a table. The nodes in the XML don't always have names. Nested deep within several tags is what is basically an HTML-like table, with <TD>s containing raw (numeric) data inside row (<TR>) tags. Before I can get to the table, there are a number of metadata tags that I'm not interested in. For instance:
<?xml version="1.0" ?>
<soap:Envelope xmlns:soap="--omitted--" xmlns:xsi="--omitted--">
<soap:Body>
<FetchReportResponse xmlns="URL1">
<FetchReportResult xmlns="URL2">
<REPORT>
<TITLE>CROSS VISITING REPORT</TITLE>
<SUBTITLE/>
<SUMMARY>
<GEOGRAPHY>--omitted--</GEOGRAPHY>
<LOCATION>--omitted--</LOCATION>
<TIMEPERIOD>--omitted--</TIMEPERIOD>
<TARGET>--omitted--</TARGET>
<MEDIA>--omitted--</MEDIA>
<DATE>--omitted--</DATE>
<USER>--omitted--</USER>
</SUMMARY>
<TABLE>
<THEAD>
<TR>
<TH>--omitted--</TH>
<TD>--omitted--</TD>
<TD>--omitted--</TD>
<TD>--omitted--</TD>
<TD>--omitted--</TD>
<TD>--omitted--</TD>
<TD>--omitted--</TD>
I am new to XML parsing so I'm following this. I have the following code to read an XML file and create an ElementTree object.
import xml.etree.ElementTree as ET
tree = ET.parse('./../filename.xml')
root = tree.getroot()  # root element of the parsed document
print(root.find("./"))
This understandably prints the following:
<Element '{http://schemas.xmlsoap.org/soap/envelope/}Envelope' at 0x00000230CAC23318>
However, when I try to use XPath-style paths to traverse the tree from here on, I'm unable to. For instance,
print(root.find("./Body"))
prints None, even though <Body> is clearly nested inside <Envelope>.
EDIT: Following Mark Tolonen's answer I was able to get to the Body tag, but how do I get beyond that? More specifically, I want to reach the <TABLE> tag.
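To make the question concrete, here is a minimal sketch of what I think the lookup might need to look like, assuming every element below <FetchReportResult> inherits its default namespace. The SOAP URI is taken from the printed output above; "URL1" and "URL2" are the placeholders from the sample; the prefixes r1 and r2, the cell values and the <TBODY> rows are made up for illustration, since the real table is cut off in the sample.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for the real file. "URL1"/"URL2" are the placeholder
# namespaces from the sample; TBODY and the cell values are hypothetical.
SAMPLE = """<?xml version="1.0" ?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body>
<FetchReportResponse xmlns="URL1">
<FetchReportResult xmlns="URL2">
<REPORT>
<TITLE>CROSS VISITING REPORT</TITLE>
<TABLE>
<THEAD><TR><TH>h</TH><TD>a</TD><TD>b</TD></TR></THEAD>
<TBODY><TR><TH>r1</TH><TD>1</TD><TD>2</TD></TR></TBODY>
</TABLE>
</REPORT>
</FetchReportResult>
</FetchReportResponse>
</soap:Body>
</soap:Envelope>"""

root = ET.fromstring(SAMPLE)

# Prefixes here are local to this mapping; only the URIs must match the file.
ns = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "r1": "URL1",
    "r2": "URL2",
}

# Explicit path: each step carries the namespace that is in scope there.
table = root.find(
    "soap:Body/r1:FetchReportResponse/r2:FetchReportResult/r2:REPORT/r2:TABLE",
    ns,
)

# Shortcut: search all descendants, ignoring the namespace (Python 3.8+).
table_any_ns = root.find(".//{*}TABLE")

# Collect the text of every <TD> in every <TR>, in document order.
rows = [
    [cell.text for cell in tr.findall("r2:TD", ns)]
    for tr in table.iter("{URL2}TR")
]
print(table.tag)
print(rows)
```

If this is right, then un-prefixed tags like <REPORT> and <TABLE> still need the "URL2" namespace in the path, because they inherit the default namespace declared on <FetchReportResult>.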