I apologise if this question could be easily answered by searching and reading the lxml documentation but I have tried to no avail.
I've been using lxml's findall quite frequently to query an XML file. Recently, I've needed to use wildcards in order to extract the data I need. This has led me to XPath.
I've managed to get this working with ETXPath but not with xpath, and I'm confused as to why. An excerpt of the XML file:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<DC xmlns="http://tradefinder.db.com/Schemas/MEL/MelHorizon_0_4_2.xsd">
<Header>
<FileName>DBL_MPA_Gap_PRD_2017-06-01T07-50-52.xml</FileName>
<ValidityDate>2017-05-31</ValidityDate>
<Version>0.42</Version>
<NoOfRecords>17228</NoOfRecords>
</Header>
<Overviews>
<OverviewLevelTimeStamp>
<Identifier>Z 1 Index, TRADE</Identifier>
<Level>2.2120000000000002</Level>
<Timestamp>09:00:00.000</Timestamp>
</OverviewLevelTimeStamp>
</Overviews>
</DC>
And the Python code I use to extract the required nodes:
findshiz = ETXPath("//" + namespace + "DC/" + namespace + "Overviews/" + namespace + "OverviewLevelTimeStamp[" + namespace + "Identifier= 'Z 1 Index, TRADE']")
required_nodes = findshiz(gap_xml)
where gap_xml is the parsed file.
This code works, but when I use xpath instead it doesn't. The only change I make is replacing ETXPath with xpath. The reason I'm trying this is that I need to use wildcards, so instead of "Z 1 Index, TRADE", it would be Z 1 Index*.
Thanks, and let me know of any ways to improve the question.
namespace = ... The difference between ETXPath and the "normal" xpath (using XPath internally) is that the former expects namespaces denoted as {http://...}tagname, while the latter expects a prefix, prefix:tagname, and an additional namespace map: {'prefix': 'http://..'}. Otherwise both should do the same. (See also lxml.de/1.3/xpathxslt.html#etxpath) Can you provide your complete code for both versions?
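To make the difference described above concrete, here is a minimal sketch that runs the same query both ways against a trimmed copy of the XML from the question. The prefix "m" and the variable names are arbitrary choices for illustration, not anything lxml requires. The last query is an assumption about what "Z 1 Index*" is meant to match: XPath 1.0 has no glob-style wildcard inside string values, so starts-with() is used for prefix matching instead.

```python
from lxml import etree

NS = "http://tradefinder.db.com/Schemas/MEL/MelHorizon_0_4_2.xsd"

# Trimmed copy of the XML from the question (Header omitted).
xml = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<DC xmlns="http://tradefinder.db.com/Schemas/MEL/MelHorizon_0_4_2.xsd">
<Overviews>
<OverviewLevelTimeStamp>
<Identifier>Z 1 Index, TRADE</Identifier>
<Level>2.2120000000000002</Level>
<Timestamp>09:00:00.000</Timestamp>
</OverviewLevelTimeStamp>
</Overviews>
</DC>"""

gap_xml = etree.fromstring(xml)

# ETXPath: namespaces are written inline as {uri}tagname.
clark = "{%s}" % NS
find_etxpath = etree.ETXPath(
    "//" + clark + "Overviews/" + clark + "OverviewLevelTimeStamp["
    + clark + "Identifier='Z 1 Index, TRADE']"
)
via_etxpath = find_etxpath(gap_xml)

# xpath(): namespaces are written as prefix:tagname plus a prefix-to-URI map.
# The prefix "m" is hypothetical; any non-empty prefix works.
ns_map = {"m": NS}
via_xpath = gap_xml.xpath(
    "//m:Overviews/m:OverviewLevelTimeStamp[m:Identifier='Z 1 Index, TRADE']",
    namespaces=ns_map,
)

# Prefix match in place of a wildcard, assuming "Z 1 Index*" means
# "any Identifier that begins with 'Z 1 Index'".
prefix_match = gap_xml.xpath(
    "//m:OverviewLevelTimeStamp[starts-with(m:Identifier, 'Z 1 Index')]",
    namespaces=ns_map,
)
```

Note that the document uses a default namespace (xmlns="..." with no prefix), so an unprefixed name such as Overviews in an xpath() expression would not match any element. That is one plausible reason a straight rename from ETXPath to xpath returns nothing.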