I am trying to search for a regex pattern in the content of an XML file, but I am having trouble matching a NAME value that has a fixed prefix and always ends with one or more digits. The digits are the dynamic part in the XML file, so I don't know how to build a pattern and search for it.
Once the pattern is found, I need to get its child tag items, i.e. their attributes and text values.
XML file content:
<author NAME="PYTHON_DD101">
<type>BOOK</type>
<ID>59</ID>
<inst ID="A">Garry</inst>
<inst ID="B">Gerald</inst>
</author>
<author NAME="PYTHON_ABC4">
<type>BOOK</type>
<SrcID>62</SrcID>
<inst ID="A">Niel</inst>
<inst ID="B">Long</inst>
</author>
Code:
text = "PYTHON"
tmp = '"' + text + "_ABC" + '"'
print(tmp)
#pattern = re.compile('%s\d+'%tmp)
endsWithNumber = re.compile('%s\d$'%tmp)
print(endsWithNumber)
#FoundDetails = Content.find("PYTHON_ABC4")
FoundDetails = Content.find(".//author[@NAME='{}']".format(endsWithNumber))
#regex = re.compile('%s\d+'%tmp)
#matches = regex.match(Content)
#print(matches)
print(type(Content))
print(type(FoundDetails))
print(FoundDetails)
for FoundDetails in FoundDetails.iterfind('author'):
    author = FoundDetails.attrib['NAME']
    print 'author:', author
    for inst in FoundDetails.iterfind('inst'):
        print 'inst id:', inst.attrib['ID'], 'inst name:', inst.text
Error I am getting:
PYTHON_ABC
<_sre.SRE_Pattern object at 0x000000000403F168>
<class 'xml.etree.ElementTree.Element'>
<type 'NoneType'>
None
Traceback (most recent call last):
  File "C:\test_Book.py", line 45, in <module>
    bookauthor = book.get_Book_by_author(Book)
  File "C:\Book.py", line 219, in get_Book_by_author
    for FoundDetails in FoundDetails.iterfind('author'):
AttributeError: 'NoneType' object has no attribute 'iterfind'
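The None result seems consistent with how ElementTree's find() handles attribute predicates: as far as I can tell, the value inside [@NAME='...'] is compared as a literal string, so regex syntax placed there is never interpreted. A minimal sketch of that behaviour, written in Python 3 syntax and using a hypothetical one-author document (the <root> wrapper and the variable names are assumptions for illustration, not part of the original file):

```python
import xml.etree.ElementTree as ET

# Hypothetical one-author document, just to show how find() treats the predicate value.
root = ET.fromstring('<root><author NAME="PYTHON_ABC4"><type>BOOK</type></author></root>')

# Regex syntax inside the predicate is compared as plain text, so nothing matches.
as_regex = root.find(r".//author[@NAME='PYTHON_ABC\d+']")

# The exact attribute value does match.
exact = root.find(".//author[@NAME='PYTHON_ABC4']")

print(as_regex)
print(exact.get("NAME"))
```

Here the first lookup returns None and the second returns the author element, which matches what I see with the hard-coded name.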
Expected output:
inst id: A inst name: Niel
inst id: B inst name: Long
If I pass the exact NAME value, i.e. "PYTHON_ABC4", in the line below, it works. However, I don't want to hard-code the value, because the file may contain other entries whose names follow the same pattern, e.g. "PYTHON_ABC12", and in that case I want to get those book details as well.
FoundDetails = Content.find(".//author[@NAME='{}']".format("PYTHON_ABC4"))
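To make the intended behaviour concrete, here is a rough sketch of what I am after: loop over every author element and test its NAME attribute against a compiled regex in Python, rather than putting the pattern into the find() path. It is written in Python 3 syntax; the <library> root wrapper and the helper name find_authors are assumptions for illustration and are not in my real code.

```python
import re
import xml.etree.ElementTree as ET

# Sample data from above, wrapped in a hypothetical <library> root so it parses.
XML = """<library>
<author NAME="PYTHON_DD101">
<type>BOOK</type>
<ID>59</ID>
<inst ID="A">Garry</inst>
<inst ID="B">Gerald</inst>
</author>
<author NAME="PYTHON_ABC4">
<type>BOOK</type>
<SrcID>62</SrcID>
<inst ID="A">Niel</inst>
<inst ID="B">Long</inst>
</author>
</library>"""


def find_authors(root, prefix):
    """Return <author> elements whose NAME is `prefix` followed only by digits."""
    # re.escape keeps any special characters in the prefix literal.
    pattern = re.compile(re.escape(prefix) + r"\d+$")
    return [a for a in root.iter("author") if pattern.match(a.get("NAME", ""))]


root = ET.fromstring(XML)
for author in find_authors(root, "PYTHON_ABC"):
    for inst in author.iterfind("inst"):
        print("inst id:", inst.attrib["ID"], "inst name:", inst.text)
```

With the sample data this prints the two inst lines for PYTHON_ABC4, and a name such as PYTHON_ABC12 would be picked up by the same pattern.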