Environment: Python 2.6.5, Eclipse SDK 3.7.1, Pydev 2.3
I am trying to parse and change values in XML data in Python using xml.dom.minidom and I'm having an issue with blank text nodes.
When I parse an XML file into a DOM object and then convert it back to a string using toxml(), every "Description" element with blank text loses its closing tag and is written as a self-closing tag instead.
Does anyone know what the problem is?
Contents of issue.py:
from xml.dom import minidom
xml_dom_object = minidom.parse('news_shows.xml')
main_node = xml_dom_object.getElementsByTagName('NewsShows')[0]
xml_string = main_node.toxml()
print xml_string
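For anyone who wants to try this without the external file, the same round trip can be sketched with minidom.parseString on an inline string. The trimmed-down XML and the names XML_SOURCE, descriptions and child_counts are illustrative assumptions, not part of my actual script, and print is called with parentheses only so the snippet runs on both Python 2 and Python 3.

```python
from xml.dom import minidom

# Trimmed-down, hypothetical stand-in for news_shows.xml
XML_SOURCE = (
    '<NewsShows>'
    '<NewsShow ShowName="The_Young_Turks">'
    '<Description Detail="Best_show_of_all_time_according_to_many">True</Description>'
    '<Description Detail="The_only_source_of_truth"></Description>'
    '</NewsShow>'
    '</NewsShows>'
)

dom = minidom.parseString(XML_SOURCE)
main_node = dom.getElementsByTagName('NewsShows')[0]
descriptions = main_node.getElementsByTagName('Description')

# Count the child nodes under each Description element after parsing
child_counts = [len(node.childNodes) for node in descriptions]

# Serialize the subtree back to a string, as issue.py does
xml_string = main_node.toxml()

print(child_counts)
print(xml_string)
```

In this sketch the Description element that holds "True" has one child node and the empty one has none, and only the empty one is written in the short form, so the behaviour seems tied to how elements without children are serialized rather than to anything specific in my file.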
Contents of news_shows.xml (notice the two blank Text nodes):
<NewsShows Planet="Earth" Language="English" Year="2012">
<NewsShow ShowName="The_Young_Turks">
<Description Detail="Best_show_of_all_time_according_to_many">True</Description>
<Description Detail="The_only_source_of_truth"></Description>
<Description Detail="Three_hours_of_truth_per_day">True</Description>
</NewsShow>
<NewsShow ShowName="The_Rachel_Maddow_Show">
<Description Detail="Pretty_great_as_well">True</Description>
<Description Detail="Sucks_badly">False</Description>
<Description Detail="Conveys_more_information_than_TYT"></Description>
</NewsShow>
</NewsShows>
Output of the script (notice the 2 "Description" tags that are messed up):
<NewsShows Language="English" Planet="Earth" Year="2012">
<NewsShow ShowName="The_Young_Turks">
<Description Detail="Best_show_of_all_time_according_to_many">True</Description>
<Description Detail="The_only_source_of_truth"/>
<Description Detail="Three_hours_of_truth_per_day">True</Description>
</NewsShow>
<NewsShow ShowName="The_Rachel_Maddow_Show">
<Description Detail="Pretty_great_as_well">True</Description>
<Description Detail="Sucks_badly">False</Description>
<Description Detail="Conveys_more_information_than_TYT"/>
</NewsShow>