I have a DataFrame in the following format that I want to transform to XML:
Parameter Name   | Value | Comment
lev1.lev12       | 5     | "Comment 1"
lev1.lev13.lev14 | 10    | "Comment 2"
lev2.lev22       | "hi"  | "Comment 3"
lev2.lev23       | NaN   | "No need to set value"
The levels of the XML structure are defined in the Parameter Name column, where each level is separated by ".". A comment should be written as a separate line before the actual key-value pair. If the value is NaN, then the comment should be written as usual and the empty element should itself be written as a comment.
So the desired output here would be:
<lev1>
    <!-- Comment 1 -->
    <lev12>5</lev12>
    <lev13>
        <!-- Comment 2 -->
        <lev14>10</lev14>
    </lev13>
</lev1>
<lev2>
    <!-- Comment 3 -->
    <lev22> "hi" </lev22>
    <!-- No need to set value -->
    <!-- <lev23></lev23> -->
</lev2>
I have written an initial function that makes it possible to iterate through the DataFrame, but I don't fully understand how to use ElementTree or lxml to create the XML structure.
import xml.etree.ElementTree as ET

def df_to_xml(row, etree):
    param = row['Parameter Name']
    val = row['Value']
    comment = row['Comment']
    param_levels = param.split(".")
    for level in param_levels[:-1]:
        ## With each level iterate down the tree structure.
        pass
    ## At the lowest level, add the comment and then the value

tree = ET.ElementTree()
df.apply(lambda x: df_to_xml(x, tree), axis=1)
# Write tree to xml.
How would I go about traversing the tree to the right level and adding the comment and value in the for loop?
Appreciate any tips or input.
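For reference, here is a minimal sketch of how the traversal described above might look with the standard library's xml.etree.ElementTree. It is one possible approach under stated assumptions, not a definitive implementation. The names rows_to_xml and is_missing and the wrapper element "root" are hypothetical: the wrapper is there only because an XML document can have a single root, so lev1 and lev2 need a common parent. The sketch also assumes string values are written without surrounding quotes, and it takes plain (name, value, comment) tuples so that it runs without pandas.

```python
import math
import xml.etree.ElementTree as ET


def is_missing(value):
    """True for None or a float NaN (how pandas marks a missing value)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def rows_to_xml(rows, root_tag="root"):
    """Build an element tree from (parameter name, value, comment) rows.

    root_tag is a hypothetical wrapper element, not part of the question.
    """
    root = ET.Element(root_tag)
    for name, value, comment in rows:
        *parents, leaf = name.split(".")
        node = root
        # Walk down one level at a time, creating missing elements on the way.
        for tag in parents:
            child = node.find(tag)
            if child is None:
                child = ET.SubElement(node, tag)
            node = child
        # The comment goes right before the key-value pair.
        node.append(ET.Comment(f" {comment} "))
        if is_missing(value):
            # No value: emit the empty element as a comment instead.
            node.append(ET.Comment(f" <{leaf}></{leaf}> "))
        else:
            ET.SubElement(node, leaf).text = str(value)
    return root


rows = [
    ("lev1.lev12", 5, "Comment 1"),
    ("lev1.lev13.lev14", 10, "Comment 2"),
    ("lev2.lev22", "hi", "Comment 3"),
    ("lev2.lev23", float("nan"), "No need to set value"),
]
root = rows_to_xml(rows)
if hasattr(ET, "indent"):
    ET.indent(root)  # pretty-printing, available from Python 3.9
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

With a DataFrame, the rows could be supplied as df[["Parameter Name", "Value", "Comment"]].itertuples(index=False, name=None), which yields plain tuples in column order. Looking up each level with find before creating it is what keeps lev1 from being duplicated when several parameters share a prefix.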
