I'm incredibly new at this, and I've tried searching but nothing I've found has been able to work for me.
I have xml data that looks like this
<datainfo>
<data>
<info State="1" Reason="x" Start="01/01/2016 00:00:00.000" End="01/01/2016 02:00:00.000"></info>
<info State="1" Reason="y" Start="01/01/2016 02:00:00.000" End="01/01/2016 02:01:00.000">
<moreinfo Start="01/01/2016 02:00:00.000" End="01/01/2016 02:00:30.000"/>
<moreinfo Start="01/01/2016 02:00:30.000" End="01/01/2016 02:01:00.000"/>
</info>
<info State="2" Start="01/01/2016 02:01:00.000" End="01/01/2016 02:10:00.000"></info>
...
</data>
</datainfo>
I want find how much time was spent in State {1,2,...} for reason {x,y,...} on a specific day and have that print to a .csv format to be latter read in excel.
The issue I'm having is I can't use static variables because there are hundreds of different states for hundreds of different reasons, and they change constantly.
If I'm not clear please tell me, I am brand new to this and really appreciate any and all help.
Edit: Here is what I currently have, hopefully this will clear up what I'm trying to do.
from datetime import datetime
from lxml import etree as ET
def parseXML(file):
handler = open(file, "r")
tree = ET.parse(handler)
info_list = tree.xpath('//info')
root = tree.getroot()
dictionary = {}
info_len = len(info_list)
for i in range(info_len):
info=root[0][0][i]
info_attribs = info.attrib
end = info_attribs[u'End']
start = info_attribs[u'Start']
FMT = '%m/%d/%Y %H:%M:%S.%f'
tdelta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
t_dif = (tdelta.total_seconds()) / 60
try:
dictionary[info_attribs[u'State'] + status_attribs[u'Reason']] = t_dif
except:
continue
I'm trying to iterate through each line, find the State and the reason, then add them to a dictionary. If the entry already exists for that state and reason, I want to add it to the current value.
Let me know if I should provide more info!
Edit #2:
The output I'm looking for would be in the form of a .csv, stuctured like this:
State - Reason, [Total time spent in State 1 for x reason]