I have an xml/txt file like this:
<text id="32a45" language="ENG" date="2017-01-01" time="11:00" timezone="Eastern">
<s id="1">
foo
bar
</s>
<d>
11235
</d>
<text id="32a47" language="ENG" date="2017-01-05" time="1:00" timezone="Central">
<s id="2">
foo
bar
</s>
<d>
11235
</d>
<text id="32a48" language="ENG" date="2017-01-07" time="3:00" timezone="Pacific">
<s id="3">
foo
bar
</s>
<d>
11235
</d>
I want to build an SQLite table like the following using Python:
id     language  date        timezone  s        d
32a45  ENG       2017-01-01  Eastern   foo bar  11235
32a47  ENG       2017-01-05  Central   baz qux  11235
32a48  ENG       2017-01-07  Pacific   foo bar  11235
Any idea how I can do this? I cannot use the xmltree module because the XML tags in the original file are messed up. I would really appreciate the help. Thanks.
Edit: I can easily collect each <text> tag as a string in a list, like this:
['<text id="32a45" language="ENG" date="2017-01-01" time="11:00" timezone="Eastern">', '<text id="32a47" language="ENG" date="2017-01-05" time="1:00" timezone="Central">', '<text id="32a48" language="ENG" date="2017-01-07" time="3:00" timezone="Pacific">']
But I don't know how to extract the id, language, etc. from each string separately.
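One possible approach, sketched here under a few assumptions, is to skip XML parsing entirely and use regular expressions: one pattern pulls the key="value" pairs out of each <text ...> tag, and another splits the file into records that run from one <text ...> tag to the next. The names `parse_attrs`, `parse_records`, `build_table`, and the table name `texts` are made up for illustration. The sketch also assumes that attribute values are always double-quoted, that each record has at most one <s> and one <d> block, and that storing every column as TEXT is acceptable.

```python
import re
import sqlite3

# Sample input copied from the question (first two records only);
# in practice this string would be read from the file.
SAMPLE = '''<text id="32a45" language="ENG" date="2017-01-01" time="11:00" timezone="Eastern">
<s id="1">
foo
bar
</s>
<d>
11235
</d>
<text id="32a47" language="ENG" date="2017-01-05" time="1:00" timezone="Central">
<s id="2">
foo
bar
</s>
<d>
11235
</d>
'''

# key="value" pairs inside a tag (assumes double-quoted values).
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')
# One record: an opening <text ...> tag plus everything up to the
# next <text ...> tag or the end of the input (the tag is never closed).
RECORD_RE = re.compile(r'<text\b([^>]*)>(.*?)(?=<text\b|\Z)', re.DOTALL)
S_RE = re.compile(r'<s\b[^>]*>(.*?)</s>', re.DOTALL)
D_RE = re.compile(r'<d\b[^>]*>(.*?)</d>', re.DOTALL)


def parse_attrs(tag):
    """Return the key="value" pairs of one tag string as a dict."""
    return dict(ATTR_RE.findall(tag))


def inner_text(pattern, body):
    """Return the first match's content with whitespace collapsed, or None."""
    match = pattern.search(body)
    return " ".join(match.group(1).split()) if match else None


def parse_records(raw):
    """Yield one (id, language, date, timezone, s, d) tuple per <text> block."""
    for match in RECORD_RE.finditer(raw):
        attrs = parse_attrs(match.group(1))
        body = match.group(2)
        yield (attrs.get("id"), attrs.get("language"), attrs.get("date"),
               attrs.get("timezone"),
               inner_text(S_RE, body), inner_text(D_RE, body))


def build_table(raw, db_path=":memory:"):
    """Create the (hypothetical) texts table and insert every parsed record."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS texts "
        "(id TEXT PRIMARY KEY, language TEXT, date TEXT, "
        "timezone TEXT, s TEXT, d TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO texts VALUES (?, ?, ?, ?, ?, ?)",
        parse_records(raw),
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = build_table(SAMPLE)
    for row in conn.execute("SELECT * FROM texts ORDER BY id"):
        print(row)
```

For the list of tag strings shown in the edit, `parse_attrs` can be applied to each item on its own, e.g. `[parse_attrs(tag) for tag in tags]`, which gives one dict per tag with keys such as `id`, `language`, and `timezone`. If the real file is messier than the sample (single quotes, nested tags, missing closing tags for <s> or <d>), the patterns would need adjusting.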