Parsing with ElementTree (ET):
import xml.etree.ElementTree as ET
strings = ['<text id="32a45" language="ENG" date="2017-01-01" time="11:00" timezone="Eastern">',
'<text id="32a47" language="ENG" date="2017-01-05" time="1:00" timezone="Central">',
'<text id="32a48" language="ENG" date="2017-01-07" time="3:00" timezone="Pacific">']
id_ = []
date = []
for string in strings:
    # Append the missing closing tag so each fragment is well-formed XML
    tree = ET.fromstring(string + "</text>")
    id_.append(tree.get("id"))
    date.append(tree.get("date"))
print(id_) # ['32a45', '32a47', '32a48']
print(date) # ['2017-01-01', '2017-01-05', '2017-01-07']
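If you need every attribute rather than a fixed pair, the same closing-tag trick works with the element's attrib mapping. A minimal sketch: parse_open_tag is a hypothetical helper name, and it assumes each string is a single unclosed opening tag like the ones above.

```python
import xml.etree.ElementTree as ET

def parse_open_tag(fragment, tag="text"):
    """Return the attributes of a single unclosed opening tag as a dict.

    Hypothetical helper: assumes `fragment` is exactly one opening tag
    named `tag` with no closing tag, as in the strings above.
    """
    element = ET.fromstring(fragment + "</" + tag + ">")
    return dict(element.attrib)

row = parse_open_tag(
    '<text id="32a45" language="ENG" date="2017-01-01" time="11:00" timezone="Eastern">'
)
print(row["id"], row["date"])  # 32a45 2017-01-01
```

Returning a plain dict keeps the result independent of the Element object, so a list of these rows can be passed straight to pd.DataFrame.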
Update: a full, compact example, based on your original problem described here: How can I build an sqlite table from this xml/txt file using python?
import xml.etree.ElementTree as ET
import pandas as pd
strings = ['<text id="32a45" language="ENG" date="2017-01-01" time="11:00" timezone="Eastern">',
'<text id="32a47" language="ENG" date="2017-01-05" time="1:00" timezone="Central">',
'<text id="32a48" language="ENG" date="2017-01-07" time="3:00" timezone="Pacific">']
cols = ["id","language","date","time","timezone"]
data = [[ET.fromstring(string+"</text>").get(col) for col in cols] for string in strings]
df = pd.DataFrame(data, columns=cols)
print(df)
#       id  language        date   time  timezone
# 0  32a45       ENG  2017-01-01  11:00   Eastern
# 1  32a47       ENG  2017-01-05   1:00   Central
# 2  32a48       ENG  2017-01-07   3:00   Pacific
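Now you can write the DataFrame to a database table with to_sql, which needs at least a table name and a connection:
df.to_sql(name, con)  # name: target table name, con: database connection (e.g. from sqlite3.connect)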
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_sql.html
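For the sqlite case from your question, the whole round trip might look like the sketch below. The table name "texts" and the in-memory database are assumptions for illustration; pass a file path to sqlite3.connect to get a database on disk.

```python
import sqlite3
import xml.etree.ElementTree as ET

import pandas as pd

strings = ['<text id="32a45" language="ENG" date="2017-01-01" time="11:00" timezone="Eastern">',
           '<text id="32a47" language="ENG" date="2017-01-05" time="1:00" timezone="Central">',
           '<text id="32a48" language="ENG" date="2017-01-07" time="3:00" timezone="Pacific">']
cols = ["id", "language", "date", "time", "timezone"]

# Close each fragment, parse it, and pull out the attributes in column order
data = [[ET.fromstring(s + "</text>").get(col) for col in cols] for s in strings]
df = pd.DataFrame(data, columns=cols)

# ":memory:" and the table name "texts" are assumptions for this example
con = sqlite3.connect(":memory:")
df.to_sql("texts", con, index=False, if_exists="replace")

# Read a couple of columns back to confirm the table was written
rows = con.execute("SELECT id, date FROM texts ORDER BY id").fetchall()
print(rows)  # [('32a45', '2017-01-01'), ('32a47', '2017-01-05'), ('32a48', '2017-01-07')]
con.close()
```

index=False keeps the DataFrame's 0, 1, 2 row index out of the table, and if_exists="replace" lets you rerun the script without an error when the table already exists.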