I have a question about transforming a text file to XML. I have already cleaned up the text file, and each line now looks like this:

Program: 5 Start: 2013-09-11 05:30:00 Duration 06:15:00 Title: INFOCANALE

And I want my XML output to look like this:

<data>
  <eg>
    <program>Program 5</program>
    <start>2013-09-11 05:30:00</start>
    <duration>06:15:00</duration>
    <title>INFOCANALE</title>
  </eg>
</data>

Can Python convert a text file to XML?
Can you help me with some advice or some code?

  • My text looks like : Program: 5 Start: 2013-09-11 05:30:00 Duration 06:15:00 Title: INFOCANALE Commented Sep 11, 2013 at 11:07
  • And my output will be like: <data><eg><program>Program 5</program><start>2013-09-11 05:30:00</start><duration>06:15:00</duration><title>INFOCANALE</title></eg></data> Commented Sep 11, 2013 at 11:07
  • Is your file format fixed, or can you change it? At the least, you could put a semicolon after each value so it would be easier to parse. Commented Sep 11, 2013 at 11:23
  • Duplicate: stackoverflow.com/questions/17068536/… Commented Sep 11, 2013 at 11:25
  • I can change my text file, and I can rename it to whatever. Commented Sep 11, 2013 at 11:27

1 Answer

I think the easiest way would be to change your file into a CSV file like this:

Program,Start,Duration,Title
5,2013-09-11 05:30:00,06:15:00,INFOCANALE

And then convert it like:

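import csv
from lxml import etree

root = etree.Element('data')

# newline='' is the mode the csv module expects in Python 3
with open("your file name here", newline='') as src:
    rdr = csv.reader(src)
    header = next(rdr)  # first row holds the tag names (rdr.next() is Python 2 only)
    for row in rdr:
        eg = etree.SubElement(root, 'eg')
        for h, v in zip(header, row):
            etree.SubElement(eg, h).text = v

# etree.tostring() returns bytes, so open the output file in binary mode
with open(r"C:\temp\data2.xml", "wb") as f:
    f.write(etree.tostring(root))

# you can also use
# etree.ElementTree(root).write(r"C:\temp\data2.xml")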
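
If the file has to stay in its original one-line-per-program format, a regular expression can pull the four fields out directly. The following is only a minimal sketch under some assumptions: the pattern `LINE_RE` and the helper `lines_to_xml` are hypothetical names, the pattern is modelled on the single sample line from the question (where "Duration" has no colon), and it uses the standard-library `xml.etree.ElementTree` instead of lxml.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical pattern based on the one sample line in the question.
# The colon after "Duration" is optional because the sample omits it.
LINE_RE = re.compile(
    r"Program:\s*(?P<program>\S+)\s+"
    r"Start:\s*(?P<start>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"Duration:?\s*(?P<duration>\d{2}:\d{2}:\d{2})\s+"
    r"Title:\s*(?P<title>.*)"
)


def lines_to_xml(lines):
    """Build a <data> element with one <eg> child per matching line."""
    root = ET.Element("data")
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # skip lines that do not follow the expected layout
        eg = ET.SubElement(root, "eg")
        ET.SubElement(eg, "program").text = "Program " + m.group("program")
        ET.SubElement(eg, "start").text = m.group("start")
        ET.SubElement(eg, "duration").text = m.group("duration")
        ET.SubElement(eg, "title").text = m.group("title").strip()
    return root


sample = ["Program: 5 Start: 2013-09-11 05:30:00 Duration 06:15:00 Title: INFOCANALE"]
root = lines_to_xml(sample)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

To process a real file, pass the open file object instead of `sample`, since iterating over it yields one line at a time. Lines that do not match are silently skipped here; raising an error instead may be preferable if every line is expected to be valid.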
5 Comments

Traceback (most recent call last):
  File "./epg.py", line 53, in <module>
    etree.SubElement(eg, h).text = v
  File "lxml.etree.pyx", line 2659, in lxml.etree.SubElement (src/lxml/lxml.etree.c:53668)
  File "apihelpers.pxi", line 204, in lxml.etree._makeSubElement (src/lxml/lxml.etree.c:12230)
  File "apihelpers.pxi", line 1542, in lxml.etree._tagValidOrRaise (src/lxml/lxml.etree.c:23956)
ValueError: Invalid tag name u' Program 10 '
@car Have you converted your file into CSV? Could you post a few lines as an example?
No, I get errors when I try to convert it, but I'll try with my own code for the XML. I think it will work.
output = open('epg.xml','w')
n = 0
print >> output, '<?xml version="1.0" encoding="utf-8" ?>'+'\t'
print >> output, '<data>'
with open('epg_slo_utf_xml.txt','r') as txt:
    for line in txt:
        if re.search('Program', line) !=None:
            n = n + 1
            e = '<program name=SLO>'+line+'</program>'
        if re.search('Start', line) !=None:
            n = n + 1
            f = '<start>'+line+'</start>'
        if re.search('Duration', line) !=None:
            n = n + 1
            g = '<duration>'+line+'</duration>'
        wo = e + f + g
        print >> output, wo + w
print >> output , '</data>'
@RomanPekar Will it also take the white space into account? I'm using Python 3.2.
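
On the whitespace question and the `Invalid tag name` error in the traceback: the header cell there was `' Program 10 '`, and XML element names cannot contain spaces, so it cannot be used as a tag as-is. One possible workaround is to normalise each header cell before using it as a tag. The helper `to_tag` below is a hypothetical name and only a rough sketch; it keeps ASCII letters, digits, `_`, `.` and `-`, which is stricter than what XML actually permits.

```python
import re


def to_tag(name):
    """Turn an arbitrary header cell into a plausible XML element name."""
    # Trim the ends, lowercase, and join inner whitespace runs with "_".
    tag = re.sub(r"\s+", "_", name.strip().lower())
    # Drop anything outside a conservative ASCII subset of name characters.
    tag = re.sub(r"[^a-z0-9_.-]", "", tag)
    # Element names must not start with a digit, "." or "-".
    if not tag or not (tag[0].isalpha() or tag[0] == "_"):
        tag = "_" + tag
    return tag


print(to_tag(" Program 10 "))
```

In the answer's loop this would be used as `etree.SubElement(eg, to_tag(h))`. Values can be trimmed separately with `v.strip()` if leading or trailing spaces in the text are not wanted.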
