I want to convert plain structured text files to CSV format using Python.
The input looks like this:
[-------- 1 -------]
Version: 2
Stream: 5
Account: A
[...]
[------- 2 --------]
Version: 3
Stream: 6
Account: B
[...]
The output is supposed to look like this:
Version; Stream; Account; [...]
2; 5; A; [...]
3; 6; B; [...]
I.e. the input consists of structured text records delimited by [----<sequence number>----] and containing <key>: <value> pairs, and the output should be CSV containing one record per line.
I am able to retrieve the <key>: <value> pairs into CSV format via
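import re
# raw strings keep \d and \w intact; the lazy groups split at the first colon and drop surrounding blanks
colonseperated = re.compile(r' *(.+?) *: *(.+?) *$')
fixedfields = re.compile(r'(\d{3} \w{7}) +(.*)')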
but I have trouble recognizing the beginning and end of the structured text records and rewriting them as CSV line records. Furthermore, I would like to be able to separate different types of records, i.e. distinguish between, say, Version: 2 and Version: 3 records.
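To make the goal concrete, here is a minimal sketch of the kind of processing I have in mind, not a finished solution. It assumes that every record starts with a header line of the form [----<sequence number>----], that each field sits on its own line, and that lines matching neither pattern can be skipped. The names header, parse_records, to_csv and group_by are made up for illustration. Note that the csv module only accepts a single-character delimiter, so this writes ";" without the trailing space shown in the desired output.

```python
import csv
import io
import re
from collections import defaultdict

# Assumed header shape: '[', dashes, a sequence number, dashes, ']'
header = re.compile(r'^\[-+\s*(\d+)\s*-+\]$')
# Split at the first colon and drop the blanks around key and value
colonseparated = re.compile(r'^([^:]+?)\s*:\s*(.*)$')


def parse_records(lines):
    """Yield one dict per record; each header line starts a new record."""
    record = None
    for line in lines:
        line = line.strip()
        if header.match(line):
            if record is not None:
                yield record  # the previous record ends where the next begins
            record = {}
            continue
        match = colonseparated.match(line)
        if match and record is not None:
            record[match.group(1)] = match.group(2)
    if record is not None:
        yield record  # the last record ends at end of input


def to_csv(records):
    """Return the records as ';'-separated CSV text, one record per line."""
    records = list(records)
    # Columns in first-seen order; keys missing from a record become empty cells
    fields = list(dict.fromkeys(key for rec in records for key in rec))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, delimiter=';',
                            lineterminator='\n')
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


def group_by(records, key):
    """Collect records into lists keyed by the value of one field."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec.get(key)].append(rec)
    return dict(groups)


sample = """\
[-------- 1 -------]
Version: 2
Stream: 5
Account: A
[------- 2 --------]
Version: 3
Stream: 6
Account: B
"""

records = list(parse_records(sample.splitlines()))
print(to_csv(records))

# One CSV per record type, here distinguished by the 'Version' field
for version, recs in group_by(records, 'Version').items():
    print('Version', version)
    print(to_csv(recs))
```

Grouping by the Version value before writing would give one CSV per record type, which matters if the different types do not share the same set of keys.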