Concatenate values from XML file in Python

Question

I am trying to concatenate the name/value attribute pairs of each item in an XML file, but my code only picks up the first pair of each item and then moves on to the next item. The code and the XML file below should make this clearer.

import os, csv
from xml.etree import ElementTree


file_name = 'data.xml'
full_file = os.path.abspath(os.path.join('xml', file_name))
dom = ElementTree.parse(full_file)

with open('output.csv', 'w', newline="") as f:
    writer = csv.writer(f)
    writer.writerow(['fruitNumber', 'categoryNumber', 'Group', 'AttributeValueName'])

    for d in dom.findall('//item'):
        part = d.find('.//item-number').text
        name = d.find('.//name').text
        value = d.find('.//value').text
        writer.writerow([part, '', '', name + ":" + value])

Here is my xml file:

<?xml version="1.0"?>
<all>
<items>
<item>
<item-number>449</item-number>
<attributes>
<attribute>
<name>FRUIT</name>
<value>Lemon</value>
</attribute>
<attribute>
<name>COLOR</name>
<value>Yellow</value>
</attribute>
</attributes>
</item>
<item>
<item-number>223</item-number>
<attributes>
<attribute>
<name>FRUIT</name>
<value>Orange</value>
</attribute>
<attribute>
<name>COLOR</name>
<value>Orange</value>
</attribute>
</attributes>
</item>
</items>
</all>

Here is what I get:

fruitNumber categoryNumber  Group   AttributeValueName
449                                 FRUIT:Lemon
223                                 FRUIT:Orange

Here is what I am trying to get:

fruitNumber categoryNumber  Group   AttributeValueName
449                                 FRUIT:Lemon|COLOR:Yellow
223                                 FRUIT:Orange|COLOR:Orange

Thanks for your help in advance!!!

Why do you use find() instead of findall() if you want multiple matches? Your code is explicitly asking for only the first name and first value of each item.
– Charles Duffy, Commented Jul 25, 2017 at 20:29
If you want a row of output per attribute element, perhaps you should search for those at the top level of your loop, instead of finding items first.
– Charles Duffy, Commented Jul 25, 2017 at 20:31
I use find() because findall() gives me an error message: 'list' object has no attribute 'text'.
– Alexander, Commented Jul 25, 2017 at 20:35
Yes, because you get a list of elements, and need to retrieve the text from each one. You should be able to figure out how to iterate through a list and call item.text for each item in it without our help.
– Charles Duffy, Commented Jul 25, 2017 at 20:42
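
To illustrate the point made in the comments, here is a minimal sketch of the difference between find() and findall(). It assumes an inline copy of the first item from the question's XML (the variable names xml_text, first_name and all_names are made up for this example): find() returns only the first matching element, while findall() returns a list whose elements each have their own .text.

```python
from xml.etree import ElementTree

# Inline copy of the first <item> from the question's XML (assumption:
# parsed from a string here so the example runs without a file on disk).
xml_text = """<item>
<item-number>449</item-number>
<attributes>
<attribute><name>FRUIT</name><value>Lemon</value></attribute>
<attribute><name>COLOR</name><value>Yellow</value></attribute>
</attributes>
</item>"""

item = ElementTree.fromstring(xml_text)

# find() stops at the first match, so only one name is ever seen.
first_name = item.find('.//name').text

# findall() returns a list of elements; read .text from each element,
# not from the list itself.
all_names = [el.text for el in item.findall('.//name')]

print(first_name)  # FRUIT
print(all_names)   # ['FRUIT', 'COLOR']
```

Calling .text directly on the result of findall() fails because the list has no such attribute; iterating over the list avoids the error.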

Mark Tolonen · Accepted Answer · 2017-07-25 20:47:38Z

You're only reading the first attribute of each item. You need to additionally search the attributes under the item, collect them, then format them as you require when writing the row:

import os, csv
from xml.etree import ElementTree


file_name = 'data.xml'
full_file = os.path.abspath(os.path.join('xml', file_name))
dom = ElementTree.parse(full_file)

with open('output.csv', 'w', newline="") as f:
    writer = csv.writer(f)
    writer.writerow(['fruitNumber', 'categoryNumber', 'Group', 'AttributeValueName'])

    # One CSV row per <item>.
    for d in dom.findall('.//item'):
        part = d.find('.//item-number').text
        # Collect every name:value pair under this item, not just the first.
        L = []
        for a in d.findall('.//attribute'):
            name = a.find('.//name').text
            value = a.find('.//value').text
            L.append('{}:{}'.format(name,value))
        # Join the collected pairs with '|' in the last column.
        writer.writerow([part, '' , '', '|'.join(L)])

Output

fruitNumber,categoryNumber,Group,AttributeValueName
449,,,FRUIT:Lemon|COLOR:Yellow
223,,,FRUIT:Orange|COLOR:Orange
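
For anyone who wants to try the approach without creating files, the following is a rough self-contained sketch of the same idea. It is not part of the original answer: it assumes the XML is supplied as a string, writes the CSV into an in-memory buffer instead of output.csv, and uses a hypothetical helper named item_rows together with findtext() in place of find(...).text.

```python
import csv
import io
from xml.etree import ElementTree

# The question's XML, inlined as a string (assumption for this sketch).
XML_TEXT = """<all>
<items>
<item>
<item-number>449</item-number>
<attributes>
<attribute><name>FRUIT</name><value>Lemon</value></attribute>
<attribute><name>COLOR</name><value>Yellow</value></attribute>
</attributes>
</item>
<item>
<item-number>223</item-number>
<attributes>
<attribute><name>FRUIT</name><value>Orange</value></attribute>
<attribute><name>COLOR</name><value>Orange</value></attribute>
</attributes>
</item>
</items>
</all>"""


def item_rows(root):
    """Yield one CSV row per <item>, joining its name:value pairs with '|'.

    Hypothetical helper, introduced only for this example.
    """
    for item in root.iter('item'):
        number = item.findtext('item-number')
        pairs = [
            '{}:{}'.format(attr.findtext('name'), attr.findtext('value'))
            for attr in item.iter('attribute')
        ]
        yield [number, '', '', '|'.join(pairs)]


root = ElementTree.fromstring(XML_TEXT)

# Write to an in-memory buffer so the result can be inspected directly.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator='\n')
writer.writerow(['fruitNumber', 'categoryNumber', 'Group', 'AttributeValueName'])
writer.writerows(item_rows(root))

csv_text = buffer.getvalue()
print(csv_text)
```

Keeping the row-building logic in a generator separates the XML traversal from the CSV writing, so the same rows could be sent to a real file by passing an open file object to csv.writer instead of the buffer.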
