I am trying to concatenate the XML attributes of each item, but my code only picks up the first name/value pair and then moves on to the attributes of the next item. The XML file and the outputs below show what I mean.

import os, csv
from xml.etree import ElementTree


file_name = 'data.xml'
full_file = os.path.abspath(os.path.join('xml', file_name))
dom = ElementTree.parse(full_file)

with open('output.csv', 'w', newline="") as f: 
    writer = csv.writer(f)
    writer.writerow(['fruitNumber', 'categoryNumber', 'Group', 'AttributeValueName'])



    for d in dom.findall('//item'):
        part = d.find('.//item-number').text
        name = d.find('.//name').text
        value = d.find('.//value').text
        writer.writerow([part, '', '', name + ":" + value])

Here is my xml file:

<?xml version="1.0"?>
<all>
<items>
<item>
<item-number>449</item-number>
<attributes>
<attribute>
<name>FRUIT</name>
<value>Lemon</value>
</attribute>
<attribute>
<name>COLOR</name>
<value>Yellow</value>
</attribute>
</attributes>
</item>
<item>
<item-number>223</item-number>
<attributes>
<attribute>
<name>FRUIT</name>
<value>Orange</value>
</attribute>
<attribute>
<name>COLOR</name>
<value>Orange</value>
</attribute>
</attributes>
</item>
</items>
</all>

Here is what I get:

fruitNumber categoryNumber  Group   AttributeValueName
449                                 FRUIT:Lemon
223                                 FRUIT:Orange

Here is what I am trying to get:

fruitNumber categoryNumber  Group   AttributeValueName
449                                 FRUIT:Lemon|COLOR:Yellow
223                                 FRUIT:Orange|COLOR:Orange

Thanks for your help in advance!!!

  • Why do you use find() instead of findall() if you want multiple matches? Your code is explicitly asking for only the first name and first value of each item. Commented Jul 25, 2017 at 20:29
  • If you want a row of output per attribute element, perhaps you should search for those at the top level of your loop, instead of finding items first. Commented Jul 25, 2017 at 20:31
  • I use find() instead, because findall gives me an error message: list object has no attribute text Commented Jul 25, 2017 at 20:35
  • Yes, because you get a list of items, and need to retrieve the text from each one. You should be able to figure out how to iterate through a list and call item.text for each item in it without our help. Commented Jul 25, 2017 at 20:42
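The difference discussed in these comments can be seen on a small inline document. This is only a minimal sketch, assuming a trimmed copy of one item from the question parsed from a string; the `sample` string and the variable names are illustrative, not part of the original code. find() returns the first matching element, while findall() returns a list, so `.text` has to be read from each element in that list.

```python
from xml.etree import ElementTree

# Trimmed copy of one <item> from the question, inlined for illustration.
sample = """
<item>
  <item-number>449</item-number>
  <attributes>
    <attribute><name>FRUIT</name><value>Lemon</value></attribute>
    <attribute><name>COLOR</name><value>Yellow</value></attribute>
  </attributes>
</item>
""".strip()
item = ElementTree.fromstring(sample)

# find() stops at the first match, so only the FRUIT name is seen.
first_name = item.find('.//name').text

# findall() returns a list of elements; the list itself has no .text,
# so read .text from each element in turn.
all_names = [n.text for n in item.findall('.//name')]
all_values = [v.text for v in item.findall('.//value')]

print(first_name)
print(all_names)
print(all_values)
```

Calling `item.findall('.//name').text` directly is what raises the "list object has no attribute text" error mentioned above.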

1 Answer

You're only reading the first attribute of each item. You need to additionally search the attributes under the item, collect them, then format them as you require when writing the row:

import os, csv
from xml.etree import ElementTree


file_name = 'data.xml'
full_file = os.path.abspath(os.path.join('xml', file_name))
dom = ElementTree.parse(full_file)

with open('output.csv', 'w', newline="") as f: 
    writer = csv.writer(f)
    writer.writerow(['fruitNumber', 'categoryNumber', 'Group', 'AttributeValueName'])

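    # One CSV row per <item>.
    for d in dom.findall('.//item'):
        part = d.find('.//item-number').text
        # Collect every name:value pair under this item, not just the first.
        pairs = []
        for a in d.findall('.//attribute'):
            name = a.find('.//name').text
            value = a.find('.//value').text
            pairs.append('{}:{}'.format(name, value))
        # Join the pairs with '|' into the single AttributeValueName column.
        writer.writerow([part, '', '', '|'.join(pairs)])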

Output

fruitNumber,categoryNumber,Group,AttributeValueName
449,,,FRUIT:Lemon|COLOR:Yellow
223,,,FRUIT:Orange|COLOR:Orange
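
To try the same idea without any files on disk, here is a self-contained sketch. It assumes the question's XML is inlined as a string and the CSV is written to an in-memory buffer; `items_to_rows` and `XML` are hypothetical names introduced for illustration. It uses `findtext()` and `iter()` from `xml.etree.ElementTree` as a slightly shorter alternative to `find(...).text`, not as a replacement for the answer above.

```python
import csv
import io
from xml.etree import ElementTree

# The question's sample data, inlined so the example runs on its own.
XML = """<all><items>
<item><item-number>449</item-number><attributes>
<attribute><name>FRUIT</name><value>Lemon</value></attribute>
<attribute><name>COLOR</name><value>Yellow</value></attribute>
</attributes></item>
<item><item-number>223</item-number><attributes>
<attribute><name>FRUIT</name><value>Orange</value></attribute>
<attribute><name>COLOR</name><value>Orange</value></attribute>
</attributes></item>
</items></all>"""


def items_to_rows(xml_text):
    """Return one row per <item>, with all name:value pairs joined by '|'."""
    root = ElementTree.fromstring(xml_text)
    rows = []
    for item in root.iter('item'):
        number = item.findtext('item-number')
        pairs = ['{}:{}'.format(a.findtext('name'), a.findtext('value'))
                 for a in item.iter('attribute')]
        rows.append([number, '', '', '|'.join(pairs)])
    return rows


rows = items_to_rows(XML)

# Write to an in-memory buffer instead of output.csv.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(['fruitNumber', 'categoryNumber', 'Group', 'AttributeValueName'])
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text, end='')
```

Keeping the row building separate from the file writing makes it easier to check the joined column before anything is written out.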