Python convert CSV to XML with pandas groupby

Question

I have a csv that I need to convert to XML using Python. I'm a novice python dev.

Example CSV data:

Amount,Code
CODE50,1246
CODE50,6290
CODE25,1077
CODE25,9790
CODE100,5319
CODE100,4988

Necessary output XML

<coupon-codes coupon-id="CODE50">
    <code>1246</code>
    <code>1246</code>
<coupon-codes/>
<coupon-codes coupon-id="CODE25">
    <code>1077</code>
    <code>9790</code>
<coupon-codes/>
<coupon-codes coupon-id="CODE100">
    <code>5319</code>
    <code>4988</code>
<coupon-codes/>

My guess is I have to use pandas to pull the csv in, use pandas.groupby to group the Amount column, then push this into an element/subelement to create the xml, then print/push to a xml file. I can't get the groupby to work and don't know how to then push that into the element, then populate the sub element.

1) ElementTree is giving me trouble - please explain 2) What is the logic of the csv to xml conversion - please explain. — balderman
– balderman, Commented Nov 15, 2021 at 18:59
@balderman unsure I understand the question, I mean its giving me trouble in that I can't get it to work, using the scraps of different code i've pulled off the web. I've had success with direct csv to xml wherein each row is an element, but I can't figure out how to group by a column, and have that spit out the appropriate xml. — Jeremy Altman
– Jeremy Altman, Commented Nov 15, 2021 at 19:07
- no need for any external lib. ElementTree is good enough here. See my answer. — balderman
– balderman, Commented Nov 15, 2021 at 19:22

ListenSoftware Louise Ai Agent · Accepted Answer · 2021-11-15 22:33:01Z

1

why not use df.to_xml()? This simple api will generate xml from the dataframe

answered Nov 15, 2021 at 22:33

ListenSoftware Louise Ai Agent

4,3432 gold badges31 silver badges39 bronze badges

Sign up to request clarification or add additional context in comments.

Comments

balderman · Accepted Answer · 2021-11-15 19:48:05Z

0

ElementTree can do it without a trouble. See below

from collections import defaultdict
from xml.etree.ElementTree import Element, SubElement,ElementTree

data = defaultdict(list)

with open('in.txt') as f:
  next(f)
  for line in f:
    parts = line.split(',')
    data[parts[0]].append(parts[1].strip())

root = Element('root')
for k,v in data.items():
  sub = SubElement(root,'coupon-codes',attrib={'coupon-id':k})
  for vv in v:
    sub_v = SubElement(sub,'code')
    sub_v.text = vv

tree = ElementTree(root)
with open('out.xml', 'w') as f:
  tree.write(f, encoding='unicode')

edited Nov 15, 2021 at 19:48

answered Nov 15, 2021 at 19:14

balderman

24k8 gold badges39 silver badges60 bronze badges

3 Comments

Jeremy Altman Over a year ago

this output is perfect! How do I then write that to a file? I'm trying just root.write('test.xml') but am getting an attributeError 'xml.etree.ElementTree.Element' object has no attribute 'write'

balderman Over a year ago

Code was updated - the xml document is written to a file.

Jeremy Altman Over a year ago

you're the best. Thanks!

score 0 · Accepted Answer · 2021-11-15 19:16:20Z

import xml.etree.ElementTree as ET
import pandas as pd

df = pd.read_csv('path/to/csvfile.csv')

root = ET.Element('root')
for coupon_id, df_group in df.groupby('Amount'):
    coupon_codes = ET.Element('coupon-codes', {'coupon-id': coupon_id})
    for coupon_code in df_group['Code']:
        code = ET.Element('code')
        code.text = str(coupon_code)
        coupon_codes.append(code)
    root.append(coupon_codes)

To convert it to a minified string, use this:

min_xml = ET.tostring(root, encoding='utf8')
print(min_xml)

Output:

<root><coupon-codes coupon-id="CODE100"><code>5319</code><code>4988</code></coupon-codes><coupon-codes coupon-id="CODE25"><code>1077</code><code>9790</code></coupon-codes><coupon-codes coupon-id="CODE50"><code>1246</code><code>6290</code></coupon-codes></root>

To convert it to a formatted string, use this:

ET.indent(root)
pretty_xml = ET.tostring(root, encoding='utf8')
print(pretty_xml)

Output:

<root>
  <coupon-codes coupon-id="CODE100">
    <code>5319</code>
    <code>4988</code>
  </coupon-codes>
  <coupon-codes coupon-id="CODE25">
    <code>1077</code>
    <code>9790</code>
  </coupon-codes>
  <coupon-codes coupon-id="CODE50">
    <code>1246</code>
    <code>6290</code>
  </coupon-codes>
</root>

Javier Huerta · Accepted Answer · 2021-11-15 19:52:30Z

0

I can't comment yet, but is better if you put the code that you used for trying to solve the problem.

But the way there is a very similar question here, simple-csv-to-xml-conversion-python

And here is a post for grouping by, group by/cumcount

With this information i achieved it, by doing this

df = pd.DataFrame(['CODE50',
    'CODE50',
    'CODE25',
    'CODE25',
    'CODE100',
    'CODE100'],columns=['code'])

df['amount'] = [1246,
    6290,
    1077,
    9790,
    5319,
    4988]

df['columns'] = df.groupby('code')['amount'].cumcount()
result = df.pivot(index='code', columns='columns')

def convert(df):
    str = ''
    for a in df.index:
        str += '\n'f"""<coupon-codes coupon-id="{a}">"""
        for b in df['amount'].loc[a]:
            str += '\n''       'f"<code>{b}</code>"
    return str
xml = convert(result)
print(xml)

result

<coupon-codes coupon-id="CODE100">
       <code>5319</code>
       <code>4988</code>
<coupon-codes coupon-id="CODE25">
       <code>1077</code>
       <code>9790</code>
<coupon-codes coupon-id="CODE50">
       <code>1246</code>
       <code>6290</code>

answered Nov 15, 2021 at 19:52

Javier Huerta

12812 bronze badges

1 Comment

Javier Huerta Over a year ago

@user17242583 The output is the same, as the only difference is that is not sorted. I will check your answer as i haven't used xml.etree and change your output because the question has changed

Collectives™ on Stack Overflow

Python convert CSV to XML with pandas groupby

4 Answers 4

Comments

3 Comments

Comments

1 Comment

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

Comments

3 Comments

Comments

1 Comment

Your Answer

Sign up or log in

Post as a guest

Linked

Related