Is there a library or mechanism I can use to flatten the XML file?

Existing:

<A>
    <B>
        <ConnectionType>a</ConnectionType>
        <StartTime>00:00:00</StartTime>
        <EndTime>00:00:00</EndTime>
        <UseDataDictionary>N</UseDataDictionary>
    </B>
</A>

Desired:

A.B.ConnectionType = a
A.B.StartTime = 00:00:00
A.B.EndTime = 00:00:00
A.B.UseDataDictionary = N
    I'd have a look at the xmltodict library in combination with this answer to flatten a dict. Commented Aug 9, 2016 at 14:05

3 Answers

You can do this by parsing the XML file into a dictionary with xmltodict and then flattening that dictionary with the function from this answer.

Example:

# Original code: https://codereview.stackexchange.com/a/21035
from collections import OrderedDict

import xmltodict

def flatten_dict(d):
    """Flatten nested dicts into one dict with dot-separated keys."""
    def items():
        for key, value in d.items():
            if isinstance(value, dict):
                for subkey, subvalue in flatten_dict(value).items():
                    yield key + "." + subkey, subvalue
            else:
                yield key, value

    return OrderedDict(items())

# Convert to dict
with open('test.xml', 'rb') as f:
    xml_content = xmltodict.parse(f)

# Flatten dict
flattened_xml = flatten_dict(xml_content)

# Print in desired format
for k, v in flattened_xml.items():
    print('{} = {}'.format(k, v))

Output:

A.B.ConnectionType = a
A.B.StartTime = 00:00:00
A.B.EndTime = 00:00:00
A.B.UseDataDictionary = N

1 Comment

For anyone like me who doesn't do much Python: you can install the module this script needs by running pip install xmltodict. OrderedDict is part of the standard library's collections module, so it does not need a separate install.

This is not a complete implementation, but you could take advantage of lxml's getelementpath:

xml = """<A>
            <B>
               <ConnectionType>a</ConnectionType>
               <StartTime>00:00:00</StartTime>
               <EndTime>00:00:00</EndTime>
               <UseDataDictionary>N
               <UseDataDictionary2>G</UseDataDictionary2>
               </UseDataDictionary>
            </B>
       </A>"""


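from lxml import etree
from io import StringIO

tree = etree.parse(StringIO(xml))
root = tree.getroot()

# Visit every element below the root and print the ones that carry text.
# Testing the text (not the element) matters: an lxml element is falsy
# when it has no children, so "if child" would skip every leaf.
for node in root.iterdescendants():
    text = (node.text or "").strip()
    if text:
        path = tree.getelementpath(node).replace("/", ".")
        print("{}.{} = {}".format(root.tag, path, text))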

Which gives you:

A.B.ConnectionType = a
A.B.StartTime = 00:00:00
A.B.EndTime = 00:00:00
A.B.UseDataDictionary = N
A.B.UseDataDictionary.UseDataDictionary2 = G

1 Comment

Well, this is a very nice solution, but it does not work with the XML string below: <?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:soap="schemas.xmlsoap.org/soap/envelope" xmlns:xsi="w3.org/2001/XMLSchema-instance" xmlns:xsd="w3.org/2001/XMLSchema"><soap:Body><AddResponse xmlns="tempuri.org/"><AddResult>18</AddResult></…> It fails with: ValueError: Unicode strings with encoding declaration are not supported. Please use bytes input or XML fragments without declaration.
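As the error message itself suggests, handing the parser bytes instead of a str avoids the problem, for example by calling .encode('utf-8') on the string first. The sketch below shows that idea with the standard library's xml.etree.ElementTree rather than lxml (an assumption made so it runs without extra installs). The sample document, `local_name`, and `flatten_xml` are hypothetical names used only for illustration, and namespace prefixes are simply dropped from the keys.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: a str that starts with an encoding declaration
xml_text = (
    '<?xml version="1.0" encoding="utf-8"?>'
    '<Envelope xmlns="urn:example:envelope">'
    '<Body><AddResult>18</AddResult></Body>'
    '</Envelope>'
)


def local_name(tag):
    """Drop the '{namespace}' prefix ElementTree puts on qualified tags."""
    return tag.rsplit('}', 1)[-1]


def flatten_xml(data):
    """Return (dotted_path, text) pairs for every element that has text."""
    root = ET.fromstring(data)
    pairs = []

    def walk(elem, path):
        path = path + [local_name(elem.tag)]
        text = (elem.text or '').strip()
        if text:
            pairs.append(('.'.join(path), text))
        for child in elem:
            walk(child, path)

    walk(root, [])
    return pairs


# Encode to bytes first so the parser reads the declaration itself
pairs = flatten_xml(xml_text.encode('utf-8'))
for path, text in pairs:
    print('{} = {}'.format(path, text))
```

If two namespaces define the same local name, dropping the prefix makes their keys collide, so keep the full tag in that case.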

Here's an improved version of ƘɌỈsƬƠƑ's flatten_dict that also handles nested lists:

def flatten_dict(d):
    def items():
        for key, value in d.items():
            # nested subtree
            if isinstance(value, dict):
                for subkey, subvalue in flatten_dict(value).items():
                    yield '{}.{}'.format(key, subkey), subvalue
            # nested list (repeated elements)
            elif isinstance(value, list):
                for num, elem in enumerate(value):
                    if isinstance(elem, dict):
                        for subkey, subvalue in flatten_dict(elem).items():
                            yield '{}.[{}].{}'.format(key, num, subkey), subvalue
                    else:
                        # list of plain values, e.g. repeated text-only elements
                        yield '{}.[{}]'.format(key, num), elem
            # everything else (only leaves should remain)
            else:
                yield key, value

    # dicts keep insertion order on Python 3.7+
    return dict(items())
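A minimal working sketch of the list-aware flattening, for anyone who wants something to run as-is. It assumes Python 3.7+ (so a plain dict keeps insertion order) and uses a hand-written dict, `parsed`, in the shape xmltodict.parse is expected to produce when an element repeats. The names `flatten` and `parsed` are illustrative only.

```python
def flatten(d):
    """Flatten nested dicts and lists into one dict with dotted keys."""
    flat = {}
    for key, value in d.items():
        if isinstance(value, dict):
            # nested subtree
            for subkey, subvalue in flatten(value).items():
                flat['{}.{}'.format(key, subkey)] = subvalue
        elif isinstance(value, list):
            # repeated elements: index each entry as key.[n]
            for num, elem in enumerate(value):
                prefix = '{}.[{}]'.format(key, num)
                if isinstance(elem, dict):
                    for subkey, subvalue in flatten(elem).items():
                        flat['{}.{}'.format(prefix, subkey)] = subvalue
                else:
                    flat[prefix] = elem
        else:
            # leaf value
            flat[key] = value
    return flat


# Hand-written stand-in for a parsed document with two <B> elements
parsed = {
    'A': {
        'B': [
            {'ConnectionType': 'a', 'StartTime': '00:00:00'},
            {'ConnectionType': 'b', 'StartTime': '01:00:00'},
        ]
    }
}

flattened = flatten(parsed)
for k, v in flattened.items():
    print('{} = {}'.format(k, v))
```

To use it on a real file, replace `parsed` with the result of xmltodict.parse, as in the first answer.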

1 Comment

I am new to Python, so can you add a working example? @alexeykorobov
