I have an XML file like this:

<?xml version="1.0" encoding="UTF-8"?>
<Main xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:gml="http://www.opengis.net/gml/3.2" xmlns="http://cnig.gouv.fr/pcrs" gml:id="PlanCorpsRueSimplifie.1" version="2.0">
    <gml:boundedBy>
    </gml:boundedBy>
    <featureMember>
        <EmpriseEchangePCRS gml:id="EmpriseEchangePCRS.12189894">
            <datePublication>2020-05-13</datePublication>
            <type>Cellules</type>
            <geometrie>
                <gml:MultiSurface gml:id="EmpriseEchangePCRS.12189894-0" srsName="EPSG:3944" srsDimension="3">
                    <gml:surfaceMember>
                        <gml:Surface gml:id="EmpriseEchangePCRS.12189894-1">
                            <gml:patches>
                            </gml:patches>
                        </gml:Surface>

I would like to convert this file into a JSON file. I tried the following, but I always get the same error:

import xmltodict
import xml.etree.ElementTree as ET

root = ET.fromstring(open('JeuxTestv2.gml').read())

print(xmltodict.parse(root)['Main'])

Error:

Traceback (most recent call last):
  File "C:\Users\xmltodict.py", line 6, in <module>
    print(xmltodict.parse(root)['Main'])
  File "C:\Users\xmltodict.py", line 327, in parse
    parser.Parse(xml_input, True)
TypeError: a bytes-like object is required, not 'xml.etree.ElementTree.Element'
  • I think you should feed the XML string (or the open file object) to xmltodict.parse directly, without the need to use ElementTree. E.g. print(xmltodict.parse(open('JeuxTestv2.gml'))). Commented Jun 12, 2020 at 22:28
  • @myrmica thank you it works ! Commented Jun 12, 2020 at 22:48

1 Answer

I am using Python 3.7.6

ET.fromstring() parses XML that is already held in a string. It does not read a file, so passing it a file path fails:

import os
import xml.etree.ElementTree as et
xml_doc_path = os.path.abspath(r"C:\dir1\path\to\file\example.xml")
root = et.fromstring(xml_doc_path)
print(root)

This example raises the following error, because the path itself is parsed as if it were XML:

xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1, column 2
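The difference between the two entry points can be checked with a small sketch that uses only the standard library. The XML snippet and the temporary file name here are made up for illustration; the exact wording of the ParseError message depends on the path, so the sketch only checks that one is raised.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical XML content used only for this illustration.
xml_text = "<root><child>data</child></root>"

# fromstring() expects the XML markup itself.
root_from_text = ET.fromstring(xml_text)

with tempfile.TemporaryDirectory() as tmp_dir:
    xml_path = os.path.join(tmp_dir, "example.xml")
    with open(xml_path, "w", encoding="utf-8") as xml_file:
        xml_file.write(xml_text)

    # parse() expects a file path (or file object) and reads the file.
    root_from_file = ET.parse(xml_path).getroot()

    # Handing the path to fromstring() fails: a path is not XML markup.
    try:
        ET.fromstring(xml_path)
        path_parsed = True
    except ET.ParseError:
        path_parsed = False

print(root_from_text.tag, root_from_file.tag, path_parsed)
```
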

Instead, I parsed the file with ET.parse() and then used ET.tostring() to serialize the tree. The serialized XML is a valid argument for xmltodict.parse(). See the ET.tostring() documentation for details.
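As a quick illustration of what ET.tostring() hands back, the sketch below serializes a small, made-up element tree twice. With a byte encoding such as 'UTF-8' the result is a bytes object, which matches the "bytes-like object" the traceback in the question asks for; with encoding="unicode" the result is a str.

```python
import xml.etree.ElementTree as ET

# Hypothetical element tree used only for this illustration.
root = ET.fromstring("<root><item id='1'>text</item></root>")

# A byte encoding yields bytes; encoding="unicode" yields str.
xml_bytes = ET.tostring(root, encoding="UTF-8", method="xml")
xml_str = ET.tostring(root, encoding="unicode", method="xml")

print(type(xml_bytes).__name__)
print(type(xml_str).__name__)

# The serialized bytes can be parsed again, so no content is lost here.
reparsed = ET.fromstring(xml_bytes)
print(reparsed.find("item").text)
```
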

The code below parses an XML file and writes the result to a JSON file. I used my own XML example. Make sure all the XML tags are closed properly.

XML:

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <element1 attribute1 = 'first attribute'>
    </element1>
    <element2 attribute1 = 'second attribute'>
        some data
    </element2>
</root>

PYTHON CODE:

import json
import os
import xml.etree.ElementTree as et

import xmltodict

xml_doc_path = os.path.abspath(r"C:\directory\path\to\file\example.xml")

# Parse the file from disk and get the root element
xml_tree = et.parse(xml_doc_path)
root = xml_tree.getroot()

# Serialize the tree to UTF-8 encoded bytes
to_string = et.tostring(root, encoding='UTF-8', method='xml')

# Convert the serialized XML into a dictionary
xml_to_dict = xmltodict.parse(to_string)

# Write the dictionary out as indented JSON
with open("json_data.json", "w") as json_file:
    json.dump(xml_to_dict, json_file, indent=2)

OUTPUT: The above code will create the following JSON file:

{
  "root": {
    "element1": {
      "@attribute1": "first attribute"
    },
    "element2": {
      "@attribute1": "second attribute",
      "#text": "some data"
    }
  }
}
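As the comment on the question suggests, the ElementTree round trip is optional: xmltodict.parse() can take the XML text directly. The sketch below is a minimal version of that idea. It embeds the sample XML as a string so it is self-contained; for a real file, reading its contents first (for example in binary mode) should work the same way, though that part is an assumption and not shown here.

```python
import json

import xmltodict  # third-party package, installed separately

# The sample XML from this answer, embedded as a string for illustration.
xml_text = """<?xml version="1.0" encoding="UTF-8"?>
<root>
    <element1 attribute1="first attribute"></element1>
    <element2 attribute1="second attribute">
        some data
    </element2>
</root>"""

# Pass the XML text itself, not an ElementTree Element.
data = xmltodict.parse(xml_text)

# Serialize the resulting dictionary as indented JSON.
json_text = json.dumps(data, indent=2)
print(json_text)
```

Attributes appear under keys prefixed with "@" and element text under "#text", as in the JSON output shown above.
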