I'm sorry to have to ask again.

I want to convert an XML file to Excel using xml.etree.ElementTree.

Assume my XML looks like this:

<ParameterCluster>
          <Name>AAAAAA</Name>
          <ParameterDefinitionList>
            <ParameterDefinition>
              <Name>LengthMin</Name>
              <Type>UInt8</Type>
            </ParameterDefinition>
            <ParameterDefinition>
              <Name>LengthMax</Name>
              <Type>UInt8</Type>
            </ParameterDefinition>
          </ParameterDefinitionList>

          <VariantImlementationList>
            <VariantImlementation>
              <MajorVariantList>
                <MajorVariant>A_Basis</MajorVariant>
              </MajorVariantList>
              <MinorVariantList>
                        <ParameterValue>
                          <ValueList>
                            <Value>47</Value>
                          </ValueList>
                          <ValueList>
                            <Value>80</Value>
                          </ValueList>
                        </ParameterValue>
              </MinorVariantList>
              <MajorVariantList>
                <MajorVariant>B_Basis</MajorVariant>
                <MajorVariant>C_Basis</MajorVariant>
              </MajorVariantList>
              <MinorVariantList>
                        <ParameterValue>
                          <ValueList>
                            <Value>47</Value>
                          </ValueList>
                          <ValueList>
                            <Value>40</Value>
                          </ValueList>
                        </ParameterValue>
              </MinorVariantList> 
            </VariantImlementation>
          </VariantImlementationList>
        </ParameterCluster>

That means I have three bases (A_Basis, B_Basis, C_Basis).

In A_Basis, the value of LengthMin is 47 and the value of LengthMax is 80.

But in B_Basis and C_Basis, the value of LengthMin is 47 and the value of LengthMax is 40.

So I want to get something like:

{'AAAAAA','LengthMin','UInt8','A_Basis',47}
{'AAAAAA','LengthMax','UInt8','A_Basis',80}
{'AAAAAA','LengthMin','UInt8','B_Basis',47}
{'AAAAAA','LengthMax','UInt8','B_Basis',40}
{'AAAAAA','LengthMin','UInt8','C_Basis',47}
{'AAAAAA','LengthMax','UInt8','C_Basis',40}

Then I can write it to an Excel file. Is it possible to get that kind of list?

  • Are you sure this is valid XML? E.g. the first <MinorVariantList> isn't closed. Commented Jul 23, 2018 at 13:15
  • Hi Andrej Kesely. I'm sorry, today is my first day using Stack Overflow. I've fixed it in my question. Commented Jul 23, 2018 at 13:31

1 Answer

For parsing XML you can use BeautifulSoup instead of xml.etree.ElementTree (the interface is more intuitive).

The parsing is straightforward, assuming each ParameterValue always contains one ValueList per ParameterDefinition, in the same order. First extract the parameter names and types, then iterate over every <MajorVariant> and populate the result list.

If using BeautifulSoup isn't a problem, here is example code:

data = """<ParameterCluster>
              <Name>AAAAAA</Name>
              <ParameterDefinitionList>
                <ParameterDefinition>
                  <Name>LengthMin</Name>
                  <Type>UInt8</Type>
                </ParameterDefinition>
                <ParameterDefinition>
                  <Name>LengthMax</Name>
                  <Type>UInt8</Type>
                </ParameterDefinition>
              </ParameterDefinitionList>

              <VariantImlementationList>
                <VariantImlementation>
                  <MajorVariantList>
                    <MajorVariant>A_Basis</MajorVariant>
                  </MajorVariantList>
                  <MinorVariantList>
                            <ParameterValue>
                              <ValueList>
                                <Value>47</Value>
                              </ValueList>
                              <ValueList>
                                <Value>80</Value>
                              </ValueList>
                            </ParameterValue>
                  </MinorVariantList>
                  <MajorVariantList>
                    <MajorVariant>B_Basis</MajorVariant>
                    <MajorVariant>C_Basis</MajorVariant>
                  </MajorVariantList>
                  <MinorVariantList>
                            <ParameterValue>
                              <ValueList>
                                <Value>47</Value>
                              </ValueList>
                              <ValueList>
                                <Value>40</Value>
                              </ValueList>
                            </ParameterValue>
                  </MinorVariantList>
                </VariantImlementation>
              </VariantImlementationList>
            </ParameterCluster>"""


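from bs4 import BeautifulSoup  # the 'xml' parser also needs the lxml package
from pprint import pprint

soup = BeautifulSoup(data, 'xml')

# The first <Name> in the document is the cluster name. Each definition
# contributes a [cluster name, parameter name, parameter type] prefix.
name, types = soup.select_one('Name'), []
for n, t in zip(soup.select('ParameterDefinitionList Name'), soup.select('ParameterDefinitionList Type')):
    types.append([name.text, n.text, t.text])

# Pair the MajorVariantList and MinorVariantList elements in document order.
rv = []
for major, minor in zip(soup.select('MajorVariantList'), soup.select('MajorVariantList ~ MinorVariantList')):
    for mj in major.select('MajorVariant'):
        # The i-th <Value> belongs to the i-th parameter definition.
        for i, mn in enumerate(minor.select('Value')):
            rv.append(types[i] + [mj.text, mn.text])

pprint(rv, width=120)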

Output:

[['AAAAAA', 'LengthMin', 'UInt8', 'A_Basis', '47'],
 ['AAAAAA', 'LengthMax', 'UInt8', 'A_Basis', '80'],
 ['AAAAAA', 'LengthMin', 'UInt8', 'B_Basis', '47'],
 ['AAAAAA', 'LengthMax', 'UInt8', 'B_Basis', '40'],
 ['AAAAAA', 'LengthMin', 'UInt8', 'C_Basis', '47'],
 ['AAAAAA', 'LengthMax', 'UInt8', 'C_Basis', '40']]
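
Since the question asked for xml.etree.ElementTree, the same steps can probably be done with the standard library alone. The following is only a sketch under the same assumptions as above (the i-th MajorVariantList pairs with the i-th MinorVariantList, and the i-th Value belongs to the i-th ParameterDefinition). The helper names cluster_rows and rows_to_csv and the CSV column headers are made up for illustration, and the XML is a shortened copy of the sample from the question:

```python
import csv
import io
import xml.etree.ElementTree as ET

# Shortened copy of the question's XML (same structure, tag spelling kept as-is).
XML = """<ParameterCluster>
  <Name>AAAAAA</Name>
  <ParameterDefinitionList>
    <ParameterDefinition><Name>LengthMin</Name><Type>UInt8</Type></ParameterDefinition>
    <ParameterDefinition><Name>LengthMax</Name><Type>UInt8</Type></ParameterDefinition>
  </ParameterDefinitionList>
  <VariantImlementationList>
    <VariantImlementation>
      <MajorVariantList><MajorVariant>A_Basis</MajorVariant></MajorVariantList>
      <MinorVariantList><ParameterValue>
        <ValueList><Value>47</Value></ValueList>
        <ValueList><Value>80</Value></ValueList>
      </ParameterValue></MinorVariantList>
      <MajorVariantList>
        <MajorVariant>B_Basis</MajorVariant>
        <MajorVariant>C_Basis</MajorVariant>
      </MajorVariantList>
      <MinorVariantList><ParameterValue>
        <ValueList><Value>47</Value></ValueList>
        <ValueList><Value>40</Value></ValueList>
      </ParameterValue></MinorVariantList>
    </VariantImlementation>
  </VariantImlementationList>
</ParameterCluster>"""


def cluster_rows(cluster):
    """Return [cluster, parameter, type, major variant, value] rows (hypothetical helper)."""
    cluster_name = cluster.findtext("Name")
    # (name, type) of every parameter, in document order.
    definitions = [
        (d.findtext("Name"), d.findtext("Type"))
        for d in cluster.findall("ParameterDefinitionList/ParameterDefinition")
    ]
    rows = []
    for impl in cluster.iter("VariantImlementation"):
        majors = impl.findall("MajorVariantList")
        minors = impl.findall("MinorVariantList")
        # Assumption: the i-th MajorVariantList pairs with the i-th MinorVariantList.
        for major_list, minor_list in zip(majors, minors):
            values = [v.text for v in minor_list.iter("Value")]
            for major in major_list.findall("MajorVariant"):
                # Assumption: the i-th Value belongs to the i-th definition.
                for (pname, ptype), value in zip(definitions, values):
                    rows.append([cluster_name, pname, ptype, major.text, value])
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text, which Excel can open (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Cluster", "Parameter", "Type", "MajorVariant", "Value"])
    writer.writerows(rows)
    return buffer.getvalue()


rows = cluster_rows(ET.fromstring(XML))
print(rows_to_csv(rows))
```

To get a file instead of a string, the same csv.writer can be pointed at open("out.csv", "w", newline=""). A real .xlsx workbook would need a third-party package such as openpyxl, which is not shown here.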

4 Comments

Thank you Andrej Kesely. I'm now installing the BeautifulSoup and pprint packages. I'll try it in my script. It's awesome, thank you. Have a nice day.
Hi Andrej Kesely. I wrote my script as:
    infile = open("ALL.xml","r")
    contents = infile.read()
    soup = BeautifulSoup(contents,'xml')
and it raises this error:
    UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 2034240: character maps to <undefined>
I'm sorry, I don't have any idea about this. Have you ever faced it before?
@syo That is an encoding error; your file probably contains some non-ASCII characters. Look around Stack Overflow for how to properly decode/encode the string.
Hi Andrej Kesely. It worked out and looks really nice. Thank you, and sorry for the late feedback. Have a nice day.
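
On the UnicodeDecodeError from the comments: one common cause is opening a UTF-8 file in text mode on a platform whose default codec is cp1252, where byte 0x9d is undefined. The thread does not say how the asker fixed it, so the following is only a sketch that assumes the file is UTF-8. The temporary ALL.xml it writes is a hypothetical stand-in containing U+201D, whose UTF-8 encoding ends in byte 0x9d:

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical stand-in for ALL.xml: UTF-8 text containing U+201D,
# which encodes to the bytes E2 80 9D.
sample = '<?xml version="1.0" encoding="UTF-8"?><Name>AAAAAA\u201d</Name>'
path = os.path.join(tempfile.mkdtemp(), "ALL.xml")
with open(path, "wb") as f:
    f.write(sample.encode("utf-8"))

# Option 1: state the encoding instead of relying on the platform default.
with open(path, "r", encoding="utf-8") as infile:
    contents = infile.read()

# Option 2: read raw bytes and let the XML parser honour the declaration.
with open(path, "rb") as infile:
    root = ET.fromstring(infile.read())

# Reproduce the failure mode: decoding the same bytes as cp1252 raises.
try:
    sample.encode("utf-8").decode("cp1252")
    cp1252_failed = False
except UnicodeDecodeError:
    cp1252_failed = True

print(repr(root.text), cp1252_failed)
```

BeautifulSoup also accepts bytes or an open file object as markup, so reading the file in "rb" mode and passing the result to BeautifulSoup(..., 'xml') is another way to avoid decoding it with the wrong codec.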
