How to generate an in-memory data table in Python?

Question

I am working with XML and need to loop through it, comparing each value against the rows before or after it.

<TRANS DESCRIPTION ="" NAME ="EXPRR" >
            <FIELD EXPR ="A1" NAME ="SD" PORTTYPE ="INPUT/OUTPUT"/>
            <FIELD EXPR ="V" NAME ="DDS" PORTTYPE ="VARIABLE"/>
            <FIELD EXPR ="C" NAME ="SSS" PORTTYPE ="OUTPUT"/>
            <FIELD EXPR ="SD" NAME ="SS" PORTTYPE ="VARIABLE"/>
            <FIELD EXPR ="XX" NAME ="EEEE" PORTTYPE ="INPUT/OUTPUT"/>
</TRANS>

I would like to load this into a temporary in-memory structure where I can look through the values and add a sequence number. For example:

seq  key  value
1    A1   SD
2    V    DDS
3    C    SSS
4    SD   SS
5    XX   EEEE

Once I have this, I need to check whether each value also appears in the rows below it. For example, SD (the value in row 1) appears again as the key in row 4, and so on.

Is there a data structure I can use for this in Python 3?

Nk03 · Accepted Answer · 2021-06-23 19:47:49Z


ONE WAY:

import xml.etree.ElementTree as ET
import xmltodict
import pandas as pd

tree = ET.parse('<your xml file path here>')
xml_data = tree.getroot()
# here you can change the encoding type to be able to set it to the one you need
xmlstr = ET.tostring(xml_data, encoding='utf-8', method='xml')

data_dict = dict(xmltodict.parse(xmlstr))
# drop() needs the keyword form; the positional axis argument was removed in pandas 2.0
df = pd.DataFrame(data_dict['TRANS']['FIELD']).drop(columns='@PORTTYPE')
print(df)

OUTPUT:

  @EXPR @NAME
0    A1    SD
1     V   DDS
2     C   SSS
3    SD    SS
4    XX  EEEE
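
To get closer to the seq/key/value layout asked for in the question, the columns can be renamed and the index shifted to start at 1. This is only a sketch: the `fields` list is hard-coded to mirror the shape of `data_dict['TRANS']['FIELD']` so it runs without an XML file or xmltodict, and the names `fields` and `table` are illustrative, not part of the answer above.

```python
import pandas as pd

# Records shaped like data_dict['TRANS']['FIELD'] from the answer above
# (hard-coded here as an assumption so the sketch runs on its own).
fields = [
    {"@EXPR": "A1", "@NAME": "SD", "@PORTTYPE": "INPUT/OUTPUT"},
    {"@EXPR": "V", "@NAME": "DDS", "@PORTTYPE": "VARIABLE"},
    {"@EXPR": "C", "@NAME": "SSS", "@PORTTYPE": "OUTPUT"},
    {"@EXPR": "SD", "@NAME": "SS", "@PORTTYPE": "VARIABLE"},
    {"@EXPR": "XX", "@NAME": "EEEE", "@PORTTYPE": "INPUT/OUTPUT"},
]

table = (
    pd.DataFrame(fields)
    .drop(columns="@PORTTYPE")
    .rename(columns={"@EXPR": "key", "@NAME": "value"})
)

# Replace the default 0-based index with a 1-based sequence named 'seq'.
table.index = pd.RangeIndex(start=1, stop=len(table) + 1, name="seq")

print(table)
```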


sammywemmy · Accepted Answer · 2021-06-23 20:53:17Z

You could use collections.defaultdict to collate your data before creating a DataFrame:

data = """<TRANS DESCRIPTION ="" NAME ="EXPRR" >
            <FIELD EXPR ="A1" NAME ="SD" PORTTYPE ="INPUT/OUTPUT"/>
            <FIELD EXPR ="V" NAME ="DDS" PORTTYPE ="VARIABLE"/>
            <FIELD EXPR ="C" NAME ="SSS" PORTTYPE ="OUTPUT"/>
            <FIELD EXPR ="SD" NAME ="SS" PORTTYPE ="VARIABLE"/>
            <FIELD EXPR ="XX" NAME ="EEEE" PORTTYPE ="INPUT/OUTPUT"/>
          </TRANS> 
       """

import xml.etree.ElementTree as ET
from collections import defaultdict

import pandas as pd

root = ET.fromstring(data)

# Collect each FIELD's EXPR and NAME attributes into parallel lists.
collection = defaultdict(list)

for child in root:
    collection['key'].append(child.attrib['EXPR'])
    collection['value'].append(child.attrib['NAME'])

pd.DataFrame(collection).rename_axis('seq')
 
    key value
seq          
0    A1    SD
1     V   DDS
2     C   SSS
3    SD    SS
4    XX  EEEE
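
Neither table by itself covers the second half of the question, checking whether a value shows up again in the rows below. One possible sketch using only the standard library follows. It assumes "exists in the below rows" means a row's value (NAME) is used as the key (EXPR) of a later row, as with SD in the example; the helper name `later_rows_using` is hypothetical.

```python
import xml.etree.ElementTree as ET

xml_text = """<TRANS DESCRIPTION="" NAME="EXPRR">
    <FIELD EXPR="A1" NAME="SD" PORTTYPE="INPUT/OUTPUT"/>
    <FIELD EXPR="V" NAME="DDS" PORTTYPE="VARIABLE"/>
    <FIELD EXPR="C" NAME="SSS" PORTTYPE="OUTPUT"/>
    <FIELD EXPR="SD" NAME="SS" PORTTYPE="VARIABLE"/>
    <FIELD EXPR="XX" NAME="EEEE" PORTTYPE="INPUT/OUTPUT"/>
</TRANS>"""

root = ET.fromstring(xml_text)

# (seq, key, value) tuples, numbered from 1 as in the question.
rows = [
    (seq, field.attrib["EXPR"], field.attrib["NAME"])
    for seq, field in enumerate(root.iter("FIELD"), start=1)
]


def later_rows_using(rows, seq, value):
    """Return the seq numbers of rows after `seq` whose key equals `value`."""
    return [s for s, key, _ in rows if s > seq and key == value]


for seq, key, value in rows:
    hits = later_rows_using(rows, seq, value)
    if hits:
        print(f"row {seq}: value {value!r} is used as a key in row(s) {hits}")
```

A plain list of tuples keeps the rows in document order, which is all this comparison needs. For large files, a dict mapping each key to its seq numbers would avoid rescanning the list for every row.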
