How to replace an XML node via Python

Question

I am new to Python and I have a (maybe) stupid problem with XML files (yes, I tried to google a solution, but without results).

I have to write a program that replaces/switches two things. First of all, here is the XML data; it looks like this:

<data='qwerty'>
    <name_it>some_name</name_it>
</data>

<next_data='next_qwerty'>
    <name_it>another_name</name_it>
</next_data>

<next_next_data>
...
</next_next_data>
<next_xyz_data>...
etc.

How can I, in Python, put some_name into data=''? It should look like this:

<data='some_name'>                            #changed from 'qwerty' to some_name
    <name_it>some_name</name_it>
</data>

<next_data='another_name'>                    #changed from 'next_qwerty' to another_name
    <name_it>another_name</name_it>
</next_data>

If it's a stupid question, sorry about that, but I truly googled it and I cannot find a solution.

UPDATE: Here are the few lines of Python code that I wrote:

import io

from xml_file import data

new=""
change = ""  # must exist before the first "<data>" check below

f = io.StringIO(data)  # data loading
for r in f: 
    row = r.rstrip() 
    if 'name_it' in row: 
        change = row[row.index('name_it')] # maybe kind of len() or something
    if "<data>" in row and change: 
        idx = row.index("<data>") + 6
        new += row[:idx] + change + "name_it=\n"
        change = ""  
    else:
        new += row + "\n" # new line

And here is the real XML data:

<?xml version="1.0" encoding="UTF-8"?>
<testsuite name="Setup">
    <testcase classname="Configuration" name="xxx">
        <data>abc_qwe</data>                       #change_me_to_"xxx"
    </testcase>
    <testcase classname="Configuration" name="yyy">
        <data>xyzzzz</data>                        #change_me_to_"yyy"
    </testcase>
</testsuite>

There are many entries like these. The <data>...</data> value should simply be in name="...".

Alright, here is the content of the files. First of all, I generate a CSV file:

Type,Name,Request Count,Failure Count,Median Response Time,Average Response Time,Min Response Time,Max Response Time,Average Content Size,Requests/s,Failures/s,50%,66%,75%,80%,90%,95%,98%,99%,99.9%,99.99%,99.999%,100%
POST,---ON START---LOGIN,33,0,2023.709774017334,2037.008133801547,2023.709774017334,2058.631658554077,6587.515151515152,0.24352046353820625,0.0,2000,2000,2000,2000,2100,2100,2100,2100,2100,2100,2100,2100
GET,Aggregations,15,0,4,5.305735270182292,3.652334213256836,11.571884155273438,6174.2,0.11069111979009376,0.0,4,5,7,7,9,12,12,12,12,12,12,12
GET,Alarms,5,0,5,4.584074020385742,3.754138946533203,5.759000778198242,6173.8,0.03689703993003125,0.0,5,5,5,6,6,6,6,6,6,6,6,6
GET,Analysis Templates,16,0,7,7.806003093719482,3.8690567016601562,13.520479202270508,6174.625,0.11807052777610001,0.0,9,11,11,11,12,14,14,14,14,14,14,14
GET,Boiler Efficiency,15,0,6,6.464735666910808,3.6771297454833984,15.489578247070312,6174.2,0.11069111979009376,0.0,6,6,8,11,11,15,15,15,15,15,15,15
GET,Configuration,14,0,5,6.087354251316616,3.6630630493164062,12.647390365600586,6174.428571428572,0.1033117118040875,0.0,5,6,8,11,11,13,13,13,13,13,13,13

Then I want to convert it to XML:

import csv
from locust_script import methods_count

# Copy the CSV, dropping blank lines
with open('locust_stats.csv') as f, open('locus_statistics.csv', 'w') as out:
    for line in f:
        if not line.isspace():
            print(line.strip())
            out.write(line)

# Read the cleaned CSV into a list of rows
with open('locus_statistics.csv', newline='') as stats:
    data = list(csv.reader(stats))

def convert_row(row, methods):
    # Take the next test case name off the front of the list
    case_name = methods.pop(0)

    return """
            <testcase classname="test_perf" name="%s">
                <Type>%s</Type>
                <Name>%s</Name>
                <Request_Count>%s</Request_Count>
                <Failure_Count>%s</Failure_Count>
                <Median_Response_Time>%s</Median_Response_Time>
            </testcase>""" % (case_name, row[0], row[1], row[2], row[3], row[4])

case_name = methods_count()
# data[0] is the CSV header, so start from the second row
with open('parsed.xml', 'w') as report_save:
    report_save.write("<testsuite name='performance'>"+''.join(convert_row(row, case_name) for row in data[1:1000])+"</testsuite>")

Finally, I want to post-process the generated XML, and for that I have been trying to use the script shown above in the UPDATE.

So my intention, I think :), is this:

            <testcase classname="test_perf" name="%s">
                <Type>%s</Type>
                <Name>%s</Name>

name="" should be the same as <Name> HERE </Name>

Start by sharing 1) a valid XML document and 2) Python code that shows what you have done so far.
– balderman, Commented Aug 10, 2020 at 8:33
I don't think the XML structure is necessary here. I've updated my question with the Python code.
– dorothy, Commented Aug 10, 2020 at 8:40
You can do it using XML parsing. If you share a valid XML doc, I will be able to guide you.
– balderman, Commented Aug 10, 2020 at 8:49

balderman · Accepted Answer · 2020-08-10 12:43:07Z

1

Below:

import xml.etree.ElementTree as ET

xml = '''<testsuite name="Setup">
    <testcase classname="Configuration" name="xxx">
        <data>abc_qwe</data>                      
    </testcase>
    <testcase classname="Configuration" name="yyy">
        <data>xyzzzz</data>                       
    </testcase>
</testsuite>'''


root = ET.fromstring(xml)
test_cases = root.findall('.//testcase')
for test_case in test_cases:
    test_case.find('./data').text = test_case.attrib['name']
    
ET.dump(root)

output

<testsuite name="Setup">
    <testcase classname="Configuration" name="xxx">
        <data>xxx</data>                      
    </testcase>
    <testcase classname="Configuration" name="yyy">
        <data>yyy</data>                       
    </testcase>
</testsuite>

The other way (set the value of the name attribute with the text of data)

import xml.etree.ElementTree as ET

xml = '''<testsuite name="Setup">
    <testcase classname="Configuration" name="xxx">
        <data>data_1</data>                      
    </testcase>
    <testcase classname="Configuration" name="yyy">
        <data>data_2</data>                       
    </testcase>
</testsuite>'''


root = ET.fromstring(xml)
test_cases = root.findall('.//testcase')
for test_case in test_cases:
    test_case.attrib['name'] = test_case.find('./data').text
    
ET.dump(root)
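
For the generated report shown in the question, where each testcase carries a <Name> child rather than <data>, the same pattern should carry over. A minimal sketch, assuming the element names from the question's template (testcase, Name); the helper name sync_name_attribute and the placeholder attribute values are made up for illustration, and testcases without a usable <Name> are left unchanged:

```python
import xml.etree.ElementTree as ET

xml = '''<testsuite name="performance">
    <testcase classname="test_perf" name="placeholder_1">
        <Type>GET</Type>
        <Name>Aggregations</Name>
    </testcase>
    <testcase classname="test_perf" name="placeholder_2">
        <Type>GET</Type>
        <Name>Alarms</Name>
    </testcase>
</testsuite>'''


def sync_name_attribute(root):
    """Copy the text of each <Name> child into its testcase's name attribute."""
    for test_case in root.iter('testcase'):
        name_element = test_case.find('Name')
        # Skip testcases with a missing or empty <Name> instead of raising
        if name_element is not None and name_element.text:
            test_case.set('name', name_element.text.strip())
    return root


root = sync_name_attribute(ET.fromstring(xml))
names = [tc.get('name') for tc in root.iter('testcase')]
print(names)
```

To work on a file instead of a string, ET.parse() returns a tree whose getroot() can be passed to the same function, and the tree's write() method saves the result.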

edited Aug 10, 2020 at 12:43

answered Aug 10, 2020 at 9:02


12 Comments

dorothy Over a year ago

I see the point, but what if I have 1000 testcases? Maybe some iteration over them? Thank you for the solution.

balderman Over a year ago

As you can see, the code does not care if you have 2 testcases or 1000. If my solution helps - feel free to vote up.
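
A sketch to illustrate that point: it builds a document with 1000 testcases (the count, the case_N names and the old_value text are made up for the demonstration) and runs the same loop over it, with no hand-written list of values:

```python
import xml.etree.ElementTree as ET

# Build a test document with 1000 testcases; the names are hypothetical
root = ET.Element('testsuite', name='Setup')
for i in range(1000):
    test_case = ET.SubElement(root, 'testcase',
                              classname='Configuration', name=f'case_{i}')
    ET.SubElement(test_case, 'data').text = 'old_value'

# Same loop as in the answer: each new value comes from the element itself
for test_case in root.findall('.//testcase'):
    test_case.find('./data').text = test_case.attrib['name']

updated = [tc.find('data').text for tc in root.findall('testcase')]
print(len(updated), updated[0], updated[-1])
```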

dorothy Over a year ago

You are of course right, but what about the iteration over the new_values list? If we have 1000 values, I would need to add their names manually, so there is no point :/ Check this out: new_values=[1,2,3,4,5,1231231,avasvas,qweeqw,123123526354,34342342...n]

balderman Over a year ago

Only you (or the logic of your software) can tell the logic of replacing the values. Try to explain the logic of the replacement and maybe there is a smart way to do it.

dorothy Over a year ago

Updated one more time. Sorry for the mess
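
For the updated question, another option might be to build the report with ElementTree while reading the CSV, so that name is taken from the Name column at generation time and no second pass over the XML is needed. A minimal sketch, assuming the column names from the CSV header shown in the question; the inline sample (a subset of the question's columns and rows) and the function name build_report are illustrative only:

```python
import csv
import io
import xml.etree.ElementTree as ET

# Inline sample; a real run would open locust_stats.csv instead
csv_text = """Type,Name,Request Count,Failure Count,Median Response Time
GET,Aggregations,15,0,4
GET,Alarms,5,0,5
"""

COLUMNS = ('Type', 'Name', 'Request Count', 'Failure Count',
           'Median Response Time')


def build_report(csv_file):
    """Return a <testsuite> element with one <testcase> per CSV row."""
    root = ET.Element('testsuite', name='performance')
    for row in csv.DictReader(csv_file):
        # The name attribute comes straight from the Name column
        test_case = ET.SubElement(root, 'testcase',
                                  classname='test_perf', name=row['Name'])
        for column in COLUMNS:
            # XML tag names cannot contain spaces, so use underscores
            child = ET.SubElement(test_case, column.replace(' ', '_'))
            child.text = row[column]
    return root


root = build_report(io.StringIO(csv_text))
print(ET.tostring(root, encoding='unicode'))
```

Building the tree this way also lets ElementTree escape characters such as & or < in the values, which plain string formatting does not do.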
