
I want to replace the string 'bicycle' with 'bike' in the XML file below, but only inside the name tags; the image filename must stay unchanged. I tried re.sub on the lines returned by .readlines(), but that did not work. Can anyone advise how to do this in the most efficient way? A good explanation would be of much help.

<annotation>
    <folder>images</folder>
    <filename>bicycle (10).jpg</filename>
    <path>C:\Users\Merida\Desktop\Bicycle\images\bicycle (10).jpg</path>
    <source>
        <database>Unknown</database>
    </source>
    <size>
        <width>960</width>
        <height>636</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>bicycle</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>68</xmin>
            <ymin>24</ymin>
            <xmax>755</xmax>
            <ymax>632</ymax>
        </bndbox>
    </object>
    <object>
        <name>bicycle</name>
        <pose>Unspecified</pose>
        <truncated>1</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1</xmin>
            <ymin>28</ymin>
            <xmax>189</xmax>
            <ymax>435</ymax>
        </bndbox>
    </object>
</annotation>
  • Do you mean you want to replace text in XML or in Python? Commented Aug 12, 2021 at 20:17
  • I want to replace the string bicycle with bike in the XML file by reading it with Python. Commented Aug 12, 2021 at 20:19
  • Sorry for the confusion. Do you mean you want to read it as a text file with .read() and replace every occurrence of bicycle with bike, or do you want to replace only specific elements, e.g. <name>bicycle</name> with <name>bike</name>, and leave things like the image name bicycle.jpg alone? Commented Aug 12, 2021 at 20:30
  • I want to change only those in the name tags, not the filename. Commented Aug 12, 2021 at 20:41
  • It is better to use XSLT for such tasks. Commented Aug 12, 2021 at 23:58

3 Answers


If you want to replace ALL occurrences of "bicycle", it can easily be done with str.replace:

input_file = "example.xml"
output_file = "output.xml"
with open(input_file) as f:
    xml_content = f.readlines()
    
with open(output_file, 'w+') as f:
    for line in xml_content:
        f.write(line.replace('bicycle', 'bike'))

However, if you want to keep the structure of your XML intact (for example, if an element or attribute name were bicycle), you might want to take a look at ElementTree or lxml.

Edit: after the edit to your question, here is a cleaner solution with ElementTree:

import xml.etree.ElementTree as ET
input_file = "example.xml"
output_file = "output.xml"

tree = ET.parse(input_file)
root = tree.getroot()
name_elts = root.findall(".//name")    # we find all 'name' elements

for elt in name_elts:
    if elt.text:    # an empty <name/> element has text None, so skip it
        elt.text = elt.text.replace("bicycle", "bike")

tree.write(output_file)

1 Comment

This solution with ET worked. Did it for a batch of files with a loop.
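
For anyone wanting to do the same over a batch of files, here is a minimal sketch of such a loop. The function names rename_labels and rename_labels_in_folder are made up for illustration, and it assumes all annotation files sit directly in one folder and may be overwritten in place. Note that ElementTree does not preserve comments or the original XML declaration when it writes the file back.

```python
from pathlib import Path
import xml.etree.ElementTree as ET


def rename_labels(xml_path, old="bicycle", new="bike"):
    """Rewrite <name> text in one annotation file in place; return the number of changes."""
    tree = ET.parse(str(xml_path))
    changed = 0
    # iter("name") visits every <name> element at any depth
    for elt in tree.getroot().iter("name"):
        if elt.text and old in elt.text:
            elt.text = elt.text.replace(old, new)
            changed += 1
    if changed:
        # only rewrite files that actually contained the old label
        tree.write(str(xml_path), encoding="utf-8", xml_declaration=True)
    return changed


def rename_labels_in_folder(folder, old="bicycle", new="bike"):
    """Apply rename_labels to every .xml file directly inside folder."""
    return {p.name: rename_labels(p, old, new)
            for p in sorted(Path(folder).glob("*.xml"))}
```

Calling rename_labels_in_folder("annotations") returns a dict mapping each file name to the number of name elements that were changed, which is handy for a quick sanity check. The folder name here is only a placeholder.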

This was my approach; it replaces all instances of "bicycle" with "bike". It will also change "bicycle" in the path that you specified, which I think is what you were looking for. Replace "test.xml" with the name of the file you are using.

# Open file containing xml text and copy contents to string
f = open("test.xml", "r+")
xmlText = f.read()

# Bring pointer back to start of file and delete all contents
f.seek(0)
f.truncate()

# Replace all instances of bicycle with bike
newText = xmlText.replace("bicycle", "bike")

# Write this new text with replaced words to the file and close
f.write(newText)
f.close()

2 Comments

Yes, I actually overlooked the bicycle in the file name then, sorry
I also wanted to use this approach, but as far as I have learnt, it does not preserve the XML properties. I need to keep all of them.

Please try the following XSLT based solution.

The XSLT follows the so-called Identity Transform pattern.

It will modify <name> element values from 'bicycle' to 'bike', leaving everything else intact.

Input XML

<?xml version="1.0"?>
<annotation>
    <folder>images</folder>
    <filename>bicycle (10).jpg</filename>
    <path>C:\Users\Merida\Desktop\Bicycle\images\bicycle (10).jpg</path>
    <source>
        <database>Unknown</database>
    </source>
    <size>
        <width>960</width>
        <height>636</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>bicycle</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>68</xmin>
            <ymin>24</ymin>
            <xmax>755</xmax>
            <ymax>632</ymax>
        </bndbox>
    </object>
    <object>
        <name>bicycle</name>
        <pose>Unspecified</pose>
        <truncated>1</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1</xmin>
            <ymin>28</ymin>
            <xmax>189</xmax>
            <ymax>435</ymax>
        </bndbox>
    </object>
</annotation>

XSLT

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="name[.='bicycle']">
        <xsl:copy>bike</xsl:copy>
    </xsl:template>
</xsl:stylesheet>

Output XML

<annotation>
  <folder>images</folder>
  <filename>bicycle (10).jpg</filename>
  <path>C:\Users\Merida\Desktop\Bicycle\images\bicycle (10).jpg</path>
  <source>
    <database>Unknown</database>
  </source>
  <size>
    <width>960</width>
    <height>636</height>
    <depth>3</depth>
  </size>
  <segmented>0</segmented>
  <object>
    <name>bike</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>68</xmin>
      <ymin>24</ymin>
      <xmax>755</xmax>
      <ymax>632</ymax>
    </bndbox>
  </object>
  <object>
    <name>bike</name>
    <pose>Unspecified</pose>
    <truncated>1</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>1</xmin>
      <ymin>28</ymin>
      <xmax>189</xmax>
      <ymax>435</ymax>
    </bndbox>
  </object>
</annotation>

2 Comments

Is it possible to do this for multiple files? I actually want to make this change to many XML files, and I was planning to do so with a for loop in Python.
Sure. Just loop through the XML files in Python and apply the XSLT transformation to each of them, one by one.
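
A rough sketch of that loop, assuming the third-party lxml package is installed (the standard library has no XSLT processor). The function name transform_folder and the separate output folder are illustrative choices, not part of the answer above; the stylesheet is the one from the answer, embedded as bytes.

```python
from pathlib import Path
from lxml import etree  # third-party: pip install lxml

# The identity-transform stylesheet from the answer, embedded as bytes
XSLT_SOURCE = b"""<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
    <xsl:strip-space elements="*"/>
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="name[.='bicycle']">
        <xsl:copy>bike</xsl:copy>
    </xsl:template>
</xsl:stylesheet>
"""


def transform_folder(in_dir, out_dir):
    """Apply the stylesheet to every .xml file in in_dir; write results to out_dir."""
    transform = etree.XSLT(etree.fromstring(XSLT_SOURCE))  # compile once, reuse
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(Path(in_dir).glob("*.xml")):
        result = transform(etree.parse(str(src)))
        target = out_dir / src.name
        # str(result) serializes according to the stylesheet's xsl:output settings
        target.write_text(str(result), encoding="utf-8")
        written.append(target)
    return written
```

Writing to a separate folder keeps the originals untouched, so the results can be checked before anything is replaced.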
