I am helping migrate old tech docs from my old company to the new company. I need to remove the old company references, which look like this: "ABC Divisionname ProductName" should become "ProductName".

The reference can also appear as "Divisionname ProductName", which should likewise become "ProductName".

The old names of the tech docs also need to change to the new names: techdoc to newdocname.

I found some scripts that can make one change to one XML file at a time. I then found a glob script that makes one change across multiple files at once:

import glob
import ntpath
import os

output_dir = "output"

# Create the output directory if it does not exist yet
if not os.path.exists(output_dir):
    os.makedirs(output_dir)

# Copy every XML file into output_dir, replacing the old name line by line
for f in glob.glob("*.xml"):
    with open(f, 'r', encoding='utf-8') as inputfile:
        with open('%s/%s' % (output_dir, ntpath.basename(f)), 'w', encoding='utf-8') as outputfile:
            for line in inputfile:
                outputfile.write(line.replace('OldCompanyName ProductName', 'ProductName'))

My goal is to change both of the old product names to the new one. Is line.replace the best way to go? If so, can I do something like "ABC Divisionname ProductName" | "Divisionname", "ProductName"?

1 Answer

You can use the regular expression substitution function, re.sub. Here is an example that may help:

import re

sample_xml_data = 'ABC Divisionname ProductName is the company name'

sample_xml_data_1 = 'Divisionname ProductName is the company name'

# Here is your pattern
old_company_name_pattern = re.compile('ABC Divisionname ProductName|Divisionname ProductName')

new_company_name = 'ProductName'

print(re.sub(old_company_name_pattern,new_company_name,sample_xml_data))
print(re.sub(old_company_name_pattern,new_company_name,sample_xml_data_1))

Output:

ProductName is the company name

ProductName is the company name
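
The two alternatives in the pattern differ only by the "ABC " prefix, so the same match can probably be written more compactly with an optional group. The sketch below assumes the literal names from the question; strip_old_prefix and strip_old_prefix_plain are hypothetical helper names, not part of any library. It also shows that plain str.replace can do the job without alternation, as long as the longer name is replaced first.

```python
import re

# Optional non-capturing group: matches with or without the "ABC " prefix.
OLD_NAME = re.compile(r'(?:ABC )?Divisionname ProductName')


def strip_old_prefix(text):
    """Hypothetical helper: collapse either old name to 'ProductName'."""
    return OLD_NAME.sub('ProductName', text)


def strip_old_prefix_plain(text):
    """Same result with str.replace only; the longer name must go first,
    otherwise a stray 'ABC ' is left behind."""
    return (text.replace('ABC Divisionname ProductName', 'ProductName')
                .replace('Divisionname ProductName', 'ProductName'))


print(strip_old_prefix('ABC Divisionname ProductName is the company name'))
print(strip_old_prefix_plain('Divisionname ProductName is the company name'))
```

Both calls print "ProductName is the company name". The regex version is easier to extend if more prefixes turn up later.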

For your example, you can use it like this:

import re
import glob
import ntpath
import os

output_dir = "output"

# Create the output directory if it does not exist yet
if not os.path.exists(output_dir):
    os.makedirs(output_dir)

# Compile once; matches either form of the old name
old_company_name_pattern = re.compile('ABC Divisionname ProductName|Divisionname ProductName')

for f in glob.glob("*.xml"):
    with open(f, 'r', encoding='utf-8') as inputfile:
        with open('%s/%s' % (output_dir, ntpath.basename(f)), 'w', encoding='utf-8') as outputfile:
            for line in inputfile:
                outputfile.write(re.sub(old_company_name_pattern, 'ProductName', line))
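
The question also mentions renaming techdoc to newdocname. One way to handle several renames in a single pass is a mapping from old text to new text combined with a replacement function. This is only a sketch: RENAMES and apply_renames are hypothetical names, and the mapping holds just the placeholder names from the question.

```python
import re

# Hypothetical mapping of old text to new text, using the placeholder
# names from the question. Extend it with the real names.
RENAMES = {
    'ABC Divisionname ProductName': 'ProductName',
    'Divisionname ProductName': 'ProductName',
    'techdoc': 'newdocname',
}

# Longest keys first so that, if one old name is a prefix of another,
# the longer one is tried first. re.escape keeps any punctuation in
# the names literal.
PATTERN = re.compile('|'.join(
    re.escape(old) for old in sorted(RENAMES, key=len, reverse=True)))


def apply_renames(text):
    """Replace every old name in text in one regex pass."""
    return PATTERN.sub(lambda m: RENAMES[m.group(0)], text)


print(apply_renames('See techdoc for ABC Divisionname ProductName.'))
```

In the loop above, outputfile.write(apply_renames(line)) would then replace the re.sub call. Note that a plain 'techdoc' pattern also matches inside longer words, so a real mapping may need word boundaries.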