(Edit: the script seems to work for the others here who are trying to help. Is it because I'm running Python 2.7? I'm really at a loss...)

I have a raw text file of a book that I am trying to tag with page markers.

Say the text file is:

some words on this line,
1
DOCUMENT TITLE some more words here too.
2
DOCUMENT TITLE and finally still more words.

I am trying to use Python to modify the example text to read:

some words on this line,
</pg>
<pg n=2>some more words here too.
</pg>
<pg n=3>and finally still more words.

My strategy is to load the text file as a string, build a search-for string and a replace-with string for each number in a list, replace all instances in the string, and write the result to a new file.

Here is the code I've written:

from sys import argv
script, input, output = argv

textin = open(input,'r')
bookstring = textin.read()
textin.close()

pages = []
x = 1
while x<400:
    pages.append(x)
    x = x + 1

pagedel = "DOCUMENT TITLE"

for i in pages:
    pgdel = "%d\n%s" % (i, pagedel)
    nplus = i + 1
    htmlpg = "</pg>\n<pg n=%d>" % nplus
    bookstring = bookstring.replace(pgdel, htmlpg)

textout = open(output, 'w')
textout.write(bookstring)
textout.close()

print "Updates to %s printed to %s" % (input, output)

The script runs without error, but it also makes no changes whatsoever to the input text. It simply reprints it character for character.

Does my mistake have to do with the hard return (\n)? Any help is greatly appreciated.

  • Edited to include correction to the bookstring replace command, but still the problem persists. Commented Jul 4, 2013 at 3:11
  • hmm... if I run that script, it does write the changes in the output file. What exactly are you trying to do? I mean, it is working for me. Commented Jul 4, 2013 at 3:13
  • also, it should be textin.close(), otherwise you're not calling the function. The same for textout.close. Commented Jul 4, 2013 at 3:16
  • Thanks, now reflected in question. It still is not working for me. I'm using .txt files on a mac as input and output files. I tried the test example from my question, and it still simply copies the input text without edits to the output. Commented Jul 4, 2013 at 3:26
  • Try adding print bookstring to see. It's working for me, are you sure it isn't a problem with the given arguments? Commented Jul 4, 2013 at 3:34

2 Answers

In Python, strings are immutable, so replace returns a new string with the substitutions made rather than modifying the original string in place.

You must assign the result back:

bookstring = bookstring.replace(pgdel, htmlpg)
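
A quick way to see the difference, as a minimal sketch (the sample string and the variable names are made up for illustration, not taken from your script):

```python
# Strings are immutable: str.replace builds and returns a new string.
text = "1\nDOCUMENT TITLE some words"

text.replace("DOCUMENT TITLE", "<pg n=2>")  # result is thrown away
unchanged = text                             # still the original text

replaced = text.replace("DOCUMENT TITLE", "<pg n=2>")  # result is kept

print(repr(unchanged))
print(repr(replaced))
```

Only the second call has any visible effect, because its return value is stored.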

You've also forgotten to call close(). You have textin.close, which only refers to the method without running it. You have to call it with parentheses, as you do with open:

textin.close()

Your code works for me, but I might just add some more tips:

  • input is a built-in function, so consider renaming that variable. Shadowing it normally works, but it might not for you.

  • When running the script, don't forget to include the .txt extension:

    • $ python myscript.py file1.txt file2.txt
  • Make sure to clear the contents of file2 when testing your script.
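
Nothing in this thread confirms the cause, but since the question asks about the hard return, one more thing worth checking is whether the file uses \r\n or \r line endings instead of \n. If it does, the "%d\n%s" search string would never match and the output would be an unchanged copy. A minimal sketch under that assumption (the function names normalize_newlines and tag_pages and the sample text are illustrative, not from your script):

```python
def normalize_newlines(text):
    """Convert Windows (\\r\\n) and old Mac (\\r) line endings to \\n."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def tag_pages(text, title="DOCUMENT TITLE", max_page=400):
    """Apply the question's replace loop after normalizing line endings."""
    text = normalize_newlines(text)
    for number in range(1, max_page):
        search = "%d\n%s" % (number, title)
        markup = "</pg>\n<pg n=%d>" % (number + 1)
        text = text.replace(search, markup)
    return text


# Hypothetical sample with Windows-style line endings.
sample = "some words,\r\n1\r\nDOCUMENT TITLE more words."
print(repr(sample))  # repr makes hidden \r characters visible
result = tag_pages(sample)
print(repr(result))
```

If print(repr(bookstring)) shows \r characters in your own file, that would explain the behaviour. In Python 2, opening the file with mode 'rU' should have a similar normalizing effect.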

I hope these help!

4 Comments

That's a crucial error to fix, but still the same problem persists. Going to edit my question to include this edit. Thanks!
Called the close functions, and edited my question to reflect as much. It's still not working.
@user1893148 I have added more info
Thanks for your help. Crazy that no one can reproduce it.

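Here's an entirely different approach that uses the re module:

import re

doctitle = False  # True when the previous line was a page number
newlines = []     # output lines, joined at the end
page = 1

for line in bookstring.splitlines():
    if doctitle:
        # First line of a new page: strip the title and open the page tag
        newlines.append('<pg n=' + str(page) + '>' + re.sub('^DOCUMENT TITLE ', '', line))
        doctitle = False
    elif re.match(r'^\d+$', line):
        # A line holding only a number ends the current page
        doctitle = True
        page += 1
        newlines.append('</pg>')
    else:
        newlines.append(line)

newstr = '\n'.join(newlines)
print(newstr)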

Since no one knows what's going on, it's worth a try.
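
A variation on the same idea, sketched as a single re.sub call with a callback. Anchoring the page number to the start of a line also avoids a side effect of the plain replace loop in the question, where the search string for page 1 matches inside "11\nDOCUMENT TITLE". The function name tag_pages_regex and the decision to tolerate \r\n line endings are assumptions of this sketch:

```python
import re


def tag_pages_regex(text, title="DOCUMENT TITLE"):
    """Replace each page-number line plus title with page markup."""
    # ^(\d+) followed by a line break matches a line holding only a number;
    # \r?\n tolerates Windows line endings; one space after the title is dropped.
    pattern = re.compile(
        r"^(\d+)\r?\n" + re.escape(title) + r" ?", re.MULTILINE
    )

    def markup(match):
        # The page that starts here is one past the printed number.
        return "</pg>\n<pg n=%d>" % (int(match.group(1)) + 1)

    return pattern.sub(markup, text)


sample = (
    "some words on this line,\n"
    "1\n"
    "DOCUMENT TITLE some more words here too.\n"
    "2\n"
    "DOCUMENT TITLE and finally still more words."
)
result = tag_pages_regex(sample)
print(result)
```

The printed number is read from the text rather than counted, so a missing or repeated page number in the source shows up directly in the output.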
