2

To start off, because I've been burned before by someone with a power trip, this question is not for homework.

Anyway, I have a text file that is something like the following:

####
# File section 1
####

1.0   abc   Description1
6.5   def   Description2
1.0 2.0 3.0   ghi   Description3
11    jkl   Description

####
# File section 2
####

1.0   abc   Description1
12.5   def   Description2
1.0 2.0 3.0   ghi   Description3
11    jkl   Description

#### End file

I would like to replace the string "1.0" in the 2 lines:

1.0   abc   Description1

However, NOT the "1.0" string in the lines:

1.0 2.0 3.0   ghi   Description3

The current code that I'm using is:

with open('sample_file.txt','r') as file:
    filedata = file.read()
    filedata = filedata.replace('1.0','2.0')
with open('sample_file.txt','w') as file:
    file.write(filedata)

However, the result is that every occurrence of "1.0" gets replaced, and I then have to go back into the file and correct it manually. The file that I would like to get is:

####
# File section 1
####

2.0   abc   Description1
6.5   def   Description2
1.0 2.0 3.0   ghi   Description3
11    jkl   Description

####
# File section 2
####

2.0   abc   Description1
12.5   def   Description2
1.0 2.0 3.0   ghi   Description3
11    jkl   Description

#### End file

How can I get that? I couldn't find an example solution to this type of issue. Thank you all for your help.

EDIT: My fault for not clarifying, but the string I want to replace isn't always "1.0", nor always 3 characters long. It could be "-12.3", for example. I would like to make the code as generic as possible.

I also tried using rsplit to isolate the first string using space as a delimiter, but that doesn't seem to work for file writing.

========================

EDIT2: I found a way to do this, though it seems to be quite a round-about method:

with open('sample_file.txt','r') as file:
    filedata = file.readlines()
    for line in filedata:
        if 'abc' in line:
            oriline = line
            newline = line.replace(str(spk),str(newspk))
with open('sample_file.txt','r') as file:
    filedata = file.read()
    filedata = filedata.replace(str(oriline),str(newline))
with open('sample_file.txt','w') as file:
    file.write(filedata)

Basically, it opens the file and reads it line by line, looking for the whole line that contains the specific string I want, and stores that line in memory. Then it opens the file again, reads everything, and replaces that whole line. Finally, it opens the file once more and writes the result.

It does what I want, but is there a way to simplify the code?

4 Comments
  • Use regular expressions to find the string pattern you want and to perform a replacement on part of that pattern. It's hard for us to give you a more specific answer without knowing exactly what the criteria are for what you want to replace (i.e., do you want to replace "1.0" only for entries labeled "abc"? do you want to replace "1.0" only if there aren't other numbers on the same line?). Commented Mar 2, 2019 at 17:24
  • I would like to replace the "1.0" string only in lines that have "abc" in them, as "abc" is a variable in the code that I am using. Commented Mar 2, 2019 at 18:34
  • Regarding EDIT2: There's no reason to read the file twice. You can modify elements of filedata in place and then use file.writelines on filedata. Also, there is only one oriline, so if "abc" appears twice in the same file (like in your example), it won't work. Additionally, you will perform a replacement if "abc" appears anywhere in the line (including the description) and will replace all occurrences of spk with newspk (whatever those are), not just in the first token. You also stated that the string-to-be-replaced isn't a fixed string, which your approach doesn't handle. Commented Mar 3, 2019 at 18:14
  • What is wrong with either of the two approaches I suggested? They should handle the requirements that you stated. If they aren't suitable, please clarify why. Commented Mar 3, 2019 at 18:16
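
To illustrate the single-pass suggestion in the comments above, here is a minimal sketch that reads the file once, modifies the list of lines in place, and writes it back with writelines. The function name replace_first_token and its parameters are hypothetical, not from the original posts. It assumes the label (e.g. "abc") is always the second whitespace-separated token, and it does not check that the first token is actually a number.

```python
def replace_first_token(path, label, new_value):
    """Replace the first token of every line whose second token equals `label`."""
    with open(path, 'r') as file:
        lines = file.readlines()

    for i, line in enumerate(lines):
        tokens = line.split()
        if len(tokens) >= 2 and tokens[1] == label:
            # tokens[0] is the first non-whitespace text on the line, so
            # replacing only its first occurrence touches just the leading
            # value and leaves the original spacing intact.
            lines[i] = line.replace(tokens[0], new_value, 1)

    with open(path, 'w') as file:
        file.writelines(lines)
```

Calling replace_first_token('sample_file.txt', 'abc', '2.0') would handle every "abc" line in the file, whatever its leading value is, and would leave the "1.0 2.0 3.0   ghi" lines alone because their second token is not "abc".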

2 Answers

6

Just use

with open('sample_file.txt','r') as file:
    filedata = file.read()
    filedata = filedata.replace('1.0   abc','2.0   abc')
with open('sample_file.txt','w') as file:
    file.write(filedata)

Instead of the above shortcut, you can try a more generalized case by defining an empty list first:

li = []

and then use the code below (assuming the string abc is fixed, as in your case):

with open('sample_file.txt','r') as file:
    for line in file:
        # Look for 'abc' anywhere after the first character of the line.
        i = line.find('abc',1)
        if i >= 0:
            lineval = line.replace('1.0','2.0')
            li.append(lineval)
        else:
            lineval = line
            li.append(lineval)
j = 0
with open('sample_file.txt','w') as file:
    while j < len(li):
        file.write(li[j])
        j += 1

4 Comments

My fault for not clarifying, but the string I want to replace isn't always "1.0", nor always 3 characters long. It could be "-12.3", for example. I would like to make the code as generic as possible.
This might be good enough for the OP's case, but be aware that the second code sample will replace all occurrences of "1.0" if "abc" appears anywhere on the line.
@jamesdlin Of course, I did so deliberately, depending on a keyword in his sample. Thanks anyway.
I tried my own method (EDIT 2) with inspiration from you guys, and it seems to be working. Can you guys offer some critique on what ways I can improve the code?
0

As I mentioned in a comment, you can use regular expressions to match a pattern you're looking for. You can specify groups in the pattern (using (...) or (?P<name>...)) to identify parts of the pattern and specifically replace or reuse those parts.

Something like this should work:

import re

pattern = (r'^' # The beginning of a line.
           # Match something that looks like a number:
           r'-?'        # 1. Optional: a negative sign.
           r'\d+'       # 2. One or more digits.
           r'([.]\d+)?' # 3. Optional: a decimal point followed by one
                        #    or more digits.
           # The rest of the line:
           r'(?P<rest>'
             r'\s+' # 1. One or more spaces.
             r'abc' # 2. "abc"
             r'\s+' # 3. One or more spaces.
             r'.*'  # 4. Everything remaining.
           r')' 
           r'$') # The end of a line.

# Replace the above pattern with "2.0" followed by whatever we identified
# as "the rest of the line".
replacement = r'2.0\g<rest>'  # Raw string, so \g is not an invalid escape.

with open('sample_file.txt','r') as file:
    filedata = file.read()

    # re.MULTILINE is needed to treat lines separately.
    filedata = re.sub(pattern, replacement, filedata, flags=re.MULTILINE)
with open('sample_file.txt','w') as file:
    file.write(filedata)
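
Since the question asks for something generic, here is a hedged sketch of how the same idea could be wrapped in a function that takes the label and the new value as arguments. The name replace_leading_number is hypothetical. It differs from the pattern above in two ways that are my own choices: it uses [ \t] instead of \s so a match cannot run across a line break, and it uses a function as the replacement so that backslashes in the new value are never interpreted.

```python
import re

def replace_leading_number(text, label, new_value):
    """Replace the number that starts each line whose next token is `label`."""
    number = r'-?\d+(?:[.]\d+)?'
    # re.escape keeps characters such as "." in the label from acting as
    # regex syntax; the label must be followed by whitespace or end of line.
    rest = r'(?P<rest>[ \t]+' + re.escape(label) + r'(?:[ \t].*)?)'
    pattern = re.compile('^' + number + rest + '$', flags=re.MULTILINE)
    # Keep everything after the number exactly as it was.
    return pattern.sub(lambda m: new_value + m.group('rest'), text)

sample = (
    "1.0   abc   Description1\n"
    "-12.3   abc   Description2\n"
    "1.0 2.0 3.0   ghi   Description3\n"
    "11    jkl   abc\n"
)
print(replace_leading_number(sample, 'abc', '2.0'))
```

On this sample, only the first two lines change: the "ghi" line has other numbers before its label, and the "jkl" line only mentions "abc" in its description.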

A different (untested) approach that doesn't use regular expressions:

with open('sample_file.txt','r') as file:
    lines = file.readlines()

with open('sample_file.txt','w') as file:
    for line in lines:
        tokens = line.split(maxsplit=2)
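        try:
            # float() raises ValueError if the first token isn't a number.
            # Its result isn't used as a condition, so 0.0 is handled too.
            float(tokens[0])
            if tokens[1] == 'abc':
                tokens[0] = '2.0'
                line = ' '.join(tokens)
        except (IndexError, ValueError):
            pass
        file.write(line)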

Note that this isn't quite the same as the regular expression (RE) approach (notable differences are that it will accept any floating-point number as the first token (e.g. 1e-10) and that it won't preserve spaces after performing the replacement), but it might be a bit easier to understand if you're unfamiliar with REs.
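
To make the first difference concrete, here is a small sketch comparing which first tokens each approach treats as a number. The helper names accepted_by_regex and accepted_by_float are hypothetical, and the token list is just an illustration.

```python
import re

# Same number pattern as in the RE approach above.
NUMBER_RE = re.compile(r'-?\d+([.]\d+)?')

def accepted_by_regex(token):
    """True if the whole token matches the RE's idea of a number."""
    return NUMBER_RE.fullmatch(token) is not None

def accepted_by_float(token):
    """True if float() can parse the token."""
    try:
        float(token)
        return True
    except ValueError:
        return False

for token in ['1.0', '-12.3', '1e-10', '.5', 'inf', 'abc']:
    print(f'{token!r}: regex={accepted_by_regex(token)}, '
          f'float={accepted_by_float(token)}')
```

Tokens such as '1e-10', '.5' and 'inf' are accepted by float() but rejected by the pattern, so the two approaches only agree on plain decimal values like '1.0' and '-12.3'.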
