
I have a very large string consisting of a series of numbers separated by one or more spaces. Some of the numbers are equal to -123, and the rest can be any random number.

example_string = "102.3  42.89  98  812.7  374  5  -123  8  -123  13  -123  21..."

I would like to replace the values that are not equal to -123 with 456 in the most efficient way possible.

updated_example_string = "456  456  456  456  456  456  -123  456  -123  456  -123  456..."

I know that Python's regular expression library has a sub method that replaces matching values quite efficiently. Is there a way to replace values that DO NOT match? As I mentioned, this is a rather large string, coming from a source file of around 100MB. Assuming there's a way to use re.sub to accomplish this task, is that even the correct/most efficient way of handling such a problem?


2 Answers

2

You can use this regex:

(^|\s)(?!-123(\s|$))-?[0-9.]+(?=\s|$)

It matches the start of the string or a whitespace character that is not followed by -123 and then whitespace or the end of the string (checked with a negative lookahead), then an optional -, then one or more digits or . characters, followed by whitespace or the end of the string.

Then you can replace with \g<1>456 to turn all those numbers into 456. The \g<1> in the replacement preserves any space captured by the first group.


In Python:

import re
string = "102.3  42.89 -1234 98  -812.7  374  5  -123  8  -123  13  -123  21 -123"
print(re.sub(r'(^|\s)(?!-123(\s|$))-?[0-9.]+(?=\s|$)', r'\g<1>456', string))

Output

456  456 456 456  456  456  456  -123  456  -123  456  -123  456 -123



1

You could match only the numbers between whitespace boundaries and then use re.sub with a callback function to check whether the match is -123. If it is not, replace it with 456.

(?<!\S)-?\d+(?:\.\d+)?(?!\S)

Explanation

  • (?<!\S) Negative lookbehind to assert what is on the left is not a non-whitespace character
  • -? Optional -
  • \d+(?:\.\d+)? Match 1+ digits with an optional part that matches a . and 1+ digits
  • (?!\S) Negative lookahead to assert what is on the right is not a non-whitespace character

Example

import re
pattern = r"(?<!\S)-?\d+(?:\.\d+)?(?!\S)"
s = "102.3  42.89  98  812.7  374  5  -123  8  -123  13  -123  21"

print(re.sub(pattern, lambda m: "456" if m.group() != "-123" else m.group(), s))

Result

456  456  456  456  456  456  -123  456  -123  456  -123  456
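Since the question mentions a source file of around 100MB, the same callback could also be applied one line at a time instead of to the whole string at once. This is only a sketch: it assumes the file actually contains line breaks, the names NUMBER, replace_except and replace_in_lines are made up for illustration, and no timing has been measured.

```python
import re

# Same pattern as above, compiled once so it can be reused for every line.
NUMBER = re.compile(r"(?<!\S)-?\d+(?:\.\d+)?(?!\S)")

def replace_except(text, keep="-123", replacement="456"):
    """Replace every whitespace-delimited number in text except `keep`."""
    return NUMBER.sub(
        lambda m: m.group() if m.group() == keep else replacement, text
    )

def replace_in_lines(lines, keep="-123", replacement="456"):
    """Lazily apply replace_except to each line of an iterable (e.g. an open file)."""
    for line in lines:
        yield replace_except(line, keep, replacement)

# A list of lines stands in for an open file here.
sample = ["102.3  42.89 -1234\n", "98  -123  8\n"]
print("".join(replace_in_lines(sample)), end="")
```

Because a newline counts as whitespace, a number at the end of a line is still matched, and the original spacing and line breaks are left untouched.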

