2

I have a Python string:

text 5018.741043 57875.266717658247500  16.826  gbt  -chan 0 -subint 0 -snr 44.932

I know I can find 44.932 using:

r'.*(\b\d+\.\d+)'

But I want to find the penultimate \d+\.\d+ value, i.e. 16.826.

How can I do that please?


I have many lines similar to this example, but they may be slightly different in terms of spacing and number of characters, which is why I thought I should use regex.

Also, I ultimately want to replace this value (here 16.826) with another number.

Thanks.

  • Do you absolutely need a regex? Why not a buffered queue? Commented Aug 26, 2019 at 22:03
  • If the file is fixed-width, just a string slice is probably sufficient. Please give more motivation and context for this to avoid the x-y problem. Commented Aug 26, 2019 at 22:09
  • Thanks. I have many lines similar to this, but they may be slightly different in terms of spacing and number of characters, which is why I thought I should use regex. Commented Aug 26, 2019 at 22:10
  • Then it would be very helpful to provide these lines, otherwise, you'll get naive answers that rightfully show a simple way to get your desired output, like the one below. It's better to present your entire problem, without presupposing that regex is the best way to solve it (but it's good to present it as your attempt!). Show all possible lines you expect, or at least a representative sample. Thanks. Commented Aug 26, 2019 at 22:11

4 Answers

3
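s = 'text 5018.741043 57875.266717658247500  16.826  gbt  -chan 0 -subint 0 -snr 44.932'

# split() with no argument splits on any run of whitespace,
# so the indices do not depend on the exact spacing
s.split()[3]
# '16.826'

s.split()[-1]
# '44.932'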

2

I think the best method here is to create a temporary string without the last decimal number and then find the new last decimal number in the temporary string.

The caveat is that if the same decimal number appears more than once in the string, str.replace removes every occurrence of it. In that case a different method of removing the last value from the string is required.


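import re

def second_decimal(text):
    # Remove the last decimal number, then find the new last one (returned as a one-item list)
    newstr = text.replace(re.findall(r'.*(\b\d+\.\d+)', text)[0], "")
    return re.findall(r'.*(\b\d+\.\d+)', newstr)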
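
If duplicate values are a concern, one possible alternative is to collect every decimal number and index from the end, which avoids editing the string at all. This is only a sketch: the names DECIMAL and penultimate_decimal are made up for illustration, and it assumes a "decimal" is simply digits, a dot, and digits.

```python
import re

# Assumed definition of a decimal number: digits, a dot, digits
DECIMAL = re.compile(r'\b\d+\.\d+\b')

def penultimate_decimal(text):
    """Return the second-to-last decimal in text, or None if there are fewer than two."""
    matches = DECIMAL.findall(text)
    return matches[-2] if len(matches) >= 2 else None

s = 'text 5018.741043 57875.266717658247500  16.826  gbt  -chan 0 -subint 0 -snr 44.932'
print(penultimate_decimal(s))  # 16.826
```

Because nothing is removed from the string, two identical values (for example 'x 1.5 y 1.5') are still counted separately.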


2

Assuming that the penultimate decimal is followed by a space (and the last one is not), you can find it with r'.*(\b\d+\.\d+) ':

import re

s = 'text 5018.741043 57875.266717658247500  16.826  gbt  -chan 0 -subint 0 -snr 44.932'
r = r'.*(\b\d+\.\d+) '
print(re.findall(r, s))  # ['16.826']

1 Comment

r'.*\b(\d+\.\d+)\b(?=.*?\d+\.\d+)' might be a bit safer if there is whitespace or other text content at the end of the line.
1

One option is to rely on backtracking and a capturing group: the leading .* first matches to the end of the string, then backtracks until the penultimate value can be captured in group 1 while the last occurrence still matches after it.

^.*\b(\d+\.\d+)\b.*\b\d+\.\d+\b

Explanation

  • ^ Start of string
  • .* Match any char except a newline, 0 or more times (greedy)
  • \b(\d+\.\d+)\b Capture group 1: match digits with a decimal part surrounded by word boundaries (the penultimate value)
  • .* Match any char except a newline, 0 or more times
  • \b\d+\.\d+\b Match digits with a decimal part surrounded by word boundaries (the last value, not captured)
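
Since the question also asks about substituting the value, here is a minimal sketch of how the pattern above might be used in Python. The helper name replace_penultimate and the replacement value '99.999' are hypothetical; the idea is to use the span of group 1 to splice in the new text so the rest of the line, including its spacing, is left untouched.

```python
import re

# Same pattern as above; group 1 is the penultimate decimal
PATTERN = re.compile(r'^.*\b(\d+\.\d+)\b.*\b\d+\.\d+\b')

def replace_penultimate(text, new_value):
    """Replace the penultimate decimal in text with new_value (hypothetical helper)."""
    m = PATTERN.search(text)
    if m is None:
        # Fewer than two decimals: return the line unchanged
        return text
    # Splice around group 1 so everything else is preserved as-is
    return text[:m.start(1)] + new_value + text[m.end(1):]

s = 'text 5018.741043 57875.266717658247500  16.826  gbt  -chan 0 -subint 0 -snr 44.932'
print(PATTERN.search(s).group(1))        # 16.826
print(replace_penultimate(s, '99.999'))
```

Slicing by match position, rather than calling str.replace on the captured text, means an identical value earlier in the line is not affected.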

