I am trying to extract the miles and chains as integers from a string like "at (17.08)". The pseudo-decimal form of the input data is the choice of the data administrator. First I tried the following pattern on the string "17.08":

"((\d+)\.(\d\d))"

This behaved correctly:

group(0) is "17.08"
group(1) is "17.08"
group(2) is "17"
group(3) is "08"

Now for the "at (17.08) is a" variation, I want to be able to substitute a correctly formatted location for the "decimal" notation, so I tried this pattern:

".*\(?((\d+)\.(\d\d))\)?.*"

When queried with re.match I get the following match groups:

group(0) is "(17.08)", OK.
group(1) is "7.08", where did the 1 go?
group(2) is "7", where did the 1 go?
group(3) is "08", still OK.

What am I doing wrong? Why does "re" behave like this? I suspect this is related to greedy versus non-greedy matching, but how?
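The behaviour described above can be reproduced with a short script. This is only a sketch: it assumes the string actually passed to re.match was "(17.08)", which is what the reported group(0) suggests, and the variable names are illustrative.

```python
import re

# The second pattern from the question: optional brackets around the
# number, with a greedy .* on both sides.
pattern = r".*\(?((\d+)\.(\d\d))\)?.*"

# Assumption: the tested input was "(17.08)" (inferred from group(0)).
m = re.match(pattern, "(17.08)")

# Collect group(0) through group(3) for inspection.
groups = [m.group(i) for i in range(4)]
print(groups)  # ['(17.08)', '7.08', '7', '08']
```

The leading .* first consumes the whole string and then gives back characters only until the rest of the pattern can match, which happens as soon as \d+ has a single digit to work with.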

  • You're using capture groups; use search if you just intend to have one output. Non-capturing groups, (?:stuff), will make sure you don't capture subgroups within the regex itself. Commented Nov 26, 2018 at 16:43
  • What exactly are you trying to extract? The 17, the 08, or 17.08? Also, on my machine, copying your pattern and example gives 7.08, not 17.08, as group 1. Commented Nov 26, 2018 at 16:46
  • @Tomothy32: I want to extract the 17 and the 08, but having the form 17.08 as well makes a string replace possible. Also, your try is indeed giving the correct answer for the second pattern. Sorry for not being clear. Commented Nov 26, 2018 at 16:52

1 Answer

The reason is that the .*\(? prefix absorbs the 1 in the input: the opening bracket is optional and .* is greedy, so .* consumes as much as it can while still leaving \d+ at least one digit to match. One way to solve this is to use the following regex instead (note the space after the first *):

".* \(?((\d+)\.(\d\d))\)?.*"

This assumes there is always a space before the opening bracket (if present) or the number.
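As a quick check, the sketch below runs the pattern with the added space against the sample string from the question. It also shows the re.search variant suggested in the comments, which drops the surrounding .* entirely. The variable names are illustrative, not part of the original post.

```python
import re

text = "at (17.08) is a"

# Pattern from this answer: the literal space stops the greedy .*
# from swallowing the first digit of the number.
fixed = re.match(r".* \(?((\d+)\.(\d\d))\)?.*", text)
print(fixed.group(1), fixed.group(2), fixed.group(3))  # 17.08 17 08

# Alternative sketch: re.search scans for the first match on its own,
# so no leading or trailing .* is needed.
found = re.search(r"\(?((\d+)\.(\d\d))\)?", text)
miles, chains = int(found.group(2)), int(found.group(3))
print(found.group(0), miles, chains)  # (17.08) 17 8
```

With re.search, group(0) is just the bracketed token, which may be convenient as the target of a later string replacement.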

1 Comment

Thanks a lot for this answer. It did the trick, and indeed the parenthesised form is always part of a comment, so that works out all right.
