I want to match all the numbers in a string, including those given in scientific notation. Here's my program:

import re
txt = '2310163 -204.1154263 -204.1159668 -204.1110188 -204E-9668 200-100'
print(re.findall(r'([+-]?\d+\.?[eE]?[+-]?\d*)', txt))
#                               ^    ^
#                              ex   sg
# allow sg only if it directly follows ex

Now, 200-100 is not a valid number, but the regex matches it because I allowed an optional [+-] before the exponent digits. How can I write the regex so that it only accepts [+-] when it immediately follows [eE], as in the number -204E-9668?

2 Answers

Put the whole scientific notation part into an optional group, while matching the possible decimal part separately, before:

[+-]?\d+(?:\.\d+)?(?:[eE][+-]\d+)?
#        ^^^^^^^^^ optional decimals
#                 ^^^^^^^^^^^^^^^^ optional scientific notation
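
As a quick check, here is a minimal Python sketch that runs this pattern over the string from the question. The names NUMBER_RE and matches are only illustrative. Note that, as written, the exponent group requires an explicit sign, so an input like 1e5 would not match as a single number; using [+-]? inside that group would make the sign optional.

```python
import re

# Pattern from the answer: optional leading sign, integer part,
# optional decimals, optional exponent. The second [+-] lives inside
# the exponent group, so it can only match directly after [eE].
NUMBER_RE = re.compile(r'[+-]?\d+(?:\.\d+)?(?:[eE][+-]\d+)?')

txt = '2310163 -204.1154263 -204.1159668 -204.1110188 -204E-9668 200-100'

matches = NUMBER_RE.findall(txt)
print(matches)
# ['2310163', '-204.1154263', '-204.1159668', '-204.1110188',
#  '-204E-9668', '200', '-100']
```

The 200-100 token is no longer matched as one number, but it is still split into 200 and -100, which is what the lookaround version below addresses.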

If you want none of the 200-100 part to match because the 200 is right next to the 100, then at the beginning, lookbehind for a space or the beginning of the string, and at the end, lookahead for a space or the end of the string:

(?:(?<=^)|(?<= ))[+-]?\d+(?:\.\d+)?(?:[eE][+-]\d+)?(?= |$)

https://regex101.com/r/SdA295/1
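
The same check can be sketched in Python for the lookaround version. STRICT_RE and strict_matches are illustrative names, and the input is the string from the question.

```python
import re

# Same number pattern, but anchored on both sides:
#   (?:(?<=^)|(?<= ))  the match must start at the beginning or after a space
#   (?= |$)            the match must be followed by a space or the end
STRICT_RE = re.compile(
    r'(?:(?<=^)|(?<= ))[+-]?\d+(?:\.\d+)?(?:[eE][+-]\d+)?(?= |$)'
)

txt = '2310163 -204.1154263 -204.1159668 -204.1110188 -204E-9668 200-100'

strict_matches = STRICT_RE.findall(txt)
print(strict_matches)
# ['2310163', '-204.1154263', '-204.1159668', '-204.1110188', '-204E-9668']
```

Here 200 is rejected because it is followed by "-", and -100 is rejected because it is preceded by "0", so nothing from 200-100 is returned.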

4 Comments

Should the end be (?= |$|\n) to work with a Python multiline string?
Sure, or with \r\n as well, depending on where the input comes from.
Also, should the start be (?<=^)|(?<=\s), since there can be a tab before the number?
Sure, if that's what your input is like. You can also lookahead for just $ or \s.
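
The whitespace-tolerant boundaries discussed in these comments can be sketched in Python as follows. This is a swapped-in formulation, not the pattern from the answer: it uses the negative lookarounds (?<!\S) and (?!\S), which mean "not preceded by a non-whitespace character" and "not followed by a non-whitespace character", so start of string, end of string, spaces, tabs, and newlines all count as boundaries. The sample text and the names WS_RE and ws_matches are made up for illustration.

```python
import re

# (?<!\S)  no non-whitespace character right before the match
# (?!\S)   no non-whitespace character right after the match
WS_RE = re.compile(r'(?<!\S)[+-]?\d+(?:\.\d+)?(?:[eE][+-]\d+)?(?!\S)')

# Hypothetical input mixing tabs, "\n" and "\r\n" line endings.
text = '1.5e-3\t-42\n7E+2 200-100 3.14\r\n'

ws_matches = WS_RE.findall(text)
print(ws_matches)
# ['1.5e-3', '-42', '7E+2', '3.14']
```

Because both lookarounds are single fixed-width assertions, this avoids having to list each whitespace character or line ending separately.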

I think what you're looking for is a positive lookahead:

(?=foo)

Here's a good resource on the topic: https://www.rexegg.com/regex-lookarounds.html

1 Comment

What question are you answering?
