I am working with Grammatical Evolution (GE) on Python 3.7. My grammar generates executable strings in the format:

np.where(<variable> <comparison_sign> <constant>, (<probability1>), (<probability2>))

However, the string can get quite complex, with several chained np.where calls.

In some cases <constant> contains leading zeros, which causes the executable string to raise errors. GE is expected to generate expressions containing leading zeros; however, I have to detect and remove them. An example of a possible solution containing leading zeros:

"np.where(x < 02, np.where(x > 01.5025, (0.9), (0.5)), (1))"

Problem:

  • There are two types of numbers containing leading zeros: int and float.
  • Suppose I detect "02" in the string. If I replace every occurrence of "02" in the string with "2", the float "01.5025" also changes to "01.525", which must not happen.
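
The substring problem can be reproduced in a few lines. This is only an illustrative sketch using the example expression from this question; the variable names are hypothetical.

```python
expression = "np.where(x < 02, np.where(x > 01.5025, (0.9), (0.5)), (1))"

# Naive fix: replace the detected token everywhere in the string.
naive = expression.replace("02", "2")

print(naive)
# np.where(x < 2, np.where(x > 01.525, (0.9), (0.5)), (1))
# The integer is fixed, but "5025" inside the float was also rewritten to "525".
```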

I've made several attempts with different re patterns, but couldn't solve it. To detect that an executable string contains leading zeros, I use:

try:
  _ = eval(expression)
except SyntaxError:
  new_expression = fix_expressions(expression)

I need help building the fix_expressions Python function.

2 Comments
  • Is this a solution for you: stackoverflow.com/questions/13142347/… ? Commented Nov 14, 2020 at 18:05
  • Partially. Two things missing: detect solely numbers with leading zeros and replace only those occurrences, without changing other numbers that contain them partially. Example: replacing "02" with "2" without changing "0.025" to "0.25". Commented Nov 14, 2020 at 18:33

2 Answers

1

You could try to come up with a regular expression for numbers with leading zeros and then replace the leading zeros.

import re

def remove_leading_zeros(string):
    return re.sub(r'([^\.^\d])0+(\d)', r'\1\2', string)

print(remove_leading_zeros("np.where(x < 02, np.where(x > 01.5025, (0.9), (0.5)), (1))"))

# output: np.where(x < 2, np.where(x > 1.5025, (0.9), (0.5)), (1))

The remove_leading_zeros function finds every occurrence of [^\.^\d]0+\d and removes the zeros. The pattern reads: one character that is neither a digit nor a dot, followed by at least one zero, followed by a digit. (The second ^ inside the brackets is a literal caret, so a caret is excluded as well; [^.\d] would be enough.) The parentheses in the regex mark capture groups, which preserve the character before the leading zeros and the digit after them.


Regarding Csaba Toth's comment:

The problem with 02+03*04 is that the first zero sits at the very beginning of the string, so there is no preceding character for the first capture group to match. The regex can be modified so that the first capture group also matches the beginning of the string:

r"(^|[^\.^\d])0+(\d)"
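
The same idea can also be written with lookarounds instead of capture groups, which covers the start of the string without a special case. This is a minimal sketch rather than the code from the answer: the name fix_expressions comes from the question, LEADING_ZEROS is a hypothetical name, and the pattern assumes a number never directly follows a letter, digit, underscore, or dot.

```python
import re

# Zeros that start a number: not preceded by a word character or a dot,
# and followed by at least one more digit (so "0.5" and a lone "0" stay intact).
LEADING_ZEROS = re.compile(r"(?<![\w.])0+(?=\d)")

def fix_expressions(expression):
    """Strip leading zeros from integer and float literals in a string."""
    return LEADING_ZEROS.sub("", expression)

print(fix_expressions("02+03*04"))
# 2+3*4
print(fix_expressions("np.where(x < 02, np.where(x > 01.5025, (0.9), (0.5)), (1))"))
# np.where(x < 2, np.where(x > 1.5025, (0.9), (0.5)), (1))
```

Because the lookbehind rejects word characters, digits inside an identifier such as x02 are left alone, and the zero in the middle of 5025 is never treated as a leading zero.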

2 Comments

Perfect! Thank you very much :)
One thing this lacks is that it leaves a leading 0 in the string: 02+03*04 would become 02+3*4.
0

You can remove leading 0's in a string using .lstrip()

str_num = "02.02025"

print("Initial string: %s \n" % str_num)

str_num = str_num.lstrip("0")

print("Removing leading 0's with lstrip(): %s" % str_num)
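
On a full expression, lstrip("0") only looks at the start of the whole string, so it has to be applied to each numeric token separately. One possible way to do that, sketched here under the assumption that constants are plain integers or decimals, is to let re.sub call a function for every number it finds. The names NUMBER, strip_token and strip_all are hypothetical.

```python
import re

# A number: digits with an optional fractional part, not glued to a
# preceding word character or dot (so identifiers like x02 are skipped).
NUMBER = re.compile(r"(?<![\w.])\d+(?:\.\d+)?")

def strip_token(match):
    """Apply lstrip("0") to one numeric token, keeping a single zero if needed."""
    stripped = match.group(0).lstrip("0")
    # "000" becomes "" and "0.5" becomes ".5"; restore one zero in both cases.
    if stripped == "" or stripped.startswith("."):
        stripped = "0" + stripped
    return stripped

def strip_all(expression):
    return NUMBER.sub(strip_token, expression)

print(strip_all("np.where(x < 02, np.where(x > 01.5025, (0.9), (0.5)), (1))"))
# np.where(x < 2, np.where(x > 1.5025, (0.9), (0.5)), (1))
```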

7 Comments

This is applicable to cases where I have a string containing only numbers. Notice that my string expressions are more complex. Even if I can detect only numbers containing leading zeros, how can I replace them in the whole expression, without affecting other numbers? Example: replacing "02" with "2" without changing "0.025" to "0.25".
If your goal is to remove leading 0's, lstrip() doesn't mind whether the string is all numbers or a mix of numbers and characters. If str_num were equal to 02.025, it would return 2.025.
I tried: "np.where(x < 02, np.where(x > 01.5025, (0.9), (0.5)), (1))".lstrip("0") and I got: "np.where(x < 02, np.where(x > 01.5025, (0.9), (0.5)), (1))". So either I can't explain my problem, or I don't understand your suggestion...
The string that represents <constant> is what you would want to use the lstrip("0") function on. However, maybe I am misunderstanding what you're asking.
My problem is to detect only those occurrences in the whole string and replace only those ones. Example: I detect "02", apply lstrip() and replace it in the entire string with str.replace("02", "2"). By doing this, the number "0.025" will also be replaced with "0.25", since "02" is a substring of "0.025". Right?