I need to create a regex string based on a given pattern of digits.

If the pattern is 222243243, the string to create is "2{4,6}[43]+2{1,3}[43]+". The logic is: find each run of consecutive 2's in the pattern, count its length, and allow up to two more 2's. This pattern contains two runs of 2's: the first has four 2's and the second has one. So the first run becomes 2{4,6} (4 to 4+2) and the second becomes 2{1,3} (1 to 1+2). Wherever there are 3's or 4's, [43]+ is added.

workings:

import re

data = '222243243'
TwosStart = []   # start position of each run of 2's
TwosEnd = []     # end position (exclusive) of each run of 2's
TwoLength = []   # length of each run of 2's

for match in re.finditer(r'2+', data):
    s = match.start()  # start position of this run
    e = match.end()    # end position of this run (exclusive)
    d = e - s          # number of 2's in this run
    print(s, e, d)
    TwosStart.append(s)
    TwosEnd.append(e)
    TwoLength.append(d)

Using the code above, I know how many runs of 2's a given pattern contains, along with their start and end positions, but I don't know how to build the string automatically from that information.

Examples:

If the pattern is '222243243', the string should be "2{4,6}[43]+2{1,3}[43]+"

If the pattern is '222432243', the string should be "2{3,5}[43]+2{2,4}[43]+"

If the pattern is '22432432243', the string should be "2{2,4}[43]+2{1,3}[43]+2{2,4}[43]+"

2 Answers

One approach is to use itertools.groupby:

from itertools import groupby

s = "222243243"

result = []
for key, group in groupby(s, key=lambda c: c == "2"):
    if key:
        size = sum(1 for _ in group)  # length of this run of 2's
        result.append(f"2{{{size},{size+2}}}")
    else:
        result.append("[43]+")

pattern = "".join(result)
print(pattern)

Output

2{4,6}[43]+2{1,3}[43]+
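
As a rough check, the same idea can be wrapped in a function and run against all three examples from the question. This is only a sketch: the name `build_pattern` and the `slack` parameter are made up for illustration, and it assumes every character that is not a 2 should collapse into `[43]+`, even if it is some other digit.

```python
import re
from itertools import groupby


def build_pattern(digits: str, slack: int = 2) -> str:
    """Hypothetical helper: build the regex string described in the question."""
    parts = []
    # groupby splits the string into alternating runs of 2's and non-2's
    for is_two, group in groupby(digits, key=lambda c: c == "2"):
        if is_two:
            size = sum(1 for _ in group)  # length of this run of 2's
            parts.append(f"2{{{size},{size + slack}}}")
        else:
            parts.append("[43]+")  # assumption: any non-2 run maps to [43]+
    return "".join(parts)


# Expected strings are taken from the question
examples = {
    "222243243": "2{4,6}[43]+2{1,3}[43]+",
    "222432243": "2{3,5}[43]+2{2,4}[43]+",
    "22432432243": "2{2,4}[43]+2{1,3}[43]+2{2,4}[43]+",
}

for digits, expected in examples.items():
    pattern = build_pattern(digits)
    assert pattern == expected
    # the generated regex should also match the string it was built from
    assert re.fullmatch(pattern, digits)
```

The `re.fullmatch` call is just a sanity check that each generated pattern accepts its own input.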

Using your base code:

import re
data='222243243'
cpy=data
offset=0 # each 'cpy' modification offsets the addition

for match in re.finditer('2+', data):
    s = match.start() # 2's start position
    e = match.end() # 2's end position
    d = e-s
    regex = "]+2{" + str(d) + "," + str(d+2) + "}["
    cpy = cpy[:s+offset] + regex + cpy[e+offset:]
    offset+=len(regex)-d

# the borders can end up with stray bracket characters
if cpy[0] == ']':
    cpy = cpy[2:]  # remove leading "]+"
else:
    cpy = '[' + cpy

if cpy[-1] == '[':
    cpy = cpy[:-1]  # remove trailing "["
else:
    cpy += "]+"

print(cpy)

Output

2{4,6}[43]+2{1,3}[43]+
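
If you want to stay with `re` but avoid the offset bookkeeping, one possible variant is to let `re.sub` call a function for every run. This is a sketch rather than a drop-in replacement: the names `build_pattern_sub` and `replace_run` are made up here, and unlike the code above it always emits a literal `[43]+` for a non-2 run instead of copying the original characters into the brackets.

```python
import re


def build_pattern_sub(digits: str) -> str:
    """Hypothetical helper: rewrite each run of characters via re.sub."""

    def replace_run(match: re.Match) -> str:
        run = match.group()
        if run[0] == "2":
            n = len(run)  # number of 2's in this run
            return f"2{{{n},{n + 2}}}"
        return "[43]+"  # assumption: any run of non-2 characters maps to [43]+

    # "2+" matches a run of 2's, "[^2]+" matches a run of anything else
    return re.sub(r"2+|[^2]+", replace_run, digits)


print(build_pattern_sub("222243243"))
```

Because the replacement is a function, its return value is inserted as is, so the braces and brackets need no escaping.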
