I have multiple flagged strings:

FLGSTdata1FLGEN
FLGSTdata2FLGEN
...

where FLGST is the start flag and FLGEN is the end flag.

I combine those strings and add some garbage data, so it looks like this:

garbagegarbageFLGSTdata1FLGENFLGSTdata2FLGENgarbagegarbageFLGSTdata3FLGEN...

I need to extract each flagged string from the combined string.

Here is what I've done using re:

>>> pattern = r'5354([A-Za-z0-9_]*)454E' #FLGST = 5354 and FLGEN = 454E
>>> data = re.findall(pattern,stringWithGarbage)
>>> print(data[0])
data1FLGENFLGSTdata2FLGENgarbagegarbageFLGSTdata3

It returns a single match containing everything between the first FLGST and the last FLGEN, instead of one match per flagged string.

So, how do I get each flagged string out of stringWithGarbage?

The expected result would be:

[data1, data2, data3, ...]
1 Answer

Use a positive lookbehind and a positive lookahead, with a non-greedy quantifier between them:

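import re

strg = "garbagegarbageFLGSTdata1FLGENFLGSTdata2FLGENgarbagegarbageFLGSTdata3FLGEN"
# (?<=FLGST) and (?=FLGEN) anchor the match without consuming the flags;
# \S*? is non-greedy, so each match stops at the nearest FLGEN
pattern = re.compile(r'(?<=FLGST)(\S*?)(?=FLGEN)')
print(re.findall(pattern, strg))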

Output

['data1', 'data2', 'data3']
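
The part that matters here is the non-greedy `*?`, not the lookarounds. As a minimal sketch, assuming the flags are the literal strings FLGST and FLGEN as in the example (the names `greedy` and `lazy` are only for illustration), a plain capture group between the flags behaves the same way, and comparing it with the greedy form reproduces the result from the question:

```python
import re

strg = "garbagegarbageFLGSTdata1FLGENFLGSTdata2FLGENgarbagegarbageFLGSTdata3FLGEN"

# Greedy: \S* consumes as much as it can, then backtracks only as far as
# the LAST FLGEN, so everything between the first start flag and the
# final end flag comes back as a single match.
greedy = re.findall(r'FLGST(\S*)FLGEN', strg)

# Non-greedy: \S*? expands one character at a time and stops at the
# NEAREST FLGEN, giving one match per flagged string.
lazy = re.findall(r'FLGST(\S*?)FLGEN', strg)

print(greedy)
print(lazy)
```

If the real flags are hex sequences such as 5354 and 454E, the same idea should carry over by substituting them into the pattern, though that case is not tested here.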