I have two lists:

list_1 = ["TP", "MP"]

list_2 = ["This is ABC12378TP0892S3", "This is XYZ12378MP0892S3"]

I want to take the elements of list_1 and search for them in the strings of list_2. If one is found (for example, TP is present in list_2's first string and MP in its second), I want to remove whatever is to the right of TP, MP, etc. and insert a space to the left of it.

I tried the following with re, but it only removes the part on the right and does not insert the space:

[re.sub(r'(' + '|'.join(list_1) + r')\d+', r'\1', string) for string in list_2]

2 Answers

You could compile a regular expression as follows, and then use it to do a sub() on each list entry:

import re

list_1 = ["TP", "MP"]
list_2 = ["This is ABC12378TP0892S3", "This is XYZ12378MP0892S3", "SDTP This is ABC12378TP0892S3"]    

re_sub = re.compile(r'(.*\b\w+)({}).*'.format('|'.join(list_1))).sub
list_2 = [re_sub(r'\1 \2', t) for t in list_2]

print(list_2)

This would display:

['This is ABC12378 TP', 'This is XYZ12378 MP', 'SDTP This is ABC12378 TP']

In this example, the search pattern being used is:

(.*\b\w+)(TP|MP).*

2 Comments

Could you please explain this: [re_sub(r'\1 \2', t) for t in list_2]? How exactly does it work?
The search pattern has two sets of (....). When using sub(), the \1 means to substitute the current location with the contents of the first (), and \2, the contents of the second ().
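
To make the two groups concrete, here is a minimal sketch that prints what each pair of parentheses captures for the first string from the question. The variable names (pattern, text, groups, result) are only illustrative, and the alternation is written out by hand rather than built from list_1.

```python
import re

# Same search pattern as in the answer, with the alternation written out.
pattern = re.compile(r'(.*\b\w+)(TP|MP).*')

text = "This is ABC12378TP0892S3"
match = pattern.match(text)

# Group 1 is everything up to the keyword, group 2 is the keyword itself.
groups = match.groups()
print(groups)   # ('This is ABC12378', 'TP')

# r'\1 \2' rebuilds the string as: group 1, a space, group 2.
# The trailing .* is matched but not captured, so it is dropped.
result = pattern.sub(r'\1 \2', text)
print(result)   # This is ABC12378 TP
```

The list comprehension in the answer simply applies this same substitution to every string in list_2.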

I think you were close. Add the space... r' \1'

Not sure about \d+, either, so replace that with .*

>>> [ re.sub(r'(' +  '|'.join(list_1) + ').*', r' \1', string) for string in list_2 ]
['This is ABC12378 TP', 'This is XYZ12378 MP']
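
As a rough illustration of why \d+ was swapped for .*, the sketch below runs both variants on the first string from the question. The names keywords, digits_only and to_end are made up for this example.

```python
import re

list_1 = ["TP", "MP"]
text = "This is ABC12378TP0892S3"
keywords = '|'.join(list_1)

# \d+ stops at the first non-digit, so the trailing "S3" survives.
digits_only = re.sub(r'(' + keywords + r')\d+', r' \1', text)
print(digits_only)   # This is ABC12378 TPS3

# .* runs to the end of the string, so everything after the keyword is dropped.
to_end = re.sub(r'(' + keywords + r').*', r' \1', text)
print(to_end)        # This is ABC12378 TP
```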

6 Comments

@cricket-007 What if a string contains the substring twice and I only want to insert a space before one of them? For example, list_1 = ["TP", "MP"] and list_2 = ["SDTP This is ABC12378TP0892S3", "This is XYZ12378MP0892S3"], and I don't want to insert a space in SDTP.
If that was your question, then you should have asked that from the start... You can't do the list comprehension, then, because you would need ' \2' instead of ' \1'
@cricket-007 SDTP can occur anywhere in the string. Could you please elaborate?
Sure, but for the regex replacement, order matters... "SDTP This is ABC12378TP", there are two. If you want the space on the "ABC12378TP" part, then you have to do more work (or it just isn't possible)
@cricket-007 Is there any way to use quantifiers for this? Or what if SDTP always occurs at the start of the string?
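
For reference, a small sketch of how the two patterns from this page behave on the SDTP string discussed in these comments. It is not a general solution: it only shows that a greedy leading group moves the match to the last occurrence. The names first_hit and last_hit are illustrative.

```python
import re

text = "SDTP This is ABC12378TP0892S3"

# Pattern from this answer: re.sub scans left to right, so the first "TP"
# (inside "SDTP") is matched and the rest of the string is dropped.
first_hit = re.sub(r'(TP|MP).*', r' \1', text)
print(first_hit)   # SD TP

# Pattern from the other answer: the greedy leading .* pushes the match
# to the last "TP" that has word characters directly before it.
last_hit = re.sub(r'(.*\b\w+)(TP|MP).*', r'\1 \2', text)
print(last_hit)    # SDTP This is ABC12378 TP
```

Note that the second pattern picks the last occurrence rather than skipping SDTP specifically, so it would still split SDTP if that were the only match in the string.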