
I have a string in one of these forms:

a='SAD; Happy; ING:train coca'
OR
a='SAD; Happy; ING(train coca'
OR
a='SAD, Happy, ING[train coca'

I need to detect the substring "; ING:", and for that I use this regex:

listRE=re.findall(r';\s*[A-Z]+\s*[\:|\[|\(]\s*[A-Z]+', a)

What I need to do is delete what is between the ; and the : (the delimiters are not always ; and :, as the regex shows).

This is what I do:

for i in listRE:
   p=re.compile(i)
   a=re.sub(p, r'', a)

But it is deleting my text! My target is:

a='SAD; Happy; train coca'

Your help is more than welcome. Thank you.

3 Answers

This does the job:

import re

strs = [
    'SAD; Happy; ING:train coca',
    'SAD; Happy; ING(train coca',
    'SAD, Happy, ING[train coca',
]
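for s in strs:  # avoid shadowing the built-in str
    # After a ';' or ',', drop the upper-case tag and the delimiter that follows it
    x = re.sub(r'(?<=[;,])\s+[A-Z]+[:([]', ' ', s)
    print(x)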

Output:

SAD; Happy; train coca
SAD; Happy; train coca
SAD, Happy, train coca


You don't need to use findall – you can use a regex pattern directly that matches all cases you need. I've also fixed up some of your regex:

import re
a = 'SAD; Happy; ING:train coca'
b = "SAD; Happy; ING(train coca"
c = "SAD, Happy, ING[train coca"

print(re.sub(r'(?<=;|,)(\s*)[^:[(;,]*[:[(]', r'\1', a))
print(re.sub(r"(?<=;|,)(\s*)[^:[(;,]*[:[(]", r"\1", b))
print(re.sub(r"(?<=;|,)(\s*)[^:[(;,]*[:[(]", r"\1", c))

"""
output:
SAD; Happy; train coca
SAD; Happy; train coca
SAD, Happy, train coca
"""
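To see why the findall loop from the question removes too much, here is a minimal sketch. The input with upper-case text after the colon is an assumption made for illustration (the question's own examples use lower-case text there); the variable names are hypothetical too. The question's pattern ends in `\s*[A-Z]+`, so each match also swallows the upper-case text after the delimiter, and deleting the whole match deletes that text as well.

```python
import re

# Pattern from the question: note the trailing \s*[A-Z]+ after the delimiter.
question_pattern = r';\s*[A-Z]+\s*[\:|\[|\(]\s*[A-Z]+'

# Assumed input, with upper-case text after the colon.
sample = 'SAD; Happy; ING:TRAIN coca'

matches = re.findall(question_pattern, sample)
print(matches)   # ['; ING:TRAIN']

# Deleting each full match also deletes the ';' and the text after ':'.
# re.escape is needed because a raw match may contain '(' or '['.
stripped = sample
for m in matches:
    stripped = re.sub(re.escape(m), '', stripped)
print(stripped)  # SAD; Happy coca

# A single substitution with the pattern from this answer keeps both.
fixed = re.sub(r'(?<=;|,)(\s*)[^:[(;,]*[:[(]', r'\1', sample)
print(fixed)     # SAD; Happy; TRAIN coca
```

Without `re.escape`, reusing a match such as `; ING(` as a pattern raises `re.error`, because the parenthesis is left unbalanced.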

3 Comments

Thank you for your help, but that's not exactly what I want.
@Gandalf could you clarify then?
Yes, I will explain. I have 3 cases:
CASE 1: ;or, art1 : art2 art3 ==> ;or, art2 art3
CASE 2: ;or, art1 ( art2 art3 ) ==> ;or, art2 art3
CASE 3: ;or, art1 [ art2 art3 ] ==> ;or, art2 art3

If you also want to match the strings from the comments, you might use

\s+\w+\s?[:([]\s*

In the replacement use a space.
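
As a rough sketch, the pattern above could be applied in Python like this. The helper name `drop_prefix` and the sample strings with `art1` are illustrative assumptions based on the cases in the comments.

```python
import re

# Pattern from this answer; the replacement is a single space.
PREFIX = re.compile(r'\s+\w+\s?[:([]\s*')

def drop_prefix(text):
    """Replace ' word:' / ' word(' / ' word[' (plus surrounding spaces) with one space."""
    return PREFIX.sub(' ', text)

print(drop_prefix('SAD; Happy; ING:train coca'))  # SAD; Happy; train coca
print(drop_prefix('SAD, Happy, ING[train coca'))  # SAD, Happy, train coca
print(drop_prefix('SAD; art1 : art2 art3'))       # SAD; art2 art3
print(drop_prefix('SAD, art1 ( art2 art3 )'))     # SAD, art2 art3 )
```

Note that this pattern only removes the opening delimiter, so a closing bracket stays in the result, as the last line shows.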



If the prefix can be followed either by a colon or by a bracketed part that runs from an opening to a closing bracket, you might use an alternation that matches either the : or one of two capturing groups holding the content to keep between [...] and (...):

\s+\w+\s?(?::|\(\s*([^()]+)\s*\)|\[\s*([^]\[]+)\s*])\s*

In the replacement, use a space followed by both capturing groups: r' \1\2'
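
A possible way to try the bracket-aware pattern in Python is sketched below. It assumes Python 3.5 or later, where a group that did not take part in the match is substituted as an empty string. The helper name `unwrap` and the sample strings are hypothetical, and the `rstrip()` call is my own addition: the greedy capture can keep the space before the closing bracket.

```python
import re

# Pattern from this answer: a colon, or "( ... )" / "[ ... ]" with the inner text captured.
BRACKETED = re.compile(r'\s+\w+\s?(?::|\(\s*([^()]+)\s*\)|\[\s*([^]\[]+)\s*])\s*')

def unwrap(text):
    """Drop the word before ':', '(' or '[' and keep any bracketed content."""
    # Only one of \1 and \2 is non-empty per match (Python 3.5+).
    # rstrip() removes the space captured before a closing bracket.
    return BRACKETED.sub(r' \1\2', text).rstrip()

print(unwrap('A; art1 : art2 art3'))    # A; art2 art3
print(unwrap('A; art1 ( art2 art3 )'))  # A; art2 art3
print(unwrap('A, art1 [ art2 art3 ]'))  # A, art2 art3
```

Unlike the shorter pattern, this one requires a closing bracket, so an unclosed `ING(train coca` is left untouched.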
