Replacing a certain character with below pattern using RegEx in python

Question

I have strings as below:

s1 = "My email Id is abcd@g mail.com"
s2 = "john@ hey.com is my email id"
s3 = "id is rock@gmail .com"
s4 = "The id is sam @yahoo.in"

I need to remove the blank space inside each email ID using regex. How can I achieve this?

I tried

s = re.sub(r'@\w*[\s]+[\w]*\.', r'', s1)

which is giving me output as:

'My email Id is abcdcom'

Output should be:

'My email Id is abcd@gmail.com'

I'm not sure how to remove only the blank space with re.sub.

Any suggestions are welcome.

Thanks,
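
The attempt above deletes too much because re.sub replaces the entire match, and the pattern matches "@g mail." rather than only the space. A minimal sketch of one possible fix, assuming the space always sits after the @ and before the dot: capture the text on both sides of the whitespace and put it back with backreferences. The helper name close_gap_after_at is made up for illustration.

```python
import re

def close_gap_after_at(text):
    # Group 1: "@" plus any word characters before the gap.
    # Group 2: any word characters after the gap, up to and including the dot.
    # Only the whitespace between the two groups is dropped.
    return re.sub(r'(@\w*)\s+(\w*\.)', r'\1\2', text)

print(close_gap_after_at("My email Id is abcd@g mail.com"))
print(close_gap_after_at("john@ hey.com is my email id"))
print(close_gap_after_at("id is rock@gmail .com"))
# A space before the "@" is outside this pattern, so this string is unchanged.
print(close_gap_after_at("The id is sam @yahoo.in"))
```

This sketch leaves the fourth string untouched, so it covers only part of the question.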

blhsing · Accepted Answer · 2018-08-02 09:42:01Z

2

You can pass a callable as the replacement argument of re.sub: the pattern matches each email address together with its stray space, and the callable strips the spaces from the matched text.

import re
l = [
    "My email Id is abcd@g mail.com",
    "john@ hey.com is my email id",
    "id is rock@gmail .com",
    "The id is sam @yahoo.in"
]
for s in l:
    print(re.sub(r'[\w.-]+ ?@(?:[\w-]+\.[\w -]+|[\w -]+\.[\w-]+)', lambda e: e[0].replace(' ', ''), s))

This outputs:

My email Id is abcd@gmail.com
john@hey.com is my email id
id is rock@gmail.com
The id is sam@yahoo.in
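
To make the callable easier to follow, here is a sketch that names the pieces. The pattern is copied from the answer; the names EMAIL_WITH_GAP, strip_spaces and fix_email are illustrative assumptions. It also shows one caveat: the first alternative allows spaces after the dot, so an address that is already well formed can swallow the words that follow it.

```python
import re

# Pattern copied from the answer above.
EMAIL_WITH_GAP = re.compile(r'[\w.-]+ ?@(?:[\w-]+\.[\w -]+|[\w -]+\.[\w-]+)')

def strip_spaces(match):
    # match.group(0) is the full matched text, the same value as e[0] in the lambda.
    return match.group(0).replace(' ', '')

def fix_email(text):
    # re.sub calls strip_spaces once per match and inserts its return value.
    return EMAIL_WITH_GAP.sub(strip_spaces, text)

print(fix_email("id is rock@gmail .com"))        # id is rock@gmail.com

# Caveat: "com is my email id" is matched by [\w -]+ after the dot,
# so every space in that run is removed.
print(fix_email("john@hey.com is my email id"))  # john@hey.comismyemailid
```

If the input may contain addresses without any stray space, the character class after the dot would need to be tightened for this approach to be safe.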

edited Aug 2, 2018 at 9:42

answered Aug 2, 2018 at 9:24

blhsing


2 Comments

Sociopath Over a year ago

I made a small change to the question. It works fine if the email ID is at the end, but if the email ID is not at the end it also removes the spaces between the other words. See the 2nd string in the question.

blhsing Over a year ago

I see. Edited my answer accordingly then.

Andrej Kesely · Accepted Answer · 2018-08-02 09:35:40Z

1

You can use backreferences in re.sub:

import re

data = [
"My email Id is abcd@g mail.com",
"Email Id: defg@yah oo.com",
"id is rock@gmail .com"
]

for s in data:
    print(re.sub(r'(@.*)(\s+)(.*)', r'\1\3', s))

Prints:

My email Id is abcd@gmail.com
Email Id: defg@yahoo.com
id is rock@gmail.com
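
As a rough illustration of what the three groups capture, the sketch below inspects the match before substituting. The variable names are arbitrary. Because the first .* is greedy, the whitespace that gets dropped is the last run after the @, which is worth keeping in mind when there is more than one gap.

```python
import re

pattern = re.compile(r'(@.*)(\s+)(.*)')

m = pattern.search("Email Id: defg@yah oo.com")
# Group 1 is everything from "@" up to the last whitespace run,
# group 2 is that whitespace, group 3 is the rest of the line.
print(m.groups())   # ('@yah', ' ', 'oo.com')

# \1\3 keeps groups 1 and 3 and leaves out the whitespace in group 2.
print(pattern.sub(r'\1\3', "Email Id: defg@yah oo.com"))   # Email Id: defg@yahoo.com

# With two gaps, only the last one is removed in a single pass.
print(pattern.sub(r'\1\3', "x@a b c.com"))   # x@a bc.com
```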

EDIT:

If the blank space is before the @, the regex is a little trickier, because it must not match a string such as "aaa bbb ccc [email protected]":

import re

data = [
"My email Id is ab [email protected]",
"Email Id: def [email protected]",
"id is roc [email protected]",
"aaa bbb ccc [email protected]"
]

for s in data:
    print(re.sub(r'(?=is|:)(.*)\s+(.*@.*)', r'\1\2', s))

Prints:

My email Id is [email protected]
Email Id: [email protected]
id is [email protected]
aaa bbb ccc [email protected]
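
The (?=is|:) part is a zero-width lookahead: a match can only start where the text continues with "is" or ":", which is how plain word sequences without such a marker are left alone. A small sketch of that behaviour follows; the sample strings are illustrative assumptions, not the answer's original data.

```python
import re

pattern = re.compile(r'(?=is|:)(.*)\s+(.*@.*)')

samples = [
    "id is roc k@gmail.com",        # hypothetical input with "is" before the address
    "Email Id: def g@yahoo.com",    # hypothetical input with ":" before the address
    "aaa bbb ccc d@example.com",    # no "is" or ":", so the lookahead never succeeds
]

for s in samples:
    # Group 1 runs from the marker to the last whitespace before the "@" part;
    # that whitespace is dropped and group 2 is appended directly.
    print(pattern.sub(r'\1\2', s))
```

The approach depends on the marker words, so text that introduces the address differently would need a different lookahead.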

Now we can combine these regexes:

import re

data = [
"My email Id is ab [email protected]",
"Email Id: def g@ya hoo.com",
"id is roc k@gm ail.com",
"aaa bbb ccc [email protected]"
]

for s in data:
    s = re.sub(r'(@.*)\s+(.*)', r'\1\2', s)
    s = re.sub(r'(?=is|:)(.*)\s+(.*@.*)', r'\1\2', s)
    print(s)

Will print:

My email Id is [email protected]
Email Id: defg@yahoo.com
id is rock@gmail.com
aaa bbb ccc [email protected]

edited Aug 2, 2018 at 9:35

answered Aug 2, 2018 at 9:17

Andrej Kesely


3 Comments

Sociopath Over a year ago

I have edited my question to add one more case: the blank space can also come before the @.

Sociopath Over a year ago

It doesn't handle the case where the email ID is not at the end. See the 2nd string in my question.

Andrej Kesely Over a year ago

@AkshayNevrekar Just combine these regexes, see my updated answer
