0

I am scraping some information from websites. For example, I am fetching the addresses of some customers:

address = ['Mr Thomas',
 '+(91)-9849633132, 9959455935',
 '+(91)-9849633132',
 '9196358485',
 '8846853128',
 '8-4-236/2']

From the above list I want to ignore the strings starting with +(91), 9, and 8, which are just phone numbers, so I used a regular expression as below:

import re


result = [i for i in address if not re.match(r"[98]\B", i)]

result

['Mr Thomas','+(91)-9849633132, 9959455935','+(91)-9849633132','8-4-236/2']

That is, the strings starting with 9 and 8 followed by more digits are ignored, but I want to ignore the strings starting with +(91) too. Can anyone please let me know how to do this?

0

3 Answers

1

Just add in another check for the +(91), using the | (or) operator. Like so:

>>> [i for i in address if not re.match(r"[98]\B|\+\(91\)\B", i)]
['Mr Thomas', '8-4-236/2']

Note that you have to escape +, (, and ) because those are special characters.
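
As a minimal sketch of the same idea, re.escape can produce the escaped prefix instead of escaping each character by hand. The variable names prefix and pattern are only illustrative and are not part of the answer above; the address list is copied from the question.

```python
import re

address = ['Mr Thomas',
           '+(91)-9849633132, 9959455935',
           '+(91)-9849633132',
           '9196358485',
           '8846853128',
           '8-4-236/2']

# re.escape backslash-escapes the regex special characters in '+(91)',
# so the prefix is matched literally.
prefix = re.escape("+(91)")

# Same alternation as above: a leading 9 or 8 followed by another word
# character, or the literal '+(91)' prefix.
pattern = r"[98]\B|" + prefix + r"\B"

result = [i for i in address if not re.match(pattern, i)]
print(result)
```

This should give the same list as the hand-escaped pattern, and it avoids having to remember which characters are special.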

As an aside, it might be more efficient to use a filter, rather than a list comprehension:

>>> list(filter(lambda x: not re.match(r"[98]\B|\+\(91\)\B", x), address))
['Mr Thomas', '8-4-236/2']

Though I can't be sure.

Edit: Looks like it's not more efficient. However, I find it to be more self documenting, but you can take it as you will.


2 Comments

I was just running timeit when you edited - regardless, any performance difference would be implementation-dependent, and this would probably be premature optimisation anyway. If performance was a problem, using re.compile might be a more fruitful optimisation.
@James I mostly just meant it as a second option. Thank you for confirming though.
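
The re.compile suggestion from the comment could look roughly like the sketch below. The name phone_start is hypothetical, and whether this is measurably faster is an assumption that would need to be timed, since the re module already caches recently used patterns.

```python
import re

address = ['Mr Thomas',
           '+(91)-9849633132, 9959455935',
           '+(91)-9849633132',
           '9196358485',
           '8846853128',
           '8-4-236/2']

# Compile the pattern once, then reuse the compiled object for every string.
# 'phone_start' is an illustrative name, not from the original thread.
phone_start = re.compile(r"[98]\B|\+\(91\)\B")

# Pattern.match anchors at the start of the string, like re.match.
result = [i for i in address if not phone_start.match(i)]
print(result)
```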
0
result = [i for i in address if not re.match(r"\+[98]\B", i)]

2 Comments

That's not working, actually; I am getting the same result.
@shivakrishna That pattern shouldn't match any of the strings in your list, so nothing gets filtered out.
0

This does work:

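 result = [i for i in address if not re.match(r'[+89][-()+0-9/\s]+', i)]

Why? The '\B' assertion is detrimental here: it matches only at a position that is not a word boundary, so '[98]\B' requires the leading digit to be followed directly by another word character, and a number written as '9 849633132' would not be filtered. The pattern above instead allows whitespace and punctuation within the numbers. Note that it also removes '8-4-236/2', because that string starts with 8 and contains only characters from the class.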
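
To see how the two patterns differ, here is a small check on three strings. The sample '9 849633132' is a made-up number with a space after the first digit, and the dictionary name checks is only illustrative.

```python
import re

# Two strings from the question plus one hypothetical number containing a space.
samples = ['9196358485', '8-4-236/2', '9 849633132']

checks = {}
for text in samples:
    # [98]\B: a leading 9 or 8 that is NOT followed by a word boundary,
    # i.e. the next character must be a letter, digit, or underscore.
    with_b = bool(re.match(r"[98]\B", text))
    # Character-class pattern: a leading +, 8, or 9 followed by one or more
    # digits, whitespace, or the listed punctuation characters.
    with_class = bool(re.match(r"[+89][-()+0-9/\s]+", text))
    checks[text] = (with_b, with_class)

for text, flags in checks.items():
    print(repr(text), flags)
```

Which behaviour is preferable depends on whether entries such as '8-4-236/2' should be kept as addresses or treated as numbers.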
