1

First of all, I am not the one writing the regexps, so I can't just rewrite them. I am pulling in several JavaScript regexps and trying to parse them in Python, but the two languages seem to treat them differently. Testing the example regexp from W3Schools, JavaScript shows this:

var str="Visit W3Schools";
var patt1=/w3schools/i;
alert(str.match(patt1))

which alerts "W3Schools". However, in Python, I get:

import re
str="Visit W3Schools"
patt1=re.compile(r"/w3schools/i")
print patt1.match(str)

which prints None. Is there some library I can use to convert the Javascript regexps to Python ones?

2

3 Answers

4

In Python, .match only matches at the start of the string.

What you want to use instead is .search.

Moreover, you do not need to include the '/' characters, and you need to pass a separate argument to re.compile to make the search case-insensitive:

>>> import re
>>> str = "Visit W3Schools"
>>> patt1 = re.compile('w3schools', re.I)
>>> print patt1.search(str)
<_sre.SRE_Match object at 0x10088e1d0>

In JavaScript, the slashes delimit a regex literal, which is roughly the equivalent of calling re.compile in Python.
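
To make the difference concrete, here is a minimal sketch contrasting the two methods on the string from the question. It is written for Python 3 (the session above is Python 2), and the variable names are only illustrative:

```python
import re

text = "Visit W3Schools"

# No slashes in the pattern; case-insensitivity is passed as a flag
pattern = re.compile(r"w3schools", re.IGNORECASE)

# match() only tries at position 0, so it fails: the text starts with "Visit"
print(pattern.match(text))

# search() scans the whole string and finds the substring
found = pattern.search(text)
print(found.group())

# Keeping the JavaScript delimiters makes Python look for literal "/" characters
literal_slashes = re.compile(r"/w3schools/i")
print(literal_slashes.search(text))
```

The first and third calls print None, while the second prints the matched substring.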

I recommend reading up on Python's regular expression module (re); there is even an excellent HOWTO.

2

You could write a small helper function so that /ig also works:

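import re

def js_to_py_re(rx):
    # Split '/pattern/flags' into the pattern body and the trailing flags
    query, params = rx[1:].rsplit('/', 1)
    # 'g' asks for all matches, so findall is the closest equivalent
    if 'g' in params:
        obj = re.findall
    else:
        obj = re.search

    # May need to make flags= smarter, but just an example...
    return lambda L: obj(query, L, flags=re.I if 'i' in params else 0)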

print js_to_py_re('/o/i')('school')
# <_sre.SRE_Match object at 0x2d8fe68>

print js_to_py_re('/O/ig')('school')
# ['o', 'o']

print js_to_py_re('/O/g')('school')
# []
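
One way the flags= handling might be made smarter is sketched below for Python 3. The function name compile_js_literal and the flag table are assumptions for illustration, not part of any library; only i, m and s are mapped, g is reported separately because Python has no equivalent flag, and any other JavaScript flag (such as u or y) is rejected rather than guessed at.

```python
import re

# Hypothetical mapping from JavaScript flag letters to Python re flags
_JS_FLAGS = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL}

def compile_js_literal(literal):
    """Turn a '/body/flags' literal into (compiled_pattern, is_global)."""
    if not literal.startswith("/"):
        raise ValueError("expected a literal of the form /body/flags")
    # The last '/' separates the body from the flags
    body, sep, flag_chars = literal[1:].rpartition("/")
    if not sep:
        raise ValueError("missing closing '/'")

    flags = 0
    is_global = False
    for ch in flag_chars:
        if ch == "g":
            is_global = True  # caller picks findall/finditer instead of search
        elif ch in _JS_FLAGS:
            flags |= _JS_FLAGS[ch]
        else:
            raise ValueError("unsupported JavaScript flag: %r" % ch)
    return re.compile(body, flags), is_global

pattern, is_global = compile_js_literal("/O/ig")
result = pattern.findall("school") if is_global else pattern.search("school")
print(result)
```

This only translates the delimiters and flags; the pattern body is passed through unchanged, so any syntax that differs between the two engines still needs separate handling.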

2 Comments

This doesn't work for regexes with named groups. Unfortunately the JS style for named groups is different from that in Python.
@RJH not sure I'm following - could you provide an example of where it wouldn't work?
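
For illustration, the named-group difference presumably refers to JavaScript writing (?<name>...) and \k<name> where Python's re module expects (?P<name>...) and (?P=name). A rough Python 3 sketch of a textual rewrite follows; the function name convert_named_groups is hypothetical, and the approach is naive in that it does not account for escaped parentheses or character classes that happen to contain the same characters.

```python
import re

def convert_named_groups(js_pattern):
    """Rewrite JavaScript named groups and backreferences to Python syntax."""
    # (?<name> becomes (?P<name>, but lookbehinds (?<= and (?<! are left alone
    converted = re.sub(r"\(\?<(?![=!])", "(?P<", js_pattern)
    # \k<name> becomes (?P=name)
    converted = re.sub(r"\\k<([A-Za-z_]\w*)>", r"(?P=\1)", converted)
    return converted

js_date = r"(?<year>\d{4})-(?<month>\d{2})"
py_date = convert_named_groups(js_date)
m = re.search(py_date, "released 2024-05")
print(m.group("year"), m.group("month"))
```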
1

You don't want to include the / characters and flags in the regexp, and you should use .search instead of .match for a substring match.

Try:

patt1 = re.compile(r"w3schools", flags=re.IGNORECASE)
srch = patt1.search(str)
print srch.group()
