How to parse a Javascript regexp in Python?

Question

First of all, I am not the one writing the regexps, so I can't just rewrite them. I am pulling in several JavaScript regexps and trying to parse them in Python, but the two languages seem to behave differently. Testing the example regexp from W3Schools, JavaScript shows this:

var str="Visit W3Schools";
var patt1=/w3schools/i;
alert(str.match(patt1))

which alerts "W3Schools". However, in Python, I get:

import re
str="Visit W3Schools"
patt1=re.compile(r"/w3schools/i")
print patt1.match(str)

which prints None. Is there some library I can use to convert the JavaScript regexps to Python ones?

Look up .match vs. .search.

– Martijn Pieters, Commented Jun 27, 2012 at 16:19
Please be careful using w3schools.

– Pointy, Commented Jun 27, 2012 at 16:23

Martijn Pieters · Accepted Answer · 2012-06-27 16:32:21Z

4

In Python, .match only matches at the start of the string.

What you want to use instead is .search.

Moreover, you do not need to include the '/' characters, and you need to use a separate argument to re.compile to make the search case insensitive:

>>> import re
>>> str = "Visit W3Schools"
>>> patt1 = re.compile('w3schools', re.I)
>>> print(patt1.search(str))
<re.Match object; span=(6, 15), match='W3Schools'>

In JavaScript, the slashes delimit a regex literal, which is the equivalent of calling re.compile; they are not part of the pattern itself.

I recommend reading up on the Python regular expression module; there is even an excellent HOWTO.
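
To make the difference between the two calls concrete, here is a minimal sketch using the string from the question. The variable names are only for illustration.

```python
import re

text = "Visit W3Schools"
pattern = re.compile("w3schools", re.I)  # re.I plays the role of JavaScript's /i flag

# match() only tries the pattern at position 0, so it fails here
at_start = pattern.match(text)

# search() scans the whole string, like JavaScript's str.match(/.../)
anywhere = pattern.search(text)

print(at_start)          # None
print(anywhere.group())  # W3Schools
```

Both calls return None when nothing matches, so check the result before calling .group() on it.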

edited Jun 27, 2012 at 16:32

answered Jun 27, 2012 at 16:22

Martijn Pieters


Jon Clements · Answer · 2012-06-27 17:00:57Z

2

You could write a small helper function so that flag combinations such as /ig also work:

import re

def js_to_py_re(rx):
    # Split '/pattern/flags' into the pattern body and the flag letters
    query, params = rx[1:].rsplit('/', 1)
    # 'g' (global) returns every match; otherwise only the first one
    if 'g' in params:
        obj = re.findall
    else:
        obj = re.search

    # May need to make flags= smarter, but just an example...
    return lambda L: obj(query, L, flags=re.I if 'i' in params else 0)

print(js_to_py_re('/o/i')('school'))
# <re.Match object; span=(3, 4), match='o'>

print(js_to_py_re('/O/ig')('school'))
# ['o', 'o']

print(js_to_py_re('/O/g')('school'))
# []
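
As one possible way to make the flag handling "smarter", the sketch below maps a few JavaScript flag letters to Python re flags and reports the g flag separately, since re has no flag for it. The names JS_FLAG_MAP and parse_js_regex are hypothetical, and the mapping only covers i, m and s; other letters are silently ignored here, which may not be what you want.

```python
import re

# Assumed mapping from JavaScript flag letters to Python re flags.
# 'g' has no re flag equivalent, so it is reported separately.
JS_FLAG_MAP = {
    "i": re.IGNORECASE,
    "m": re.MULTILINE,
    "s": re.DOTALL,
}

def parse_js_regex(literal):
    """Split a '/pattern/flags' string into (compiled_pattern, is_global)."""
    if not literal.startswith("/"):
        raise ValueError("expected a literal of the form /pattern/flags")
    body, flag_letters = literal[1:].rsplit("/", 1)
    flags = 0
    for letter in flag_letters:
        flags |= JS_FLAG_MAP.get(letter, 0)  # unknown letters are ignored
    return re.compile(body, flags), "g" in flag_letters

compiled, is_global = parse_js_regex("/w3schools/i")
print(compiled.search("Visit W3Schools").group())  # W3Schools
print(is_global)                                   # False
```

The caller can then pick compiled.findall or compiled.search depending on is_global. Note that this only translates the delimiters and flags; the pattern body is passed through unchanged, so any syntax that differs between the two engines still needs separate handling.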

answered Jun 27, 2012 at 17:00

Jon Clements


2 Comments

RJH Over a year ago

This doesn't work for regexes with named groups. Unfortunately the JS style for named groups is different from that in Python.

Jon Clements Over a year ago

@RJH not sure I'm following - could you provide an example of where it wouldn't work?
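
For context on the named-group point: JavaScript writes a named group as (?<name>...) and a named backreference as \k<name>, while Python's re module expects (?P<name>...) and (?P=name). A rough sketch of a textual conversion follows. The function name convert_named_groups is hypothetical, and the substitution is deliberately naive: it does not account for escaped parentheses or for text inside character classes.

```python
import re

def convert_named_groups(js_pattern):
    """Rewrite JS-style named groups and backreferences in Python syntax."""
    # (?<name>  ->  (?P<name>   (the lookahead skips lookbehinds (?<= and (?<!)
    py_pattern = re.sub(r"\(\?<(?![=!])", "(?P<", js_pattern)
    # \k<name>  ->  (?P=name)
    py_pattern = re.sub(r"\\k<(\w+)>", r"(?P=\1)", py_pattern)
    return py_pattern

converted = convert_named_groups(r"(?<year>\d{4})-(?<month>\d{2})")
print(converted)  # (?P<year>\d{4})-(?P<month>\d{2})

found = re.search(converted, "Posted 2012-06-27")
print(found.group("year"), found.group("month"))  # 2012 06
```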

mrb · Answer · 2012-06-27 16:24:08Z

1

You don't want to include the / characters and flags in the regexp, and you should use .search instead of .match for a substring match.

Try:

import re
str = "Visit W3Schools"  # the string from the question
patt1 = re.compile(r"w3schools", flags=re.IGNORECASE)
srch = patt1.search(str)
print(srch.group())  # W3Schools

answered Jun 27, 2012 at 16:24

mrb
