
I need an analog of the `search` method of Python's compiled regular expression object. It takes three arguments (the text, a start position, and an end position) and returns a Match object that has start and end fields.

I've got a function that returns a Match object, but I have no idea how to implement endIndex in it. I'm worried about performance and very reluctant to use the substring method. Is there a JavaScript feature I can use in this case? Also, is there a library that provides an API similar to Python's re module?

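function search(str, startIndex, endIndex) {
    // Rebuild the pattern with the 'g' flag so that exec() honours lastIndex.
    // The RegExp property is spelled `multiline`, not `multiLine`.
    var re = new RegExp(this.matcher.source, 'g' + (this.matcher.ignoreCase ? 'i' : '') + (this.matcher.multiline ? 'm' : ''));

    re.lastIndex = startIndex || 0;
    var value = re.exec(str);

    if (!value)
        return null;

    // value.index is the start of the match; endIndex is not handled yet.
    var start = value.index;
    var end = start + value[0].length;

    return new Match(start, end);
}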
  • What are you trying to do exactly? Perform a regex search on a specific part of the string? You should just use substring; it's not going to be a performance problem. Commented Aug 16, 2013 at 15:31
  • I need to port a Python text search engine to JavaScript. In this function I need to perform a regexp search within a specific part of the string, between two indices. Commented Aug 16, 2013 at 15:52
  • Then try substring and see how it performs. Just put this at the start of your function: str = str.substring(startIndex, endIndex); then do the rest (well, maybe validate startIndex and endIndex first). Commented Aug 16, 2013 at 15:58

1 Answer


Since the JavaScript RegExp object does not offer any built-in way to stop a search at an end index, and JavaScript does not allow any pointer magic, you have no choice but to use substring. However, unless you are expecting gigantic strings, I wouldn't worry too much about substring's performance. Substring is basically a memory copy, which is an incredibly optimized operation at the hardware level (think L1-3 caches, CPU extensions that allow copying 128 bits per clock cycle, etc.).
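A minimal sketch of the substring approach, under a few assumptions: `searchSlice` is a hypothetical name, the pattern is passed in as `matcher` instead of being read from `this.matcher`, and a plain `{start, end}` object stands in for the question's Match type. Only the end of the string is cut off, while the start is still handled through lastIndex. This mirrors Python, where `rx.search(string, 0, 50)` is equivalent to `rx.search(string[:50], 0)`, and it keeps the reported indices absolute.

```javascript
// Sketch: search `str` from startIndex (inclusive) up to endIndex (exclusive).
// `searchSlice` and the returned {start, end} shape are illustrative assumptions.
function searchSlice(matcher, str, startIndex, endIndex) {
    var end = endIndex === undefined ? str.length : endIndex;

    // 'g' is required so that exec() starts scanning at lastIndex.
    var flags = 'g' + (matcher.ignoreCase ? 'i' : '') + (matcher.multiline ? 'm' : '');
    var re = new RegExp(matcher.source, flags);
    re.lastIndex = startIndex || 0;

    // Cut off everything at and after endIndex; the start of the string is
    // kept, so value.index is already an index into the original string.
    var value = re.exec(str.substring(0, end));
    if (!value)
        return null;

    return { start: value.index, end: value.index + value[0].length };
}
```

For example, `searchSlice(/ab/, 'xxabxxab', 3, 8)` skips the first occurrence and reports the second one, while an end index of 3 yields `null` because no complete `ab` fits before it.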

Just for my amusement I offer some creative alternatives to substring:

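  1. Keep your lastIndex trick, but add `.{m,n}$` to the end of your regex (inside a lookahead, `(?=.{m,n}$)`, if the tail should not become part of the match; note that `{m, n}` with a space is not a quantifier in JavaScript):

    • let m be str.length - endIndex,
    • and let n be str.length - lastIndex.
  2. Use a regex engine written in JavaScript that has built-in substring scanning.

  3. Submit an RFC to Ecma International.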
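The first alternative can be sketched roughly as follows. This is a variation on the idea, not the exact pattern from the list: it uses a lookahead with `[\s\S]` instead of `.` so that newlines in the tail are counted too, and it drops the upper bound n, which the lastIndex start already implies. The name `searchWithLookahead` and the `{start, end}` return shape are assumptions made for illustration.

```javascript
// Sketch: enforce endIndex by requiring that at least
// (str.length - endIndex) characters remain after the match.
function searchWithLookahead(matcher, str, startIndex, endIndex) {
    var end = endIndex === undefined ? str.length : Math.min(endIndex, str.length);
    var tail = str.length - end; // "m" in the list above

    var flags = 'g' + (matcher.ignoreCase ? 'i' : '') + (matcher.multiline ? 'm' : '');
    // The non-capturing group keeps alternations in the original pattern
    // intact and does not shift the numbering of its capture groups.
    var re = new RegExp('(?:' + matcher.source + ')(?=[\\s\\S]{' + tail + ',}$)', flags);

    re.lastIndex = startIndex || 0;
    var value = re.exec(str);
    if (!value)
        return null;

    // The lookahead consumes nothing, so value[0] is just the match itself.
    return { start: value.index, end: value.index + value[0].length };
}
```

Unlike a real substring, the pattern still sees the text beyond endIndex, so `$`, `\b`, and lookaheads inside the original pattern can behave differently near the boundary. The lookahead also has to scan the tail for each candidate match, which is why this is likely to be slower than simply calling substring.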


2 Comments

Could you explain what `.{n}$` is? I can't understand how it works.
In the following example: regex101.com/r/eA7cT0 the string 'ab' is matched only if at least 3 characters follow it up to the end of the string. This has the same effect as cutting the last 3 characters off the string and simply matching 'ab'. It's completely ridiculous and doesn't perform at all, but it works :)
