String matching in Emacs Lisp: matching an arbitrary string

Question

In Emacs Lisp I only know the functions string-match and string-match-p, and I know of no way to match one string literally against another.

E.g. assume that I have a string generated by some function and want to know whether another string contains it. In many cases string-match-p will work fine, but when the generated string contains regexp syntax, the result is unexpected behaviour, or even an error if the regexp syntax it contains is invalid (e.g. unbalanced quoted parentheses \(, \)).

Is there a function in Emacs Lisp that is similar to string-match-p but doesn't interpret regular expression syntax?
As regexp matching is implemented in C, I assume that matching the correct regexp is faster than some substring/string= loop. Is there some way to escape an arbitrary string into a regular expression that matches that string and only that string?

Trey Jackson · Accepted Answer · 2013-06-18 17:37:14Z

8

Are you looking for regexp-quote?

The docs say:

(regexp-quote STRING)

Return a regexp string which matches exactly STRING and nothing else.

And I'm not sure that the assumption in your second question is correct; string= should be faster...
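As a minimal sketch of how regexp-quote could be combined with string-match-p for a literal substring test (the helper name my-string-contains-p is hypothetical, not an Emacs built-in, and binding case-fold-search to nil is an assumption that a case-sensitive match is wanted):

```elisp
;; Hypothetical helper; the name is made up for illustration.
(defun my-string-contains-p (needle haystack)
  "Return the index of the first literal occurrence of NEEDLE in HAYSTACK, or nil."
  ;; Assumption: the match should be case-sensitive.
  (let ((case-fold-search nil))
    ;; regexp-quote escapes every regexp-special character in NEEDLE,
    ;; so the resulting regexp matches NEEDLE literally.
    (string-match-p (regexp-quote needle) haystack)))

(my-string-contains-p "foo\\(bar" "---foo\\(bar") ; => 3
(my-string-contains-p "a.c" "abc")                ; => nil, "." is not a wildcard here
```

Without regexp-quote, the first call would signal an invalid-regexp error because of the unbalanced \( in the needle.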


1 Comment

kdb Over a year ago

Thanks, that helped in most cases. Also, for a 68-character query string and a ≈300-character string to search in, a string-contains implementation using regexp-quote was 40 times faster in compiled code and 60 times faster in uncompiled code. For "string-starts-with" and "string-ends-with", however, a naive (string= (substring ...) ...) implementation was 4 times faster in compiled code (4 times slower in uncompiled code) for similar input sizes.
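The naive (string= (substring ...) ...) approach mentioned in this comment might look roughly like the following sketch. The function names are hypothetical and this is not the commenter's actual code; Emacs also ships the built-ins string-prefix-p and string-suffix-p for the same purpose.

```elisp
;; Hypothetical helpers; names are made up for illustration.
(defun my-string-starts-with-p (prefix string)
  "Return non-nil if STRING begins with the literal PREFIX."
  (let ((n (length prefix)))
    ;; Guard against PREFIX being longer than STRING, which would
    ;; make `substring' signal an args-out-of-range error.
    (and (<= n (length string))
         (string= (substring string 0 n) prefix))))

(defun my-string-ends-with-p (suffix string)
  "Return non-nil if STRING ends with the literal SUFFIX."
  (let ((n (length suffix))
        (len (length string)))
    (and (<= n len)
         (string= (substring string (- len n)) suffix))))

(my-string-starts-with-p "foo\\(" "foo\\(bar") ; => t
(my-string-ends-with-p "bar" "foo\\(bar")      ; => t
(my-string-starts-with-p "bar" "foo\\(bar")    ; => nil
```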

sds · Accepted Answer · 2013-06-18 22:33:33Z

4

Either use regexp-quote as recommended by @trey-jackson, or don't use strings at all.

Emacs is not optimized for string handling; it is optimized for buffers. So, if you manipulate text, you might find it faster to create a temporary buffer, insert your text there, and then use search-forward to find your fixed string (non-regexp) in that buffer.
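A rough sketch of the buffer-based approach, assuming the text starts out as a string and that a case-sensitive search is wanted (the helper name my-buffer-contains-p is hypothetical):

```elisp
;; Hypothetical helper; the name is made up for illustration.
(defun my-buffer-contains-p (needle haystack)
  "Return the 0-based index of the literal NEEDLE in HAYSTACK, or nil."
  (with-temp-buffer
    (insert haystack)
    (goto-char (point-min))
    ;; Assumption: the search should be case-sensitive.
    (let ((case-fold-search nil))
      ;; `search-forward' treats NEEDLE as plain text, not as a regexp.
      ;; The third argument t makes it return nil instead of signaling
      ;; an error when nothing is found.
      (when (search-forward needle nil t)
        ;; Buffer positions start at 1, string indices at 0.
        (- (match-beginning 0) (point-min))))))

(my-buffer-contains-p "foo\\(bar" "---foo\\(bar") ; => 3
(my-buffer-contains-p "a.c" "abc")                ; => nil
```

If the text already lives in a buffer, the with-temp-buffer and insert steps are unnecessary and search-forward can be called directly.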


1 Comment

kdb Over a year ago

On my system the overhead of with-temp-buffer was about 2e-6 seconds (0.2sec for executing (with-temp-buffer (+ 1 1)) 10,000 times). While pretty negligible, for short strings (≈10-30 characters) that overhead is ≈100-1000 times larger than the time the string functions (built-ins I tried and some I wrote myself) take to execute. I'll readily believe you, though, that buffers give better performance for long strings, so thanks for the answer!
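A comparison like this could be reproduced with benchmark-run from the benchmark library. The sketch below is not the commenter's measurement setup; the sample strings and the function name my-bench-compare are assumptions chosen only for illustration, and the timings will vary by machine and by whether the code is byte-compiled.

```elisp
(require 'benchmark)

;; Assumed sample inputs, chosen only for illustration.
(defvar my-bench-needle "foo\\(bar")
(defvar my-bench-haystack "---foo\\(bar")

;; Hypothetical helper; the name is made up for illustration.
(defun my-bench-compare (repetitions)
  "Return (STRING-TIME . BUFFER-TIME) in seconds for REPETITIONS runs each."
  (cons
   ;; String-based: quote the needle, then regexp-match it.
   (car (benchmark-run repetitions
          (string-match-p (regexp-quote my-bench-needle) my-bench-haystack)))
   ;; Buffer-based: pays for creating and filling a temp buffer each run.
   (car (benchmark-run repetitions
          (with-temp-buffer
            (insert my-bench-haystack)
            (goto-char (point-min))
            (search-forward my-bench-needle nil t))))))

(my-bench-compare 10000)
```

The car of each benchmark-run result is the total elapsed time in seconds, so dividing by the repetition count gives a per-call estimate.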

score 0 · Accepted Answer · 2013-06-19 16:02:19Z

0

Perhaps cl-mismatch, an analogue of the Common Lisp mismatch function? Example usage below:

(require 'cl-lib) ; provides the cl- prefixed sequence functions
(cl-mismatch "abcd" "abcde")
;; 4
(cl-mismatch "abcd" "aabcd" :from-end t)
;; -1
(cl-mismatch "abcd" "aabcd" :start2 1)
;; nil

Ah, sorry, I didn't understand the question the first time. If you want to know whether the string is a substring of another string (it may start at any index in the searched string), then you could use cl-search, again an analogue of the Common Lisp search function.

(cl-search "foo\\(bar" "---foo\\(bar") ; needs (require 'cl-lib)
;; 3

edited Jun 19, 2013 at 16:02

answered Jun 18, 2013 at 19:43

user797257
