2

In Emacs Lisp I only know the functions string-match and string-match-p, but I know of no way to match a literal string against another string.

E.g. assume that I have a string generated by some function and want to know whether another string contains it. In many cases string-match-p will work fine, but when the generated string contains regexp syntax, it will behave unexpectedly, and may even signal an error if the regexp syntax it contains is invalid (e.g. unbalanced quoted parentheses \(, \)).

  1. Is there some function in Emacs Lisp that is similar to string-match-p but doesn't interpret regular expression syntax?
  2. As regexp matching is implemented in C, I assume that matching the correct regexp is faster than some substring/string= loop. Is there some method to escape an arbitrary string into a regular expression that matches that string and only that string?

3 Answers

8

Are you looking for regexp-quote?

The docs say:

(regexp-quote STRING)

Return a regexp string which matches exactly STRING and nothing else.

I'm not sure your assumption in #2 is correct, though; string= should be faster...
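As a minimal sketch of how regexp-quote can be combined with string-match-p for a literal substring test: the helper name my-string-contains-p is hypothetical (it is not part of Emacs), and binding case-fold-search to nil is an assumption that a case-sensitive match is wanted.

```elisp
;; Hypothetical helper, not an Emacs built-in.
(defun my-string-contains-p (needle haystack)
  "Return the index of the first literal occurrence of NEEDLE in HAYSTACK.
Return nil if NEEDLE does not occur in HAYSTACK."
  ;; Assumption: the match should be case-sensitive.
  (let ((case-fold-search nil))
    ;; regexp-quote escapes every regexp special character in NEEDLE,
    ;; so the resulting regexp matches NEEDLE literally.
    (string-match-p (regexp-quote needle) haystack)))

(my-string-contains-p "foo\\(bar" "---foo\\(bar") ; returns 3
(my-string-contains-p "a.c" "abc")                ; returns nil
```

Note that a match at index 0 is still a true value in Emacs Lisp, so the result can be used directly as a condition.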


1 Comment

Thanks, that helped in most cases. Also, for a 68-character query string and a ≈300-character string to search in, a string-contains implementation using regexp-quote was 40 times faster in compiled code and 60 times faster in uncompiled code. For "string-starts-with" and "string-ends-with", however, a naive (string= (substring ...) ...) implementation was 4 times faster in compiled code (4 times slower in uncompiled code) for similar input sizes.
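The naive (string= (substring ...) ...) approach mentioned in this comment might look roughly like the sketch below. The function names are hypothetical, and the commenter's actual code is not shown, so this is only one plausible reading of it.

```elisp
;; Hypothetical helpers illustrating the substring/string= approach.
(defun my-string-starts-with-p (prefix string)
  "Return t if STRING begins with the literal string PREFIX."
  (let ((len (length prefix)))
    ;; Guard against PREFIX being longer than STRING, which would
    ;; make `substring' signal an args-out-of-range error.
    (and (<= len (length string))
         (string= prefix (substring string 0 len)))))

(defun my-string-ends-with-p (suffix string)
  "Return t if STRING ends with the literal string SUFFIX."
  (let ((len (length suffix)))
    (and (<= len (length string))
         (string= suffix (substring string (- (length string) len))))))

(my-string-starts-with-p "a.c" "a.cdef") ; returns t
(my-string-starts-with-p "a.c" "abcdef") ; returns nil
(my-string-ends-with-p "\\)" "foo\\)")   ; returns t
```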
4

Either use regexp-quote as recommended by @trey-jackson, or don't use strings at all.

Emacs is not optimized for string handling; it is optimized for buffers. So, if you manipulate text, you might find it faster to create a temporary buffer, insert your text there, and then use search-forward to find your fixed string (non-regexp) in that buffer.
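A minimal sketch of the temporary-buffer approach, assuming a case-sensitive search is wanted. The helper name my-buffer-contains-p is hypothetical; with-temp-buffer, insert and search-forward are standard Emacs functions.

```elisp
;; Hypothetical helper, not an Emacs built-in.
(defun my-buffer-contains-p (needle haystack)
  "Return the 0-based index of literal NEEDLE in HAYSTACK, or nil.
The search is done in a temporary buffer with `search-forward'."
  (with-temp-buffer
    (insert haystack)
    (goto-char (point-min))
    ;; Assumption: the search should be case-sensitive.
    (let ((case-fold-search nil))
      ;; `search-forward' treats NEEDLE as plain text, not as a regexp.
      ;; The third argument t makes it return nil instead of signaling
      ;; an error when nothing is found.
      (when (search-forward needle nil t)
        ;; Buffer positions start at (point-min); convert to a string index.
        (- (match-beginning 0) (point-min))))))

(my-buffer-contains-p "foo\\(bar" "---foo\\(bar") ; returns 3
(my-buffer-contains-p "a.c" "abc")                ; returns nil
```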

1 Comment

On my system the overhead of with-temp-buffer was about 2e-5 seconds per call (0.2 sec for executing (with-temp-buffer (+ 1 1)) 10,000 times). While pretty negligible, for short strings (≈10-30 characters) that overhead is ≈100-1000 times larger than the time the string functions (builtins I tried and some I wrote myself) take to execute. I'll readily believe, though, that for long strings buffers give better performance, so thanks for the answer!
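A timing of this kind can be approximated with the benchmark-run macro from the standard benchmark library. The sketch below is not the commenter's actual measurement; the sample strings and the variable names are made up for illustration, and the resulting numbers depend on the machine and on whether the code is byte-compiled.

```elisp
(require 'benchmark)

;; `benchmark-run' returns a list (ELAPSED-SECONDS GC-COUNT GC-SECONDS).

;; Cost of creating and discarding a temporary buffer 10,000 times.
(defvar my-temp-buffer-timing
  (benchmark-run 10000
    (with-temp-buffer (+ 1 1))))

;; Cost of a literal substring test on a short string 10,000 times.
(defvar my-string-match-timing
  (benchmark-run 10000
    (string-match-p (regexp-quote "a.c") "xxxxxxxxxxa.cxxxxxxxxxx")))

;; Approximate per-call time in seconds for each variant.
(/ (car my-temp-buffer-timing) 10000.0)
(/ (car my-string-match-timing) 10000.0)
```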
0

Perhaps cl-mismatch, an analogue of the Common Lisp mismatch function? Example usage below:

(require 'cl-lib)
(cl-mismatch "abcd" "abcde")
;; 4
(cl-mismatch "abcd" "aabcd" :from-end t)
;; -1
(cl-mismatch "abcd" "aabcd" :start2 1)
;; nil

Ah, sorry, I didn't understand the question the first time. If you want to know whether the string is a substring of another string (it may start at any index in the searched string), then you could use cl-search, again an analogue of the Common Lisp search function.

(cl-search "foo\\(bar" "---foo\\(bar")
;; 3
