30

How do you do a query-replace-regexp in Emacs that will match across multiple lines?

As a trivial example, I'd want <p>\(.*?\)</p> to match:

<p>foo
bar
</p>
4
  • 1
    I assume you saw emacswiki: emacswiki.org/emacs/MultilineRegexp Commented Aug 20, 2009 at 22:11
  • 1
    Yeah, I saw that but couldn't get it to work using query-replace-regexp. Still trying, though, using re-builder to test it... hopefully I'll figure it out soon. Commented Aug 20, 2009 at 22:21
  • 1
    The example is very bad, because parsing HTML with regular expressions is generally not a good idea. Commented Aug 20, 2009 at 22:48
  • 4
    Obviously there's a difference between trying to parse (e.g. de-serialize or scrape) HTML with regex and using it to save time and typing while editing. Commented Feb 11, 2013 at 15:43

2 Answers

25
M-x re-builder

is your friend. And it led me to this regular expression:

"<p>\\(.\\|\n\\)*</p>"

which is the string version of

<p>\(.\|^J\)*</p>         ;# where you enter ^J by C-q C-j

That works for me with re-search-forward, but not with query-replace-regexp. I'm unsure why...

Now, when doing an incremental regexp search (isearch-forward-regexp, bound to C-M-s or C-u C-s), you can type M-%, which will prompt you for a replacement (as of Emacs 22). So you can use that to do your search and replace with the above regexp.

Note that the above regexp is greedy: it will match up to the last </p> found in the buffer, which is probably not what you want, so use re-builder to build a regexp that comes closer to what you need. Obviously regular expressions can't count parentheses (or nested tags), so you're on your own for that; it depends on how robust a solution you want.
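One way to avoid running through to the last </p> is to make the repetition non-greedy with `*?`. The following is only a minimal sketch in Emacs Lisp, not the answerer's own code: the helper name `my/replace-paragraphs` is hypothetical, and it assumes the <p> elements are not nested. It uses a shy group `\(?:...\)` so that only the body of the element is captured as group 1.

```elisp
;; Sketch: replace every (possibly multi-line) <p>...</p> span in a string.
;; `my/replace-paragraphs' is a hypothetical helper, not part of Emacs.
(defun my/replace-paragraphs (text replacement)
  "Return TEXT with each <p>...</p> span replaced by REPLACEMENT.
REPLACEMENT may refer to the element body as \\1."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    ;; \\(?:.\\|\n\\)*? matches any character, including newline,
    ;; non-greedily, so each match stops at the nearest </p>.
    (while (re-search-forward "<p>\\(\\(?:.\\|\n\\)*?\\)</p>" nil t)
      ;; FIXEDCASE is t; LITERAL is nil, so \\1 expands to group 1.
      (replace-match replacement t))
    (buffer-string)))

(my/replace-paragraphs "<p>foo\nbar\n</p> x <p>baz</p>" "[\\1]")
;; => "[foo\nbar\n] x [baz]"
```

The same pattern, typed interactively, would be `<p>\(\(?:.\|^J\)*?\)</p>` with ^J entered via C-q C-j. On very large spans the alternation can exhaust the regexp stack, so treat this as a starting point to test in re-builder.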


5 Comments

Are there info files for re-builder? I'm curious about how to use it.
Not that I can find. The Emacs Wiki doesn't have much on it either. But it's pretty self-explanatory (isn't all of Emacs :). After entering re-builder, type C-c C-h and you'll get a listing of bindings including those that apply to re-builder which all begin with C-c.
Yah, I got that far. Was just looking for something a bit more in depth. Thanks!
“Search failed with status 0: grep: Unmatched ( or \(“
@Marvin it's unclear exactly what command you are executing, but it's not something written entirely in ELisp. grep is an external utility and AFAIK doesn't support multiline matches.
24

Try character classes. As long as you're using only the ASCII character set, you can use [[:ascii:]] instead of the dot. Using the longer [[:ascii:][:nonascii:]] ought to work for everything.
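As a rough illustration of the character-class approach, here is a small Emacs Lisp sketch. The function name `my/first-paragraph-body` is hypothetical, and the non-greedy `*?` is an addition of this sketch (it is also what one of the comments below recommends) so that the match ends at the first </p>.

```elisp
;; Sketch: extract the body of the first <p>...</p>, across newlines.
;; `my/first-paragraph-body' is a hypothetical helper, not part of Emacs.
(defun my/first-paragraph-body (text)
  "Return the text between the first <p> and the nearest </p> in TEXT.
Return nil when TEXT contains no such element."
  ;; [[:ascii:][:nonascii:]] matches any character, newline included;
  ;; *? keeps the match as short as possible.
  (when (string-match "<p>\\([[:ascii:][:nonascii:]]*?\\)</p>" text)
    (match-string 1 text)))

(my/first-paragraph-body "<p>foo\nbär\n</p><p>x</p>")
;; => "foo\nbär\n"
```

For interactive use with query-replace-regexp, the corresponding pattern would be `<p>\([[:ascii:][:nonascii:]]*?\)</p>`, which needs no literal newline typed into the minibuffer.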

4 Comments

And if you're not using just ASCII?
[[:ascii:][:nonascii:]]* gives me stack overflow
@helcim you should make it non-greedy by adding a ? at the end.
“Search failed with status 0: grep: Invalid character class name“
