Emacs: Regex Tutorial

By Xah Lee.

This page is a tutorial on emacs regex. Regex lets you find text patterns.

Regex Commands

The most commonly used command that uses regex is query-replace-regexp. 〔➤see Emacs: Find and Replace Commands〕

There are many other useful ones. You can list them all by calling apropos-command, then typing “regex”.

Emacs Regex Syntax

Here are commonly used patterns.

Pattern            Meaning
.                  Any single character except newline ("\n").
\.                 One period
[0-9]+             One or more digits
[^0-9]+            One or more characters that are not digits
[A-Za-z]+          One or more letters
[-A-Za-z0-9]+      One or more {letter, digit, hyphen}
[_A-Za-z0-9]+      One or more {letter, digit, underscore}
[-_A-Za-z0-9]+     One or more {letter, digit, hyphen, underscore}
[[:ascii:]]+       One or more ASCII chars (codepoint 0 to 127, inclusive)
[[:nonascii:]]+    One or more non-ASCII characters (for example, Unicode characters)
[\n\t ]+           One or more {newline character, tab, space}
"\([^"]+\)"        Capture text between double quotes
+                  Match previous pattern 1 or more times
*                  Match previous pattern 0 or more times
?                  Match previous pattern 0 or 1 time
+?                 Match previous pattern 1 or more times, but with minimal match (aka non-greedy)

Boundary anchors:

Pattern            Meaning
^…                 Beginning of {line, string, buffer}
…$                 End of {line, string, buffer}
\`…                Beginning of {string, buffer}
…\'                End of {string, buffer}
\b                 Word boundary marker
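
As a rough sketch of how a few of these patterns behave in Emacs Lisp code: the helper name my-regex-first-match and the sample strings below are made up for illustration, not part of emacs. Note that inside a lisp string, every backslash of the regex must be doubled.

```elisp
;; Hypothetical helper, not a built-in.
(defun my-regex-first-match (regex text &optional group)
  "Return the part of TEXT matched by REGEX, or nil if there is no match.
Optional GROUP selects a capture group. Default is 0, the whole match."
  (when (string-match regex text)
    (match-string (or group 0) text)))

;; [0-9]+ : one or more digits
(my-regex-first-match "[0-9]+" "id=42 name=\"Joe Smith\"")              ; "42"

;; "\([^"]+\)" : capture text between double quotes (group 1)
(my-regex-first-match "\"\\([^\"]+\\)\"" "id=42 name=\"Joe Smith\"" 1)  ; "Joe Smith"

;; + is greedy, +? is minimal match
(my-regex-first-match "<.+>" "<b>bold</b>")     ; "<b>bold</b>"
(my-regex-first-match "<.+?>" "<b>bold</b>")    ; "<b>"

;; \` and \' anchor to the beginning and end of the string
(my-regex-first-match "\\`[A-Za-z]+" "abc def") ; "abc"
(my-regex-first-match "[A-Za-z]+\\'" "abc def") ; "def"
```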

Unicode characters can be used literally. For non-printable ones such as “RIGHT-TO-LEFT MARK”, you can represent them by a code. See: Emacs: Newline Representation ^M ^J ^L

For the complete list of regex syntax, see: (info "(elisp) Syntax of Regexps")

Matching Newline and Tab

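When using interactive commands, emacs won't understand \n or \t. To enter a literal newline at the prompt, press Ctrl+q Ctrl+j. To enter a literal tab, press Ctrl+q Tab.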

(For explanation, see: Emacs's Key Syntax Explained).
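
In Emacs Lisp code the situation is different. The lisp reader turns \n and \t inside a double-quoted string into real newline and tab characters before the regex engine sees them, so they can be written directly. Here is a small sketch. The function name my-collapse-whitespace is made up for this example.

```elisp
;; Hypothetical helper, not a built-in.
(defun my-collapse-whitespace (text)
  "Replace each run of newline, tab, or space in TEXT with one space."
  ;; "\n" and "\t" are converted by the lisp reader, not by the regex engine.
  (replace-regexp-in-string "[\n\t ]+" " " text))

(my-collapse-whitespace "a \n\t b\n\nc")   ; "a b c"
```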

Case Sensitivity

By default, matching is not case sensitive, so [a-z] also matches capital letters. Case sensitivity is controlled by the variable case-fold-search. Call toggle-case-fold-search to toggle it.

Do not use [A-z], because that'll match some punctuation chars too. Use [A-Za-z].
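
The two points above can be checked in lisp code by binding case-fold-search around a match. This is only a sketch. The function name my-match-p is made up for this example.

```elisp
;; Hypothetical helper, not a built-in.
(defun my-match-p (regex text fold)
  "Return t if REGEX matches somewhere in TEXT, else nil.
FOLD is the value given to `case-fold-search' during the match."
  (let ((case-fold-search fold))
    (if (string-match regex text) t nil)))

(my-match-p "[a-z]+" "ABC" t)      ; t, because case is ignored
(my-match-p "[a-z]+" "ABC" nil)    ; nil, because case matters

;; In ASCII, the chars [ \ ] ^ _ ` sit between Z and a,
;; so the range A-z includes them.
(my-match-p "[A-z]" "_" nil)       ; t
(my-match-p "[A-Za-z]" "_" nil)    ; nil
```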

Perl Regex vs Emacs Regex

Here are some major practical differences.

For example, Perl's \d+ is emacs's [[:digit:]]+.

Warning: the meaning of a character class in emacs depends on the current major mode's syntax table. For example, what chars are considered “word” in [[:word:]] depends on how it's defined in the syntax table of the current major mode.

For an example showing the difference, see: Emacs Lisp: Regex Patterns and Syntax Table.

Syntax tables are hard to work with, and a regex that depends on them may be unpredictable. It is best to put the chars you want explicitly in your regex, ➢ for example: [A-Za-z].
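
Here is a sketch of how the same regex can give different results under different syntax tables. The function name my-word-match-end and the custom table are made up for this example. It assumes the standard syntax table treats underscore as a symbol char, not a word char.

```elisp
;; Hypothetical helper, not a built-in.
(defun my-word-match-end (text table)
  "Return the end position of the first [[:word:]]+ match in TEXT.
The match is done with syntax TABLE in effect."
  (with-syntax-table table
    (when (string-match "[[:word:]]+" text)
      (match-end 0))))

(let ((custom (make-syntax-table)))      ; new table, inherits from the standard one
  (modify-syntax-entry ?_ "w" custom)    ; make underscore a word char in this table only
  (list
   (my-word-match-end "foo_bar" (standard-syntax-table)) ; 3, match stops before _
   (my-word-match-end "foo_bar" custom)))                ; 7, _ counts as word

;; An explicit char set does not depend on any syntax table.
(string-match "[_A-Za-z]+" "foo_bar")    ; 0
(match-end 0)                            ; 7
```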

Interactive Emacs Regex Mode

Emacs has an interactive regex mode. It shows matches as you type. To go into the mode, call regexp-builder. (I don't use this.)

Alternatively, call query-replace-regexp to test your pattern. I prefer this.

Regex in Emacs Lisp Code

Emacs Lisp: Regex Tutorial


(info "(emacs) Regexps")
