Elisp: Regex Tutorial

By Xah Lee.

This page is a tutorial on using regex in emacs lisp code.


Regex Syntax

If you are not familiar with emacs regex syntax, first see:

Emacs: Regex Tutorial

Test Regex in Elisp Code

One simple way to test regex is to create a file with the following content:

(re-search-forward "yourRegex")

whatever text to search here

Then, put your cursor to the right of the closing parenthesis and call Alt+x eval-last-sexp 【Ctrl+x Ctrl+e】. If your regex matches, the cursor moves to the end of the matched text. If you get a lisp error saying search failed, your regex didn't match. If you get a lisp syntax error, you probably got the backslashes wrong.
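The same test can also be run without a scratch file. Here is a minimal sketch, assuming a helper named my-regex-test (a hypothetical name, not part of emacs) that searches a temporary buffer and returns the matched text instead of moving the cursor:

```elisp
;; Hypothetical helper for illustration; the name my-regex-test is an assumption.
(defun my-regex-test (regex text)
  "Return the text matched by REGEX in TEXT, or nil if there is no match."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    ;; Third argument t: return nil on failure instead of signaling an error.
    (when (re-search-forward regex nil t)
      (match-string 0))))

(my-regex-test "[0-9]+" "width 795 px")   ; returns "795"
(my-regex-test "[0-9]+" "no digits here") ; returns nil
```

Passing t as the third argument of re-search-forward avoids the search failed error, so a failed match shows up as nil.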

Newline Character and Tab

Inside an elisp string, \t is the TAB char (Unicode codepoint 9), and \n is the newline char (Unicode codepoint 10). You can use [\t\n ]+ for a sequence of {tab, newline, space}.

When a file is opened in Emacs, newline is always \n, regardless of whether your file is from {Unix, Windows, Mac}. Do NOT manually do find replace on newline chars to change a file's newline convention. [see Emacs: Newline Representations ^M ^J ^L]
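As a small illustration of the [\t\n ]+ pattern, the following sketch collapses every run of tabs, newlines, and spaces into one space. The function name my-collapse-whitespace is made up for this example:

```elisp
;; Hypothetical helper for illustration.
(defun my-collapse-whitespace (text)
  "Replace each run of tab, newline, or space in TEXT with a single space."
  ;; \t and \n are string escapes, so they need no extra backslash here.
  (replace-regexp-in-string "[\t\n ]+" " " text))

(my-collapse-whitespace "a \t\n  b\n\nc") ; returns "a b c"
```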

Double Backslash in Lisp Code

A regex string in emacs lisp needs lots of double backslashes.

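First of all, remember these escape sequences of elisp string syntax:

\t for tab, \n for newline, \" for double quote, \\ for one backslash.

Then, any other backslash needs to be doubled.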

Example,

The regex \( a.d\), a capturing group for a space followed by words like “and”, “add”, “aid”,

becomes "\\( a.d\\)"

Also, to match a regex special character literally (such as a square bracket), you need a double backslash in front of it, because a double backslash in a string represents 1 backslash.

Example:

To match any lowercase English letter, do "[a-z]"

But to match square bracket literally, you need: "\\[citation\\]"
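The difference can be checked with string-match, which returns the index where the match starts. A short sketch using the patterns above; the sample sentences and the my-demo- names are made up for illustration:

```elisp
;; Hypothetical helpers wrapping the examples, for illustration only.
(defun my-demo-literal-bracket ()
  "Match the literal text [citation]; returns the match start index."
  (string-match "\\[citation\\]" "needs a [citation] here"))

(defun my-demo-char-set ()
  "Without backslashes, [citation] is a char set of the letters c i t a o n."
  (string-match "[citation]" "needs a [citation] here"))

(defun my-demo-group ()
  "Return what the capturing group \\( a.d\\) captured."
  (let ((text "salt and pepper"))
    (when (string-match "\\( a.d\\)" text)
      (match-string 1 text))))

(my-demo-literal-bracket) ; returns 8, the position of the [
(my-demo-char-set)        ; returns 0, because "n" is in the char set
(my-demo-group)           ; returns " and"
```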

Here's a bigger example. Suppose you have this text:

<img src="cat.jpg" alt="my cat" width="795" height="183" />

When you call a command such as list-matching-lines, you can type the regex in the prompt as is. Example:

<img src="\([^"]+?\)" alt="\([^"]+?\)" width="\([0-9]+\)" height="\([0-9]+\)" />

But in lisp code, the same regex needs to have many backslash escapes, like this:

(re-search-forward
"<img src=\"\\([^\"]+?\\)\" alt=\"\\([^\"]+?\\)\" width=\"\\([0-9]+\\)\" height=\"\\([0-9]+\\)\" />" )
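To see what the four capturing groups give you, here is a sketch that runs the same regex on a string and collects the groups. The function name my-img-attributes is hypothetical:

```elisp
;; Hypothetical helper for illustration.
(defun my-img-attributes (tag)
  "Return (SRC ALT WIDTH HEIGHT) parsed from the img TAG string, or nil."
  (when (string-match
         "<img src=\"\\([^\"]+?\\)\" alt=\"\\([^\"]+?\\)\" width=\"\\([0-9]+\\)\" height=\"\\([0-9]+\\)\" />"
         tag)
    ;; After string-match, match-string needs the searched string as 2nd arg.
    (list (match-string 1 tag)
          (match-string 2 tag)
          (string-to-number (match-string 3 tag))
          (string-to-number (match-string 4 tag)))))

(my-img-attributes
 "<img src=\"cat.jpg\" alt=\"my cat\" width=\"795\" height=\"183\" />")
;; returns ("cat.jpg" "my cat" 795 183)
```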

(info "(elisp) Regular Expressions")

Use emacs to Convert Regex to Elisp Regex String

There is an easy way to get the backslashes right.

  1. First, Alt+x list-matching-lines to do what you want.
  2. Immediately call repeat-complex-command. The elisp regex syntax will be shown in minibuffer. (with all correct backslashes if needed)
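A similar conversion can be done in lisp code. This is only a sketch of an alternative, not what repeat-complex-command does internally: prin1-to-string prints a string as an elisp string literal, escaping each backslash and double quote. Both function names below are hypothetical.

```elisp
;; Hypothetical helpers for illustration.
(defun my-regex-to-elisp-string (regex)
  "Return REGEX printed as an elisp string literal.
Backslashes and double quotes are escaped by the lisp printer."
  (prin1-to-string regex))

(defun my-insert-regex-as-string (regex)
  "Prompt for REGEX, typed as in an interactive command, and insert it
at point as an elisp string literal."
  (interactive (list (read-string "Regex: ")))
  (insert (my-regex-to-elisp-string regex)))
```

For example, call Alt+x my-insert-regex-as-string and type \([0-9]+\) at the prompt. It inserts "\\([0-9]+\\)" at the cursor position.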

Unicode Representation in String

Elisp: Unicode Escape Sequence

Find Replace Text

Elisp: Find Replace String in Buffer

Regex in Elisp Syntax: rx Package

There is an elisp package rx that uses lisp style syntax to represent regex syntax.

(require 'rx)

;; this
(rx (one-or-more blank) line-end)

;; returns this
;; "[[:blank:]]+$"

I do not recommend it, because it's a middleman. Just learn and use raw regex directly.
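If you want to compare the two styles yourself, here is a sketch that removes trailing blanks both ways. Both function names are hypothetical; the two give the same result:

```elisp
(require 'rx)

;; Hypothetical helpers for illustration.
(defun my-trim-trailing-blanks-rx (text)
  "Remove blanks at the end of each line in TEXT, with the regex built by rx."
  (replace-regexp-in-string (rx (one-or-more blank) line-end) "" text))

(defun my-trim-trailing-blanks-raw (text)
  "Remove blanks at the end of each line in TEXT, using a raw regex string."
  (replace-regexp-in-string "[[:blank:]]+$" "" text))

(my-trim-trailing-blanks-rx "foo  \t")  ; returns "foo"
(my-trim-trailing-blanks-raw "foo  \t") ; returns "foo"
```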

Regex and Syntax Table

Warning: the meaning of some character classes in emacs depends on the current major mode's syntax table. For example, what chars are considered “word” in [[:word:]] depends on how it's defined in the syntax table of the current major mode.

For an example showing the difference, see: Elisp: Regex Patterns and Syntax Table

Syntax table is hard to work with, and regex using it may be unpredictable. Best is just to put the chars you want explicitly in your regex, for example, [A-Za-z].
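Here is a minimal sketch of the effect. The same regex [[:word:]]+ is run on the same text under two syntax tables. The function name my-first-match and the custom table are made up for this example, and it assumes the standard syntax table, where underscore is not a word char:

```elisp
;; Hypothetical helper for illustration.
(defun my-first-match (regex text table)
  "Return the first match of REGEX in TEXT while syntax TABLE is in effect."
  (with-temp-buffer
    (set-syntax-table table)
    (insert text)
    (goto-char (point-min))
    (when (re-search-forward regex nil t)
      (match-string 0))))

(defun my-syntax-table-demo ()
  "Return matches for \"foo_bar\" under a plain and a modified syntax table."
  (let ((plain (make-syntax-table))   ; inherits the standard syntax table
        (custom (make-syntax-table)))
    ;; In the custom table, make underscore a word char.
    (modify-syntax-entry ?_ "w" custom)
    (list (my-first-match "[[:word:]]+" "foo_bar" plain)
          (my-first-match "[[:word:]]+" "foo_bar" custom)
          ;; An explicit char set gives the same result in both tables.
          (my-first-match "[A-Za-z_]+" "foo_bar" plain)
          (my-first-match "[A-Za-z_]+" "foo_bar" custom))))

(my-syntax-table-demo) ; returns ("foo" "foo_bar" "foo_bar" "foo_bar")
```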

