Emacs Lisp: Syntax Table Tutorial

By Xah Lee.

Emacs has a concept of Syntax Table. The basic idea is that each character (every Unicode character) is categorized into a class. For example: letters, punctuation, brackets, chars for programming language identifiers, comment characters, string delimiters, etc.

The syntax table is heavily used in Emacs. For example:

  1. Most cursor movement commands rely on the syntax table. For example, when you call forward-word 【Alt+f】, emacs moves the cursor forward until it reaches a character that's not in the “word” class.
  2. Syntax coloring of strings and comments relies on the syntax table. 〔►see Emacs Lisp: How to Color Comment in Major Mode〕
  3. Programming language comment/uncomment commands also rely on the syntax table.
  4. Lisp mode's parenthesis navigation also depends on the syntax table. 〔►see Emacs: Navigate Lisp Code as Tree〕

Each buffer has its own version of syntax table. Typically, when a major mode is activated, it changes the current buffer's syntax table.

View Current Syntax Table

[Screenshot: Emacs's describe-syntax output. Left is the char. Middle is the char's syntax class abbreviation.]

Call describe-syntax to show current buffer's syntax table.

Syntax Classes

Each syntax class is identified by a 1-char code.

Here are the most important syntax classes and their 1-char codes.

Most Important Syntax Classes
  -   whitespace character.
  w   word. (typically the alphabet A to Z, plus letters of other languages and Chinese characters.)
  _   symbol. (chars of the symbol class, together with chars of the “word” class, make up programming language “identifier” chars.)
  "   string delimiter.
  (   left bracket.
  )   right bracket.
  <   comment start.
  >   comment end.
  \   escape character.

For complete list, see (info "(elisp) Syntax Class Table")
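
As a rough illustration of the class codes above, the function char-syntax returns the 1-char code of a character in the current syntax table. The sketch below looks up a few chars in the standard syntax table. The helper name my-demo-classes is made up for this example.

```elisp
;; Hypothetical helper (not part of Emacs): collect the syntax class
;; codes of a few sample characters, using the standard syntax table.
(defun my-demo-classes ()
  "Return a string of syntax class codes for a few sample characters."
  (with-syntax-table (standard-syntax-table)
    (string (char-syntax ?a)      ; a letter
            (char-syntax ?\s)     ; the space character
            (char-syntax ?\()     ; left parenthesis
            (char-syntax ?\))     ; right parenthesis
            (char-syntax ?\")     ; double quote
            (char-syntax ?\\))))  ; backslash

(my-demo-classes) ; returns "w ()\"\\"
```

Note that char-syntax reports the whitespace class as a space character. In a syntax descriptor string, both space and - stand for whitespace.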

Syntax Descriptor

A syntax descriptor is a lisp string that specifies a syntax class (the 1-char code), a matching character (used only for characters in the bracket class) and flags (used for comment delimiters).

(info "(elisp) Syntax Descriptors")

You'll use Syntax Descriptor with the function modify-syntax-entry, when you create a syntax table. Here's a quick example.

;; set period to have syntax class of symbol
(modify-syntax-entry ?\. "_" my-syn-table)

;; set the «French double quote» to be brackets
(modify-syntax-entry ?\« "(»" my-syn-table)
(modify-syntax-entry ?\» ")«" my-syn-table)

;; set char's syntax for C++ style comment “// …”
(modify-syntax-entry ?\/ ". 12b" my-syn-table)
(modify-syntax-entry ?\n "> b" my-syn-table)

Character Datatype in Elisp

Character datatype in elisp is an integer: the character's Unicode code point. For example, the char a in elisp is just 97, because its code point is 97 in decimal (61 in hexadecimal).

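A char can also be written with the read syntax ?a, for easy reading. ?a means the character a.

Some characters need a backslash in front, because the backslash is an escape character. For example, the backslash char itself is ?\\, and newline is ?\n. Be careful: a backslash before a letter may have a special meaning. For example, ?\a is not the letter a. It is the bell character (code point 7).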

You can also represent char by (string-to-char "a")

To find a char's code point, call describe-char.

(info "(elisp) Basic Char Syntax")
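
Here's a small sketch to check these facts yourself. Each item in the list is a comparison that should come out true. The function name my-demo-char-facts is made up for this example.

```elisp
;; Hypothetical helper: each element compares a char literal with an integer.
(defun my-demo-char-facts ()
  "Return a list of booleans, one per fact about elisp characters."
  (list (= ?a 97)                    ; letter a is decimal 97
        (= ?a #x61)                  ; same value written in hexadecimal
        (= ?\\ 92)                   ; backslash must itself be escaped
        (= ?\n 10)                   ; ?\n is the newline char
        (= ?\a 7)                    ; ?\a is the bell char, not letter a
        (= (string-to-char "a") ?a)  ; first char of a string
        (characterp 97)))            ; any valid code point is a character

(my-demo-char-facts) ; returns (t t t t t t t)
```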

Now let's look at the line

(modify-syntax-entry ?\. "_" my-syn-table)

modify-syntax-entry takes 3 args.

  1. A character.
  2. A syntax descriptor string.
  3. A syntax table. (optional. If omitted, the current buffer's syntax table is modified.)

?\. is the period character “.”.

"_" is the syntax descriptor string. It means the period character is in symbol class.

Now let's look at this line:

(modify-syntax-entry ?\« "(»" my-syn-table)

It means, the character « is in the class of opening bracket, and its matching character is ».
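
To see the bracket entries at work, here's a minimal sketch. It builds a scratch syntax table, then asks emacs for the class and the matching char, and moves over a «…» pair with forward-sexp. The function name my-demo-french-quotes and the sample text are made up for this example.

```elisp
;; Hypothetical helper: register « and » as a bracket pair in a scratch
;; syntax table, then query it.
(defun my-demo-french-quotes ()
  "Return (CLASS MATCH-OF-OPEN MATCH-OF-CLOSE POINT-AFTER-SEXP)."
  (let ((table (make-syntax-table)))
    (modify-syntax-entry ?« "(»" table)
    (modify-syntax-entry ?» ")«" table)
    (with-temp-buffer
      (set-syntax-table table)
      (insert "«a b» c")
      (goto-char (point-min))
      ;; Jump over the whole «a b» group, like over a parenthesized list.
      (forward-sexp)
      (list (char-syntax ?«)      ; the open bracket class code
            (matching-paren ?«)   ; partner of «
            (matching-paren ?»)   ; partner of »
            (point)))))           ; position just after »

(my-demo-french-quotes) ; returns (40 187 171 6), that is (?\( ?» ?« 6)
```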

Now, this line:

(modify-syntax-entry ?\/ ". 12b" my-syn-table)

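Let's look at the syntax descriptor string ". 12b". It means:

  1. The . means the char is in the punctuation class.
  2. The space means the char has no matching character.
  3. The 1 means the char can be the first char of a 2-char comment start.
  4. The 2 means the char can also be the second char of a 2-char comment start.
  5. The b means the comment belongs to the alternative comment style “b”.

The flags in syntax descriptor are very complex. See elisp manual (info "(elisp) Syntax Flags")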

For practical examples of using the syntax flags for comments, see Emacs Lisp: How to Color Comment in Major Mode
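
As a quick check that the two comment entries work together, the sketch below parses some text with a scratch syntax table and asks whether a position is inside a “// …” comment. It uses parse-partial-sexp, whose result has a non-nil element 4 when the position is inside a comment. The helper name my-demo-in-comment-p and the sample text are made up for this example.

```elisp
;; Hypothetical helper: test whether position POS of TEXT falls inside
;; a C++ style line comment, using a scratch syntax table.
(defun my-demo-in-comment-p (text pos)
  "Return non-nil if POS in TEXT is inside a // comment."
  (let ((table (make-syntax-table)))
    (modify-syntax-entry ?/ ". 12b" table)  ; / starts a 2-char comment
    (modify-syntax-entry ?\n "> b" table)   ; newline ends it
    (with-temp-buffer
      (set-syntax-table table)
      (insert text)
      ;; Element 4 of the parser state is non-nil inside a comment.
      (nth 4 (parse-partial-sexp (point-min) pos)))))

(my-demo-in-comment-p "x = 1; // note\ny" 3)   ; returns nil (in code)
(my-demo-in-comment-p "x = 1; // note\ny" 12)  ; returns t (inside “note”)
(my-demo-in-comment-p "x = 1; // note\ny" 16)  ; returns nil (next line)
```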

Basic Facts of Syntax Table

  1. A syntax table is a lookup table, implemented as a char-table (a special kind of vector). You use make-syntax-table and related functions to create it.
  2. Each buffer has its own syntax table. (so, it's like a buffer-local variable, but there's no variable.)
  3. Use set-syntax-table to set a syntax table for current buffer.
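
The facts above can be checked with a short sketch like this one. It creates a table, sets it in one temp buffer, and compares with a fresh temp buffer that still has the default table. The function name my-demo-basic-facts is made up for this example.

```elisp
;; Hypothetical helper: show that a syntax table is a char-table, and
;; that set-syntax-table affects only the current buffer.
(defun my-demo-basic-facts ()
  "Return a list of observations about a freshly made syntax table."
  (let ((table (make-syntax-table)))
    (modify-syntax-entry ?_ "w" table)  ; make underscore a word char
    (list
     (syntax-table-p table)             ; it is a syntax table
     (char-table-p table)               ; and also a char-table
     (with-temp-buffer
       (set-syntax-table table)
       ;; This buffer now uses our table, so _ is in the word class.
       (list (eq (syntax-table) table) (char-syntax ?_)))
     (with-temp-buffer
       ;; A fresh buffer is unaffected: _ is still in the symbol class.
       (char-syntax ?_)))))

(my-demo-basic-facts) ; returns (t t (t 119) 95), that is (t t (t ?w) ?_)
```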

Create Syntax Table

Here's a typical way to create a syntax table.

;; typical way to create and set syntax table

(defvar xpy-mode-syntax-table nil "Syntax table for `xpy-mode'.")

(setq xpy-mode-syntax-table
      (let ( (synTable (make-syntax-table)))

        ;; set/modify each char's class
        (modify-syntax-entry ?# "<" synTable)
        (modify-syntax-entry ?\n ">" synTable)
        ;; more lines here ...

        ;; return it
        synTable))

;; Then put this line inside your mode definition, so that when the user activates your major mode, it sets the syntax table for the current buffer.
(set-syntax-table xpy-mode-syntax-table)
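
For a fuller picture, here's one possible way to put the set-syntax-table call inside a mode definition, using define-derived-mode. This is only a sketch. The mode name xpy-demo-mode and its table are made up for this example and are not the author's actual mode.

```elisp
;; Hypothetical demo mode: # starts a comment, newline ends it.
(defvar xpy-demo-mode-syntax-table
  (let ((table (make-syntax-table)))
    (modify-syntax-entry ?# "<" table)
    (modify-syntax-entry ?\n ">" table)
    table)
  "Syntax table for `xpy-demo-mode' (a made-up example mode).")

(define-derived-mode xpy-demo-mode prog-mode "xpy-demo"
  "Made-up major mode that only sets up a syntax table."
  ;; define-derived-mode also picks up a variable named
  ;; MODE-syntax-table by convention; the explicit call just makes
  ;; the step visible.
  (set-syntax-table xpy-demo-mode-syntax-table))
```

After evaluating this, calling xpy-demo-mode in a buffer makes emacs treat text from # to the end of line as a comment.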

Standard Syntax Table

The function standard-syntax-table returns the standard syntax table.

Standard syntax table is the syntax table used by fundamental-mode.

By default, every syntax table created with make-syntax-table inherits from the standard syntax table.

Syntax Table Inheritance

Emacs syntax table has inheritance. That is, each syntax table you create inherits a parent syntax table. You do not need to set every character's syntax class. When a syntax table does not have entry for a character, it uses the parent table.

The function make-syntax-table has an optional parameter for the parent table. When not specified, the parent is the standard syntax table.

(info "(elisp) Syntax Table Functions")
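
Here's a small sketch of inheritance in action. It makes a parent table and a child table, then shows that the child falls back to the parent for chars it has no entry for. The function name my-demo-inheritance and the choice of the $ char are made up for this example.

```elisp
;; Hypothetical helper: a child syntax table falls back to its parent
;; until the child gets its own entry for a char.
(defun my-demo-inheritance ()
  "Return a list of observations about syntax table inheritance."
  (let* ((parent (make-syntax-table))         ; parent is the standard table
         (child (make-syntax-table parent)))  ; parent is our own table
    (modify-syntax-entry ?$ "w" parent)
    (list
     (eq (char-table-parent parent) (standard-syntax-table))
     (eq (char-table-parent child) parent)
     (with-syntax-table child (char-syntax ?$))  ; inherited from parent
     (with-syntax-table child (char-syntax ?a))  ; inherited from standard
     (progn
       ;; Give the child its own entry. The parent is not affected.
       (modify-syntax-entry ?$ "_" child)
       (list (with-syntax-table child (char-syntax ?$))
             (with-syntax-table parent (char-syntax ?$)))))))

(my-demo-inheritance) ; returns (t t 119 119 (95 119)), that is (t t ?w ?w (?_ ?w))
```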

Syntax Table Topic

  1. Emacs Lisp: Syntax Table Tutorial
  2. Emacs Lisp: How to Find Syntax of a Character?
  3. Emacs Lisp: How to Modify Syntax Table Temporarily
  4. Emacs Lisp: How to Determine If Cursor is Inside String or Comment
  5. Emacs Lisp: Find Matching Bracket Character