How to Write an Emacs Major Mode for Syntax Coloring

By Xah Lee.

This page shows you how to write an Emacs major mode that does syntax coloring for your own language.

[Figure: Emacs mymath major mode, syntax coloring your own language]

Problem

You are writing a major mode for a new language. You want the keywords of the language to be syntax colored.

Suppose your language source code looks like this:

Sin[x]^2 + Cos[y]^2 == 1
Pi^2/6 == Sum[1/x^2,{x,1,Infinity}]

You want the words “Sin”, “Cos”, and “Sum” colored as functions, and “Pi” and “Infinity” colored as constants.

Solution

Save the following in a file.

;; a simple major mode, mymath-mode

;; each entry pairs a regex with the face used for its matches
(setq mymath-highlights
      '(("Sin\\|Cos\\|Sum" . font-lock-function-name-face)
        ("Pi\\|Infinity" . font-lock-constant-face)))

(define-derived-mode mymath-mode fundamental-mode "mymath"
  "Major mode for editing mymath language code."
  ;; tell font-lock which keyword list to use in this mode
  (setq font-lock-defaults '(mymath-highlights)))

Now, open the file in Emacs (or paste the code into a buffer), then Alt+x eval-buffer.

Now, type the following code into a new buffer:

Sin[x]^2 + Cos[y]^2 == 1
Pi^2/6 == Sum[1/x^2,{x,1,Infinity}]

Now, Alt+x mymath-mode. You will see the words colored.

How Does it Work?

The string "Sin\\|Cos\\|Sum" is a regex. font-lock-function-name-face is a predefined variable whose value is the face, that is, the font and color spec, used by default for function names.

[see Elisp: Regex Tutorial]
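One detail worth checking about this regex: it is not anchored to word boundaries, so it also matches inside longer words. The following is a small sketch you can evaluate to see this. The test string "Sinh[x]" is a made-up example and not part of the mymath language above.

```elisp
;; string-match-p returns the index of the first match, or nil if there is none.
(string-match-p "Sin\\|Cos\\|Sum" "Sin[x]")    ; matches at index 0
(string-match-p "Sin\\|Cos\\|Sum" "Sinh[x]")   ; also matches: "Sin" is a prefix of "Sinh"

;; Wrapping the alternation in \< and \> restricts it to whole words.
(string-match-p "\\<\\(Sin\\|Cos\\|Sum\\)\\>" "Sin[x]")   ; still matches at index 0
(string-match-p "\\<\\(Sin\\|Cos\\|Sum\\)\\>" "Sinh[x]")  ; nil, no whole-word match
```

For a toy language this may not matter, but it is something to keep in mind once your language has keywords that are prefixes of other words.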

The define-derived-mode form defines your mode, named “mymath-mode”, based on fundamental-mode, which is the most basic mode.

The line (setq font-lock-defaults '(mymath-highlights)) sets up the syntax highlighting for your mode.
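To get a feel for what "based on" means, here is a minimal sketch. The mode my-demo-mode is hypothetical, and it derives from prog-mode rather than fundamental-mode only so that the parent relationship is easy to observe. Besides the mode command itself, define-derived-mode also creates a hook variable and a keymap named after the mode.

```elisp
;; my-demo-mode is a hypothetical mode, used only for this sketch.
(define-derived-mode my-demo-mode prog-mode "my-demo"
  "Hypothetical major mode derived from `prog-mode'.")

(with-temp-buffer
  (my-demo-mode)                        ; runs the parent's setup, then this mode's body
  (list major-mode                      ; the symbol my-demo-mode
        (derived-mode-p 'prog-mode)     ; non-nil: prog-mode is an ancestor
        (boundp 'my-demo-mode-hook)     ; the hook variable was created for us
        (keymapp my-demo-mode-map)))    ; and so was the mode's keymap
```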

Writing a Mode for a Language that Has Hundreds of Keywords

Typically, a language has hundreds of keywords. Elisp has a function, regexp-opt, that generates a regex from a list of keyword strings.
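As a quick sketch of what regexp-opt gives you, the following builds a regex from two of the mymath constants and tries it on a few strings. The variable name my-demo-regexp and the test string "Pixel" are assumptions made up for this illustration. The second argument 'words asks for the result to match whole words only.

```elisp
;; my-demo-regexp is a hypothetical name, used only for this sketch.
(defvar my-demo-regexp
  (regexp-opt '("Pi" "Infinity") 'words)
  "Regex matching \"Pi\" or \"Infinity\" as a whole word.")

(string-match-p my-demo-regexp "Pi^2/6")          ; matches at index 0
(string-match-p my-demo-regexp "{x,1,Infinity}")  ; matches at index 5
(string-match-p my-demo-regexp "Pixel")           ; nil, "Pi" is not a whole word here
```

Evaluating my-demo-regexp by itself shows the generated regex string, which is usually harder to read than the keyword list it came from. That is a good reason to keep the list, not the regex, in your source code.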

Suppose you are writing a mode for the Linden Scripting Language (LSL). LSL has about 553 keywords. First, here's a sample of LSL source code so you get some idea of how we want it colored.

// sample LSL file

// Examples of variable declaration and assignment:
integer score = 0;
string mySay = "i ♥ you";
vector v = <3,4,5>;
list myList= [2,4,7,3];

// Example of defining a function.
// built-in function's names start with “ll” (Linden Library).
integer sum(integer a, integer b)
{
    integer result = a + b;
    return result;
}

default
{
    state_entry()
    {
        llSay(0, mySay);
    }

    touch_start(integer total_number)
    {
        if (score == 1) {
            llSay(0, mySay);
        } else {
            llWhisper(0, "Ouch!");
        }
    }
}

Each type of keyword uses a different color.

Here's the code.

;;; mylsl-mode.el --- sample major mode for editing LSL. -*- coding: utf-8; lexical-binding: t; -*-

;; Copyright © 2017, by you

;; Author: your name ( your email )
;; Version: 2.0.13
;; Created: 26 Jun 2015
;; Keywords: languages
;; Homepage: http://ergoemacs.org/emacs/elisp_syntax_coloring.html

;; This file is not part of GNU Emacs.

;;; License:

;; You can redistribute this program and/or modify it under the terms of the GNU General Public License version 2.

;;; Commentary:

;; short description here

;; full doc on how to use here

;;; Code:

;; create the list for font-lock.
;; each category of keyword is given a particular face
(setq mylsl-font-lock-keywords
      (let* (
            ;; define several categories of keywords
            (x-keywords '("break" "default" "do" "else" "for" "if" "return" "state" "while"))
            (x-types '("float" "integer" "key" "list" "rotation" "string" "vector"))
            (x-constants '("ACTIVE" "AGENT" "ALL_SIDES" "ATTACH_BACK"))
            (x-events '("at_rot_target" "at_target" "attach"))
            (x-functions '("llAbs" "llAcos" "llAddToLandBanList" "llAddToLandPassList"))

            ;; generate regex string for each category of keywords
            (x-keywords-regexp (regexp-opt x-keywords 'words))
            (x-types-regexp (regexp-opt x-types 'words))
            (x-constants-regexp (regexp-opt x-constants 'words))
            (x-events-regexp (regexp-opt x-events 'words))
            (x-functions-regexp (regexp-opt x-functions 'words)))

        `(
          (,x-types-regexp . font-lock-type-face)
          (,x-constants-regexp . font-lock-constant-face)
          (,x-events-regexp . font-lock-builtin-face)
          (,x-functions-regexp . font-lock-function-name-face)
          (,x-keywords-regexp . font-lock-keyword-face)
          ;; note: order above matters, because once colored, that part won't change.
          ;; in general, put longer words first
          )))

;;;###autoload
(define-derived-mode mylsl-mode c-mode "lsl mode"
  "Major mode for editing LSL (Linden Scripting Language)…"

  ;; code for syntax highlighting
  (setq font-lock-defaults '((mylsl-font-lock-keywords))))

;; add the mode to the `features' list
(provide 'mylsl-mode)

;;; mylsl-mode.el ends here

Note that font-lock applies the entries of the keyword list in order, on a first-come, first-served basis. Once a sequence of characters is colored, a later entry won't change it. So, the order of your list is important. In general, put longer keywords first. This won't fix all cases where a keyword matches part of another keyword. If your language has many such keywords, you need to use other forms of keyword entries to solve the problem. See (info "(elisp) Search-based Fontification").
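The first-come, first-served behavior can be seen in a small sketch. The keywords "for" and "foreach", the variable my-demo-keywords, and the helper my-demo-face-at-start are all hypothetical, chosen only to show one keyword matching part of another. The regexes are deliberately written without word boundaries. Regexes built by regexp-opt with 'words, as in mylsl-mode, already avoid this particular overlap.

```elisp
(defvar my-demo-keywords nil
  "Hypothetical variable holding the font-lock keywords under test.")

(defun my-demo-face-at-start (keywords text)
  "Fontify TEXT using font-lock KEYWORDS; return the face of its first character.
Hypothetical helper, for illustration only."
  (with-temp-buffer
    (insert text)
    (setq my-demo-keywords keywords)
    (setq-local font-lock-defaults '(my-demo-keywords))
    (font-lock-ensure)                    ; needs Emacs 25 or later
    (get-text-property (point-min) 'face)))

;; Short keyword first: "for" gets colored, and the later "foreach" entry
;; finds part of its match already colored, so it leaves the text alone.
(my-demo-face-at-start
 '(("for" . font-lock-keyword-face)
   ("foreach" . font-lock-builtin-face))
 "foreach")

;; Long keyword first: the whole word gets the face of the "foreach" entry.
(my-demo-face-at-start
 '(("foreach" . font-lock-builtin-face)
   ("for" . font-lock-keyword-face))
 "foreach")
```

The first call should return font-lock-keyword-face and the second font-lock-builtin-face, which is why the longer keyword belongs earlier in the list.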

The `( ,a ,b …) form is Lisp's backquote syntax. It works like an ordinary quoted list, except that elements preceded by a comma are evaluated.
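A tiny sketch of the difference between quote and backquote, using made-up variables a and b:

```elisp
(let ((a 1)
      (b (+ 1 1)))
  (list
   '(a b c)      ; plain quote: nothing is evaluated
   `(,a ,b c)))  ; backquote: a and b are evaluated, c is not
;; returns ((a b c) (1 2 c))
```

This is how the mode's keyword list can contain the regex strings computed in the let* form while the face names stay as literal symbols.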

In the above, we based our mode on c-mode, because the syntax is similar. Basing your mode on a similar language's mode saves you from coding many features yourself, such as comment handling and indentation.

The line:

(provide 'mylsl-mode)

adds the symbol mylsl-mode to the list held in the variable features. [see Elisp: provide, require, features]
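Here is a short sketch of how provide interacts with features and require. The feature name my-demo-feature is made up for this illustration, and the first comment assumes a session where it has not been provided yet.

```elisp
;; my-demo-feature is a made-up feature name, used only for this sketch.
(featurep 'my-demo-feature)          ; nil, if nothing has provided it yet
(provide 'my-demo-feature)
(featurep 'my-demo-feature)          ; t afterwards
(memq 'my-demo-feature features)     ; non-nil: the symbol is now in the list
(require 'my-demo-feature)           ; already provided, so no file is loaded
```

This is what lets another file say (require 'mylsl-mode) and have Emacs load mylsl-mode.el only if it has not been loaded already.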

Now, to run the code, Alt+x eval-buffer. [see Evaluate Emacs Lisp Code]

Open the LSL language sample file given above, then Alt+x mylsl-mode. Here's the result:

[Figure: sample mylsl-mode syntax highlighting result]

Complex Syntax Coloring

For many languages, the text to be colored is not a fixed set of strings. For example, in XML, you have the <xyz>…</xyz> pattern, where the “xyz” can be anything.

[Figure: Emacs html-mode syntax coloring screenshot, 2013-07-31]
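As a rough sketch of the idea, a font-lock entry can name a regex subexpression and a face for it, so that only the tag name is colored, whatever the name is. The variable my-demo-xml-keywords and the regex are assumptions made for this illustration. They are far simpler than what a real XML or HTML mode does.

```elisp
;; Hypothetical keyword list, for illustration only.
(defvar my-demo-xml-keywords
  '(("</?\\([A-Za-z][-A-Za-z0-9_:.]*\\)" 1 font-lock-function-name-face))
  "Color the tag name (subexpression 1) of an XML-like opening or closing tag.")

(with-temp-buffer
  (insert "<xyz>text</xyz>")
  (setq-local font-lock-defaults '(my-demo-xml-keywords))
  (font-lock-ensure)                     ; needs Emacs 25 or later
  (list (get-text-property 1 'face)      ; "<" is outside subexpression 1
        (get-text-property 2 'face)      ; "x" of the opening tag name
        (get-text-property 12 'face)))   ; "x" of the closing tag name
```

The entry has the form (REGEX SUBEXP FACE), one of the forms described in (info "(elisp) Search-based Fontification").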

Font Lock Mode Basics

To handle more complex syntax coloring, continue to Elisp: Font Lock Mode Basics.
