How to Write an Emacs Major Mode for Syntax Coloring

By Xah Lee.

This page shows you how to write an Emacs major mode that does syntax coloring for your own language.

[Screenshot: emacs mymath major mode, syntax coloring your own language.]

Problem

You are writing a major mode for a new language. You want the keywords of the language to be syntax colored.

Suppose your language source code looks like this:

Sin[x]^2 + Cos[y]^2 == 1
Pi^2/6 == Sum[1/x^2,{x,1,Infinity}]

You want the words “Sin”, “Cos”, and “Sum” colored as functions, and “Pi” and “Infinity” colored as constants.

Solution

Save the following in a file.

;; a simple major mode, mymath-mode

(setq mymath-highlights
      '(("Sin\\|Cos\\|Sum" . font-lock-function-name-face)
        ("Pi\\|Infinity" . font-lock-constant-face)))

(define-derived-mode mymath-mode fundamental-mode "mymath"
  "major mode for editing mymath language code."
  (setq font-lock-defaults '(mymath-highlights)))

Now, open that file (or copy and paste the above code into a buffer), then Alt+x eval-buffer.

Now, type the following code into a new buffer:

Sin[x]^2 + Cos[y]^2 == 1
Pi^2/6 == Sum[1/x^2,{x,1,Infinity}]

Now, Alt+x mymath-mode, and you will see the words colored.

How Does it Work?

The string "Sin\\|Cos\\|Sum" is a regex. font-lock-function-name-face is a predefined variable whose value is the face (the font and color spec) used by default for function names.

The define-derived-mode form defines your mode, named “mymath-mode”, based on fundamental-mode, which is the most basic mode.

The line (setq font-lock-defaults '(mymath-highlights)) sets up the syntax highlighting for your mode.

Here's another simple example: Elisp: html6-mode.

〔►see Elisp: Regex Tutorial〕
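
If you want to check the coloring without opening a file, one possible approach is to fontify a string in a temporary buffer and read back the face text property. The following is only a sketch. The names demo-mymath-highlights, demo-mymath-mode, and demo-mymath-face-at are hypothetical and are not part of the tutorial's code; they repeat the mymath rules under a different prefix so the example is self-contained.

```elisp
;; Sketch only: the demo- names are hypothetical, chosen so they do not
;; clash with the tutorial's mymath-mode.
(defvar demo-mymath-highlights
  '(("Sin\\|Cos\\|Sum" . font-lock-function-name-face)
    ("Pi\\|Infinity" . font-lock-constant-face))
  "Font-lock rules for `demo-mymath-mode'.")

(define-derived-mode demo-mymath-mode fundamental-mode "demo-mymath"
  "Demo major mode that colors a few mymath words."
  (setq font-lock-defaults '(demo-mymath-highlights)))

(defun demo-mymath-face-at (text pos)
  "Fontify TEXT in `demo-mymath-mode' and return the face at POS.
POS counts characters from 1, like buffer positions."
  (with-temp-buffer
    (insert text)
    (demo-mymath-mode)
    ;; Force fontification now, because a temporary buffer is never displayed.
    (font-lock-ensure)
    (get-text-property pos 'face)))

(demo-mymath-face-at "Sin[x]^2 + Cos[y]^2 == 1" 1) ; face on the S of Sin
(demo-mymath-face-at "Pi^2/6" 1)                   ; face on the P of Pi
(demo-mymath-face-at "Sin[x]^2" 4)                 ; face on the [ character
```

The first call should return font-lock-function-name-face, the second font-lock-constant-face, and the third nil, because no rule matches the bracket.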

Writing a Mode for a Language that Has Hundreds of Keywords

Typically, a language has hundreds of keywords. Elisp has a function, regexp-opt, that generates a regex from a list of keywords.
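
Here is a small sketch of what regexp-opt does, before the full example. The keyword list and the names demo-keywords and demo-keywords-regexp are made up for illustration. The second argument 'words asks regexp-opt to wrap the result in word boundaries, so only whole words match.

```elisp
;; Hypothetical keyword list, for illustration only.
(defvar demo-keywords '("if" "else" "while"))

;; With the second argument 'words, the result is wrapped in \< ... \>
;; so that it matches whole words only.
(defvar demo-keywords-regexp (regexp-opt demo-keywords 'words))

(string-match-p demo-keywords-regexp "x = 1 if y") ; matches the word "if"
(string-match-p demo-keywords-regexp "elif")       ; nil, "if" is not a whole word here
(string-match-p demo-keywords-regexp "meanwhile")  ; nil, "while" is not a whole word here
```

The exact regex string that regexp-opt returns can differ between Emacs versions, so it is safer to test what it matches than to compare the string itself.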

Suppose you are writing a mode for the Linden Scripting Language (LSL). LSL has about 553 keywords. First, here's a sample of LSL source code so you get some idea of how we want it colored.

// sample LSL file

// Examples of variable declaration and assignment:
integer score = 0;
string mySay = "i ♥ you";
vector v = <3,4,5>;
list myList= [2,4,7,3];

// Example of defining a function.
// built-in function's names start with “ll” (Linden Library).
integer sum(integer a, integer b)
{
    integer result = a + b;
    return result;
}

default
{
    state_entry()
    {
        llSay(0, mySay);
    }

    touch_start(integer total_number)
    {
        if (score == 1) {
            llSay(0, mySay);
        } else {
            llWhisper(0, "Ouch!");
        }
    }
}

Each type of keyword uses a different color.

Here's the code.

;;; mylsl-mode.el --- sample major mode for editing LSL. -*- coding: utf-8; lexical-binding: t; -*-

;; Copyright © 2015, by you

;; Author: your name ( your email )
;; Version: 2.0.13
;; Created: 26 Jun 2015
;; Keywords: languages
;; Homepage: http://ergoemacs.org/emacs/elisp_syntax_coloring.html

;; This file is not part of GNU Emacs.

;;; License:

;; You can redistribute this program and/or modify it under the terms of the GNU General Public License version 2.

;;; Commentary:

;; short description here

;; full doc on how to use here


;;; Code:

;; define several categories of keywords
(setq mylsl-keywords '("break" "default" "do" "else" "for" "if" "return" "state" "while"))
(setq mylsl-types '("float" "integer" "key" "list" "rotation" "string" "vector"))
(setq mylsl-constants '("ACTIVE" "AGENT" "ALL_SIDES" "ATTACH_BACK"))
(setq mylsl-events '("at_rot_target" "at_target" "attach"))
(setq mylsl-functions '("llAbs" "llAcos" "llAddToLandBanList" "llAddToLandPassList"))

;; generate regex string for each category of keywords
(setq mylsl-keywords-regexp (regexp-opt mylsl-keywords 'words))
(setq mylsl-type-regexp (regexp-opt mylsl-types 'words))
(setq mylsl-constant-regexp (regexp-opt mylsl-constants 'words))
(setq mylsl-event-regexp (regexp-opt mylsl-events 'words))
(setq mylsl-functions-regexp (regexp-opt mylsl-functions 'words))

;; create the list for font-lock.
;; each category of keyword is given a particular face
(setq mylsl-font-lock-keywords
      `(
        (,mylsl-type-regexp . font-lock-type-face)
        (,mylsl-constant-regexp . font-lock-constant-face)
        (,mylsl-event-regexp . font-lock-builtin-face)
        (,mylsl-functions-regexp . font-lock-function-name-face)
        (,mylsl-keywords-regexp . font-lock-keyword-face)
        ;; note: order above matters, because once colored, that part won't change.
        ;; in general, longer words first
        ))

;;;###autoload
(define-derived-mode mylsl-mode c-mode "lsl mode"
  "Major mode for editing LSL (Linden Scripting Language)…"

  ;; code for syntax highlighting
  (setq font-lock-defaults '((mylsl-font-lock-keywords))))

;; clear memory. the keyword lists are no longer needed
(setq mylsl-keywords nil)
(setq mylsl-types nil)
(setq mylsl-constants nil)
(setq mylsl-events nil)
(setq mylsl-functions nil)

;; clear memory. the regex strings are no longer needed,
;; because their values were copied into mylsl-font-lock-keywords
(setq mylsl-keywords-regexp nil)
(setq mylsl-type-regexp nil)
(setq mylsl-constant-regexp nil)
(setq mylsl-event-regexp nil)
(setq mylsl-functions-regexp nil)

;; add the mode to the `features' list
(provide 'mylsl-mode)

;;; mylsl-mode.el ends here

Note that font-lock applies the rules given in font-lock-defaults on a first-come, first-served basis. Once a sequence of characters is colored, later rules won't change it. So, the order of your list is important. In general, put longer keywords first. This won't fix every case where a keyword matches part of another keyword. If your language has a lot of such keywords, you need to use other forms of font-lock rules to solve this problem. See (info "(elisp) Search-based Fontification").
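
The effect of rule order can be seen with two overlapping patterns. The following is a sketch under assumptions: the variable demo-order-rules and the function demo-face-with-rules are hypothetical helpers, and the words “Sin” and “Sinh” are made-up keywords written without word boundaries so that they overlap.

```elisp
(defvar demo-order-rules nil
  "Font-lock rules under test.
Hypothetical variable, rebound by `demo-face-with-rules'.")

(defun demo-face-with-rules (rules text pos)
  "Fontify TEXT with font-lock RULES and return the face at POS."
  (let ((demo-order-rules rules))
    (with-temp-buffer
      (insert text)
      ;; Point font-lock at the variable that holds the rules.
      (setq-local font-lock-defaults '(demo-order-rules))
      (font-lock-ensure)
      (get-text-property pos 'face))))

;; Short pattern first: "Sin" gets colored, then the "Sinh" rule is
;; skipped because part of its match is already colored.
(demo-face-with-rules
 '(("Sin" . font-lock-function-name-face)
   ("Sinh" . font-lock-keyword-face))
 "Sinh[x]" 4)

;; Long pattern first: all of "Sinh" is colored by the first rule.
(demo-face-with-rules
 '(("Sinh" . font-lock-keyword-face)
   ("Sin" . font-lock-function-name-face))
 "Sinh[x]" 4)
```

The first call should return nil (the “h” stays uncolored), and the second should return font-lock-keyword-face. Wrapping each regex in word boundaries, as regexp-opt does with 'words, avoids this particular overlap.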

The `( ,a ,b …) form is Lisp's backquote syntax. It quotes a list but lets you evaluate selected elements: inside the parentheses, any element preceded by a comma is evaluated, and the other elements are left as is.
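
A short illustration of backquote, using made-up variables a, b, and re that are not part of the mode's code:

```elisp
;; a and b are evaluated because of the comma. c is kept as a symbol.
(let ((a 1)
      (b (+ 1 1)))
  `(,a ,b c))   ; evaluates to (1 2 c)

;; The same idea builds one font-lock rule: the regex string is
;; computed, while the face name is left as a symbol.
(let ((re (regexp-opt '("float" "integer") 'words)))
  `((,re . font-lock-type-face)))
```

The second form returns a list with one pair, whose first part is the regex string and whose second part is the symbol font-lock-type-face. This is the same shape as the entries in mylsl-font-lock-keywords.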

In the above, we based our mode on c-mode, because the syntax of LSL is similar to C. Basing your mode on a similar language's mode saves you time in coding many features, such as handling comments and indentation.

The line:

(provide 'mylsl-mode)

adds the symbol mylsl-mode to the list stored in the variable features. 〔►see Elisp: What's “feature”?〕
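
You can see the effect of provide with a throwaway feature name. In this sketch, demo-mylsl-feature is a hypothetical symbol used only for illustration.

```elisp
;; Register a made-up feature, then check for it.
(provide 'demo-mylsl-feature)

(featurep 'demo-mylsl-feature)      ; t
(memq 'demo-mylsl-feature features) ; non-nil, the symbol is in the list

;; Because the feature is already provided, this does not try to load a file.
(require 'demo-mylsl-feature)
```

This is why a file that ends with provide can be loaded with require: the second time, require sees the symbol in features and does nothing.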

Now, to run the code, Alt+x eval-buffer. 〔►see How to Evaluate Emacs Lisp Code〕

Open the LSL language sample file given above, then Alt+x mylsl-mode. Here's the result:

[Screenshot: sample mylsl-mode syntax highlighting result.]

How to Name Your Major Mode

Elisp: How to Name Your Major Mode

(info "(elisp) Major Mode Conventions")

Writing Major Mode

  1. How to Write an Emacs Major Mode for Syntax Coloring
  2. Elisp: html6-mode
  3. Elisp: Font Lock Mode Basics
  4. Elisp: How to Define Face
  5. Elisp: How to Color Comment in Major Mode
  6. Elisp: How to Write Comment Command in Major Mode
  7. Elisp: How to Write Your Own Comment Command from Scratch
  8. Elisp: How to Write Keyword Completion Command
  9. Elisp: How to Create Keymap for Major Mode
  10. Elisp: Create Abbrev and Templates for Major Mode
  11. Elisp: Text Properties
  12. Elisp: Overlay Highlighting
  13. Emacs: Lookup Google, Dictionary, Documentation
  14. Elisp: Syntax Table Tutorial

  1. Elisp: How to Name Your Major Mode
  2. Elisp: What's “feature”?
  3. Elisp: require, load, load-file, autoload, feature, Explained

Syntax Table

  1. Elisp: Syntax Table Tutorial
  2. Elisp: How to Find Syntax of a Character?
  3. Elisp: How to Modify Syntax Table Temporarily
  4. Elisp: How to Determine If Cursor is Inside String or Comment
  5. Elisp: Regex Patterns and Syntax Table
  6. Elisp: Find Matching Bracket Character