Emacs Lisp: Implementing Comment Handling in a Major Mode

This page gives a practical example of writing an emacs major mode that handles programming language comments. It covers two things: ① syntax coloring of comments; ② a command to comment/uncomment code.

For an introduction to writing a major mode, first see: How to Write a Emacs Major Mode for Syntax Coloring.

Problem Description

You want to write a command to comment or uncomment code in your own major mode.

Detail

Emacs has a standard command to insert or delete comments, named comment-dwim 【Alt+;】. When there is a text selection, comment-dwim comments or uncomments the region in a smart way. If there is no text selection, comment-dwim inserts a comment at the end of the line.

comment-dwim is the standard command to comment/uncomment code, and your major mode should support it. When a user types the keyboard shortcut for comment-dwim in your language mode, they expect it to work for that language.

Solution

Sample Source Code

Let's say our new language is called “xyz”, and the following is sample source code for it. First, save it to a file named test.xyz.

-*- mode: xyz -*-

Sin[x]^2 + Cos[y]^2 == 1
Pi^2/6 == Sum[1/x^2,{x,1,Infinity}]

# Perl, Python, Bash, PHP

// C, C++, Java, JavaScript, PHP

/* C, C++, Java, JavaScript, PHP */

(* Applescript, Mathematica, Pascal, OCaml *)

The various comment lines above are there for testing purposes. Later in this tutorial, we'll modify our code to handle each comment syntax.

The first line -*- mode: xyz -*- is a quick way to tell emacs to load xyz-mode when the file is opened. This way, we don't have to manually call xyz-mode each time we open this test file.
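As an alternative to the mode line, here is a minimal sketch that associates the file extension with the mode from your init file. It assumes xyz-mode has already been defined (or is autoloaded) by the time such a file is opened; the .xyz extension is the one used for the test file in this tutorial.

```elisp
;; Open every file whose name ends in .xyz in xyz-mode.
;; Assumption: xyz-mode is defined before such a file is visited.
(add-to-list 'auto-mode-alist '("\\.xyz\\'" . xyz-mode))
```

With this in place, the -*- mode: xyz -*- line in the test file is no longer needed, though keeping it does no harm.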

Major Mode Template

One type of comment syntax starts with a comment char and ends with a newline char. For example, Perl, Python, and Bash all use “#” to start a comment. Here's a major mode template that handles this comment syntax:

;; command to comment/uncomment text
(defun xyz-comment-dwim (arg)
  "Comment or uncomment current line or region in a smart way.
For detail, see `comment-dwim'."
  (interactive "*P")
  (require 'newcomment)
  ;; temporarily tell newcomment.el which comment syntax to use
  (let ((comment-start "#") (comment-end ""))
    (comment-dwim arg)))

;; keywords for syntax coloring
(defvar xyz-keywords nil "Font lock keywords for `xyz-mode'.")
(setq xyz-keywords
      `(
        ;; 'words (not 'word) wraps the regexp in \< … \> so only whole words match
        ( ,(regexp-opt '("Sin" "Cos" "Sum") 'words) . font-lock-function-name-face)
        ( ,(regexp-opt '("Pi" "Infinity") 'words) . font-lock-constant-face)
        )
      )

;; syntax table
;; (setq is used so that reloading this file picks up changed entries)
(defvar xyz-syntax-table nil "Syntax table for `xyz-mode'.")
(setq xyz-syntax-table
      (let ((synTable (make-syntax-table)))

        ;; bash style comment: “# …”
        (modify-syntax-entry ?# "< b" synTable)
        (modify-syntax-entry ?\n "> b" synTable)

        synTable))

;; define the major mode.
;; the third argument "xyz" is the name shown in the mode line
(define-derived-mode xyz-mode fundamental-mode "xyz"
  "xyz-mode is a major mode for editing language xyz."
  :syntax-table xyz-syntax-table

  (setq font-lock-defaults '(xyz-keywords))

  ;; modify the keymap
  (define-key xyz-mode-map [remap comment-dwim] 'xyz-comment-dwim)
)

Save the above in a file and name it xyz-mode.el, and load the file by calling the command load-file.

xyz-comment-dwim is our command to comment and uncomment code. Its implementation is built on newcomment.el, which is bundled with emacs and is probably used by most language modes. It is a good idea to build on it instead of writing your own.
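As an aside, here is a sketch of a different arrangement, not the one used in this tutorial: many modes bundled with emacs set comment-start and comment-end buffer-locally in the mode body, so that the stock comment-dwim works without a wrapper command. The mode name xyz-alt-mode is hypothetical, chosen so it does not clash with xyz-mode.

```elisp
;; Hypothetical variant of the mode that configures newcomment.el directly.
(define-derived-mode xyz-alt-mode fundamental-mode "xyz-alt"
  "Sketch of a major mode whose comments start with # and end at newline."
  ;; define-derived-mode creates xyz-alt-mode-syntax-table for us.
  (modify-syntax-entry ?# "<" xyz-alt-mode-syntax-table)
  (modify-syntax-entry ?\n ">" xyz-alt-mode-syntax-table)
  ;; Buffer-local settings read by comment-dwim, comment-region, and friends.
  (setq-local comment-start "# ")
  (setq-local comment-end ""))
```

The trade-off: this variant needs no key remapping, while the wrapper approach keeps all comment settings inside one command.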

The define-key line binds xyz-comment-dwim to a key. The keymap variable “xyz-mode-map” is created automatically when you call define-derived-mode with xyz-mode as the first argument. That's why we don't need to create it ourselves; we simply call define-key to modify its content.

[remap comment-dwim] is a special key syntax for define-key that tells emacs to use whatever key is currently bound to comment-dwim. This way, the key stays the same as comment-dwim's even if the user has rebound it to some other key.
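To see how the remapping is stored, here is a small sketch using a throwaway keymap. The name xyz-demo-map is hypothetical and stands in for xyz-mode-map.

```elisp
;; A throwaway keymap standing in for xyz-mode-map (hypothetical name).
(defvar xyz-demo-map (make-sparse-keymap)
  "Keymap used only for this illustration.")

;; While this keymap is active, any key that runs comment-dwim
;; runs xyz-comment-dwim instead.
(define-key xyz-demo-map [remap comment-dwim] 'xyz-comment-dwim)

;; The remapping lives under the pseudo-key [remap comment-dwim].
(lookup-key xyz-demo-map [remap comment-dwim]) ; returns the symbol xyz-comment-dwim
```

Note that no actual key such as Alt+; appears anywhere in the keymap; emacs resolves the key at lookup time.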

We create a syntax table “xyz-syntax-table” for our mode. The modify-syntax-entry lines give the “#” and “\n” chars the syntax of comment start and comment end. Once the comment chars have the correct syntax table entries, comments are syntax colored automatically. (That is, you don't need to write any other code to color comments.)

For detail, see: (info "(elisp) Syntax Tables")
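One way to check that the entries do what we want, without opening a file, is to ask emacs's parser directly. The following is a minimal sketch; the helper name xyz-demo-comment-states is hypothetical and the sample text is made up for the test.

```elisp
(defun xyz-demo-comment-states ()
  "Return in-comment flags after \"note\" and at the end of a sample text.
Hypothetical helper, not part of xyz-mode."
  (let ((table (make-syntax-table)))
    ;; Same entries as in the template: # starts a comment, newline ends it.
    (modify-syntax-entry ?# "< b" table)
    (modify-syntax-entry ?\n "> b" table)
    (with-temp-buffer
      (set-syntax-table table)
      (insert "Pi # note\nSin")
      (goto-char (point-min))
      (search-forward "note")
      (list
       ;; Element 4 of the parse state is non-nil inside a comment.
       (and (nth 4 (syntax-ppss)) t)
       ;; The end of the buffer is past the newline, so the comment has ended.
       (and (nth 4 (syntax-ppss (point-max))) t)))))

(xyz-demo-comment-states) ; evaluates to (t nil)
```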

Testing

Now, open the sample source code file test.xyz, call xyz-mode, and the code will be highlighted like this:

Sin[x]^2 + Cos[y]^2 == 1
Pi^2/6 == Sum[1/x^2,{x,1,Infinity}]

# perl, python, bash

Now call xyz-comment-dwim on a line, and you'll see that it works. Also try it on a text selection.
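If you prefer to test from code rather than by hand, here is a rough sketch that comments and then uncomments a string in a temporary buffer, using the same comment-start and comment-end bindings as xyz-comment-dwim. The helper name xyz-demo-comment-roundtrip is hypothetical.

```elisp
(require 'newcomment)

(defun xyz-demo-comment-roundtrip (text)
  "Return a list (COMMENTED RESTORED) for TEXT.
Hypothetical helper for testing, not part of xyz-mode."
  (let ((table (make-syntax-table)))
    (modify-syntax-entry ?# "< b" table)
    (modify-syntax-entry ?\n "> b" table)
    (with-temp-buffer
      (set-syntax-table table)
      (insert text)
      ;; Same bindings as in xyz-comment-dwim.
      (let ((comment-start "#") (comment-end "")
            commented)
        ;; comment-region is what comment-dwim calls for an uncommented region.
        (comment-region (point-min) (point-max))
        (setq commented (buffer-string))
        (uncomment-region (point-min) (point-max))
        (list commented (buffer-string))))))

(xyz-demo-comment-roundtrip "Sin[x]^2 + Cos[y]^2 == 1\n")
```

The first element of the result should start with “#”, and the second should be the original text again.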

C++ Style Comments

To support C++ style comments // …, change the syntax entry lines as follows:

;; C++ style comment “// …” 
  (modify-syntax-entry ?\/ ". 12b" synTable)
  (modify-syntax-entry ?\n "> b" synTable)

and also change the code in xyz-comment-dwim to:

(comment-start "//") (comment-end "")

Java Style Comments

To support Java style comments /* … */, change the syntax entry lines as follows:

;; define comment for this style: “/* … */” 
  (modify-syntax-entry ?\/ ". 14" synTable)
  (modify-syntax-entry ?* ". 23" synTable)

and also change the code in xyz-comment-dwim to:

(comment-start "/*") (comment-end "*/")
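A language that has both // … and /* … */ comments can put the two styles in one syntax table. The sketch below combines the flags in the way the Emacs Lisp manual describes for C++; the helper name xyz-demo-c-comment-states and the sample text are hypothetical.

```elisp
(defun xyz-demo-c-comment-states ()
  "Return in-comment flags after x, b, y and c in a two-line sample.
Hypothetical helper, not part of xyz-mode."
  (let ((table (make-syntax-table))
        (flags nil))
    ;; / is the 1st char of both openers, the 2nd char of // (style b),
    ;; and the last char of */.
    (modify-syntax-entry ?/ ". 124b" table)
    ;; * is the 2nd char of /* and the 1st char of */.
    (modify-syntax-entry ?* ". 23" table)
    ;; Newline ends only the style b comment, that is, //.
    (modify-syntax-entry ?\n "> b" table)
    (with-temp-buffer
      (set-syntax-table table)
      (insert "a // x\nb /* y */ c")
      (goto-char (point-min))
      (dolist (word '("x" "b" "y" "c"))
        (search-forward word)
        (push (and (nth 4 (syntax-ppss)) t) flags)))
    (nreverse flags)))

(xyz-demo-c-comment-states) ; evaluates to (t nil t nil)
```

The wrapper command can still set only one comment-start and comment-end pair, so you have to pick which style the command inserts.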

Mathematica Style Comments

To support Mathematica or Pascal style comments (* … *), change the syntax entry lines as follows:

;; Mathematica style comment: “(* … *)”
  ;; "()1" and ")(4" keep the parens as a matching pair;
  ;; ". 1" and ". 4" would turn them into plain punctuation
  (modify-syntax-entry ?\( "()1" synTable)
  (modify-syntax-entry ?\) ")(4" synTable)
  (modify-syntax-entry ?* ". 23" synTable)

and also change the code in xyz-comment-dwim to:

(comment-start "(*") (comment-end "*)")
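Some languages with this comment style, OCaml for example, allow comments to nest. The syntax table has an “n” flag for that. The following is a minimal sketch under that assumption; the helper name xyz-demo-nested-comment-states is hypothetical, and the paren entries use the "()1" and ")(4" descriptors so that parens keep their matching-pair syntax.

```elisp
(defun xyz-demo-nested-comment-states ()
  "Return in-comment flags after \"still\" and \"code\" in a nested sample.
Hypothetical helper, not part of xyz-mode."
  (let ((table (make-syntax-table))
        (flags nil))
    ;; The n flag marks the (* … *) comment as nestable.
    (modify-syntax-entry ?\( "()1n" table)
    (modify-syntax-entry ?\) ")(4n" table)
    (modify-syntax-entry ?* ". 23n" table)
    (with-temp-buffer
      (set-syntax-table table)
      (insert "(* outer (* inner *) still *) code")
      (goto-char (point-min))
      (dolist (word '("still" "code"))
        (search-forward word)
        (push (and (nth 4 (syntax-ppss)) t) flags)))
    (nreverse flags)))

(xyz-demo-nested-comment-states) ; evaluates to (t nil)
```

Without the “n” flag, the first *) would close the comment and “still” would be treated as code.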

Summary

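There are 2 issues with comments: ① syntax coloring of comments; ② a command that comments/uncomments code.

For syntax coloring: give the comment chars the right entries in your mode's syntax table, using modify-syntax-entry. Emacs then colors comments automatically.

For comment command: write a wrapper command that sets comment-start and comment-end and then calls comment-dwim, and bind it with [remap comment-dwim].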

If you don't like the behavior of comment-dwim, or you don't want your comment command based on it, you can write your own. See: Comment Command from Scratch.
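To give a rough idea of what “your own” could look like, here is a minimal sketch of a command that toggles a “#” comment on the current line without using newcomment.el. The name xyz-demo-toggle-line-comment is hypothetical, and the sketch ignores regions and any comment style other than “#”.

```elisp
(defun xyz-demo-toggle-line-comment ()
  "Add or remove a leading \"# \" on the current line.
Hypothetical from-scratch sketch; it does not use newcomment.el."
  (interactive "*")
  (save-excursion
    (back-to-indentation)
    (if (looking-at "# ?")
        ;; Line is already commented: delete the marker and one space.
        (delete-region (match-beginning 0) (match-end 0))
      ;; Otherwise insert the marker after the indentation.
      (insert "# "))))
```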

Complex Comment Syntax

Emacs's syntax table only supports the comment syntaxes used in mainstream languages. Here are the comment syntax types it supports:

Emacs Syntax Table Support of Comment Syntax Types

Example                                              Syntax Type
# …\n (Python, Perl, PHP, Bash, shells)              Start with a char to newline char.
; …\n (lisp)                                         Start with a char to newline char.
' …\n (Visual Basic)                                 Start with a char to newline char.
// … \n (C, C++, C#, Java, JavaScript, PHP)          Start with 2 identical chars to newline char.
(* … *) (Mathematica, Pascal, OCaml, Applescript)    A matching pair chars with another char.
{- … -} (Haskell)                                    A matching pair chars with another char.
/* … */ (C, C++, C#, Java, JavaScript)               Two chars used in an ad hoc way as matching pair.

If your language's comment syntax is not one of the above, emacs's syntax table cannot capture it. You need to use emacs's syntax coloring mechanisms to color comments like any other syntax. See: How to Write a Emacs Major Mode for Syntax Coloring. You also need to write your own command to comment/uncomment code. See: Comment Command from Scratch.
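As an illustration of the coloring half, here is a sketch for a made-up comment syntax that a syntax table cannot express: a line whose first word is REM. Both the syntax and the names xyz-demo-rem-comment-regexp and xyz-demo-rem-keywords are hypothetical.

```elisp
;; Hypothetical comment syntax: a line whose first word is REM.
(defconst xyz-demo-rem-comment-regexp "^[ \t]*REM\\b.*$"
  "Regexp matching a whole REM comment line (illustrative only).")

(defvar xyz-demo-rem-keywords
  ;; Subexp 0 (the whole match) gets the comment face;
  ;; the trailing t lets it override faces applied by earlier rules.
  `((,xyz-demo-rem-comment-regexp 0 'font-lock-comment-face t))
  "Font lock keywords for the hypothetical REM comments.")
```

In a mode body, this list would go into font-lock-defaults, the same way xyz-keywords does in the template.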

Thanks to Daniel for the correction.
