Emacs Lisp: Automatic Code Formatting, Auto Indentation


For lisp languages, it would be nice if a programmer could press a button in emacs and have the current code block reformatted by a simple lexical analysis (similar to how fill-paragraph works on text).

I think it is not hard to write this command, but to my surprise, it has not been done. I was told by the Scheme expert Taylor R Campbell (aka Riastradh, author of paredit mode) that this is non-trivial, but I couldn't believe it. Maybe he misunderstood what I wanted from this command.

Here's an outline of how this would work.

Simply count the levels of nesting of parens. For example, consider this lisp code:

(defun previous-user-buffer ()
  "Switch to the next user buffer in cyclic order."
  (interactive)
  (previous-buffer)
  (let ((i 0))
    (while (and (string-match "^*" (buffer-name)) (< i 10))
      (setq i (1+ i)) (previous-buffer) )))

Each left paren has a level of nesting: n=0, n=1, n=2, and so on. The simplest version of auto-format starts a new line at each left paren and indents that line by the paren's nesting level n. So, the above code would be formatted like this (using 1 space per level of indent in this example):

(defun previous-user-buffer
 () "Switch to the next user buffer in cyclic order."
 (interactive)
 (previous-buffer)
 (let
  (
   (i 0))
  (while
   (and
    (string-match "^*"
     (buffer-name))
    (< i 10))
   (setq i
    (1+ i))
   (previous-buffer))))
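The rule of starting a new line at each left paren can be sketched in a few lines of Emacs Lisp. This is only a rough sketch working on a string rather than a buffer: the names autofmt-tokenize and autofmt-simple are made up for illustration, and the tokenizer assumes balanced parens and knows nothing about comments, character literals such as ?\(, or reader prefixes like ' and ,@.

```elisp
;; Hypothetical helper: names are illustrative, not an existing package.
(defun autofmt-tokenize (code)
  "Split the string CODE into paren, string-literal, and atom tokens."
  (let ((pos 0)
        (tokens '())
        ;; Alternatives: one paren | a double-quoted string | a run of other chars.
        (re "[()]\\|\"\\(?:[^\"\\\\]\\|\\\\.\\)*\"\\|[^()\" \t\n]+"))
    (while (string-match re code pos)
      (push (match-string 0 code) tokens)
      (setq pos (match-end 0)))
    (nreverse tokens)))

(defun autofmt-simple (code)
  "Return CODE with every left paren on its own line, indented by nesting level."
  (let ((depth 0)
        (out ""))
    (dolist (tok (autofmt-tokenize code))
      (cond
       ((string= tok "(")
        ;; New line (except at the very start), then DEPTH spaces of indent.
        (setq out (concat out
                          (if (string= out "") "" "\n")
                          (make-string depth ?\s)
                          "("))
        (setq depth (1+ depth)))
       ((string= tok ")")
        (setq out (concat out ")"))
        (setq depth (1- depth)))
       (t
        ;; Atoms and strings stay on the current line.
        (setq out (concat out
                          (if (string-suffix-p "(" out) "" " ")
                          tok)))))
    out))

;; Example: (autofmt-simple "(setq i (1+ i))")
;; returns the two lines
;;   (setq i
;;    (1+ i))
```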

Now, breaking at every left paren probably produces too many short lines compared to how lisp code is traditionally formatted. We can modify the auto-format heuristics to reduce short lines: if a complete unit of expression is less than 70 chars, then render the whole expression on one line.

Here's how the code would look with this rule:

;23456789 123456789 123456789 123456789 123456789 123456789 123456789

(defun previous-user-buffer
 () "Switch to the next user buffer in cyclic order."
 (interactive)
 (previous-buffer)
 (let
  ((i 0))
  (while
   (and (string-match "^*" (buffer-name)) (< i 10))
   (setq i (1+ i))
   (previous-buffer))))

… looks much better. I don't know how well this would work for more complex code, but I think the idea is there. The heuristics might also need special rules for the doc string and for non-regular lisp syntax (such as anything involving the special chars “ ' ` # ,@ . ;”). However, such special rules should be kept as minimal as possible.
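The 70-char rule needs a little more than a running paren count, because the formatter must know the length of a whole expression before deciding whether to break it. Here is a rough sketch, again on strings: parse the tokens into nested lists, then render each list on one line if it fits, and otherwise give each sublist its own line. All the autofmt- names are made up for illustration, the width counts only the expression itself (not its indent), and none of the special syntax mentioned above (comments, ' ` ,@ and so on) is handled.

```elisp
;; Hypothetical helpers: names are illustrative, not an existing package.
(defun autofmt-tokenize (code)
  "Split the string CODE into paren, string-literal, and atom tokens."
  (let ((pos 0)
        (tokens '())
        (re "[()]\\|\"\\(?:[^\"\\\\]\\|\\\\.\\)*\"\\|[^()\" \t\n]+"))
    (while (string-match re code pos)
      (push (match-string 0 code) tokens)
      (setq pos (match-end 0)))
    (nreverse tokens)))

(defun autofmt-parse (tokens)
  "Turn the flat list TOKENS into nested lists.  Assumes balanced parens.
Atoms stay as strings; an empty list is nil."
  (let ((stack (list nil)))             ; innermost unfinished list first
    (dolist (tok tokens)
      (cond
       ((string= tok "(") (push nil stack))
       ((string= tok ")")
        (let ((done (nreverse (pop stack))))
          (setcar stack (cons done (car stack)))))
       (t (setcar stack (cons tok (car stack))))))
    (nreverse (car stack))))

(defun autofmt-flat (node)
  "Render NODE on a single line."
  (if (stringp node)
      node
    (concat "(" (mapconcat #'autofmt-flat node " ") ")")))

(defun autofmt-render (node depth width)
  "Render NODE at nesting DEPTH.
A list whose one-line form is shorter than WIDTH stays on one line."
  (let ((flat (autofmt-flat node)))
    (if (or (stringp node) (< (length flat) width))
        flat
      ;; Too long: atoms stay on the current line, sublists get a new line.
      (let ((out "("))
        (dolist (child node)
          (setq out
                (concat out
                        (cond ((not (stringp child))
                               (concat "\n" (make-string (1+ depth) ?\s)))
                              ((string= out "(") "")
                              (t " "))
                        (autofmt-render child (1+ depth) width))))
        (concat out ")")))))

(defun autofmt-fit (code &optional width)
  "Format every top-level form in CODE, using WIDTH (default 70) as the limit."
  (mapconcat (lambda (form) (autofmt-render form 0 (or width 70)))
             (autofmt-parse (autofmt-tokenize code))
             "\n"))

;; Example with a small width so the effect is visible:
;; (autofmt-fit "(while (< i 10) (setq i (1+ i)))" 20) returns the three lines
;;   (while
;;    (< i 10)
;;    (setq i (1+ i)))
```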

On the whole, a simple formatting by lexical analysis is not going to be as pretty as manual formatting. However, it is my opinion that if lispers adapt to such a uniform, simple, machine-produced auto-formatting, the impact on the lisp community as a whole will be tremendous. It would get rid of the “source code formatting style” literature and debates for good, because all coders would be accustomed to this machine-produced, uniform style from the time they begin to learn lisp. (Each coder can set some personal preferences for the auto-formatter if she so wishes, and re-format entire source code on the fly.) Once a language's source code is presented in a uniform style universally, that would fundamentally influence the idioms and program constructs lisp coders actually produce. (This is an advantage the Python language offers transparently.)

See also the section “Automatic, Uniform, Universal, Source Code Display” at Fundamental Problems of Lisp, for the relation between a regular syntax and source code formatting.
