Emacs Lisp: a Function That Works on String or Region

By Xah Lee. Date:

This article shows you how to write an elisp text-transformation function that can be used in 2 ways: ① change text in a buffer region; ② take a string argument and return a string.

Emacs lisp level: advanced.

Problem

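For a function that transforms text, find a way to code it so that one function serves both uses: it can change the text in a buffer region, and it can take a string and return a string.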

For example, suppose you have a command remove-vowel that works on a region, but you also want a version “remove-vowel-in-string” which just takes a string input and returns a string. The string version is very convenient in lisp code. But i don't want to keep 2 functions. I want just one single function.

Detail

Been coding elisp for 5 years now, perhaps about 2 hours a day. I have ~30 commands that do text transformation on text under cursor.

In the past year, i found that i often need 2 versions of a function: one version that works in a buffer, and another that simply works on a string. The string version is very convenient and simple when used in elisp code.

This is becoming a problem, because for every text processing function i seem to need to write and maintain 2 versions. For example, let's say i have a function named remove-vowel that changes “something” to “smthng”. Typically, i'd write a “remove-vowel-in-string” that takes a string as argument and outputs a string. Then i'd write another version, remove-vowel, that is an interface wrapper and calls “remove-vowel-in-string” to do the actual work.
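The two-function pattern described above can be sketched roughly as follows. This is only an illustration: the `my-` names are hypothetical, and the bodies are an assumption about how such a pair would typically be written, not code from the original commands.

```elisp
;; Sketch of the two-function pattern: a string function plus a wrapper.
;; All `my-' names are hypothetical, chosen for this example only.

(defun my-remove-vowel-in-string (string)
  "Return a copy of STRING with the letters a e i o u removed."
  (let ((case-fold-search t))           ; also match uppercase vowels
    (replace-regexp-in-string "[aeiou]" "" string)))

(defun my-remove-vowel-region (from to)
  "Remove vowels in the buffer text between positions FROM and TO.
This is only an interface wrapper around `my-remove-vowel-in-string'."
  (interactive "r")
  (let ((new-text (my-remove-vowel-in-string
                   (buffer-substring-no-properties from to))))
    (save-excursion
      (delete-region from to)
      (goto-char from)
      (insert new-text))))
```

Every text-transformation command written this way needs such a pair, which is the maintenance cost the rest of this article tries to avoid.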

Having 2 versions of every function is becoming annoying. So, today i thought about it and came up with a solution.

Solution

The solution is this: The function would take 1 argument, and 2 more optional arguments, like this:

(defun remove-vowel (ξstring &optional ξfrom ξto) …)

When remove-vowel is called interactively, simply feed the function {nil, ξfrom, ξto}.

This way, the function can be used as a string manipulation function, or it can be used as a buffer text changing function, with no penalties or inefficiencies i can think of. Here's how it's done using remove-vowel as example:

(defun remove-vowel (ξstring &optional ξfrom ξto)
  "Remove the following letters: {a e i o u}.

When called interactively, work on current paragraph or text selection.

When called in lisp code, if ξstring is non-nil, return a changed string.
If ξstring is nil, change the text in the region between positions ξfrom ξto."
  (interactive
   (if (use-region-p)
       (list nil (region-beginning) (region-end))
     (let ((bds (bounds-of-thing-at-point 'paragraph)) )
       (list nil (car bds) (cdr bds)) ) ) )

  (let (workOnStringP inputStr outputStr)
    (setq workOnStringP (if ξstring t nil))
    (setq inputStr (if workOnStringP ξstring (buffer-substring-no-properties ξfrom ξto)))
    (setq outputStr
          (let ((case-fold-search t))
            (replace-regexp-in-string "a\\|e\\|i\\|o\\|u" "" inputStr) )  )

    (if workOnStringP
        outputStr
      (save-excursion
        (delete-region ξfrom ξto)
        (goto-char ξfrom)
        (insert outputStr) )) ) )

The meat of this function is just (replace-regexp-in-string "a\\|e\\|i\\|o\\|u" "" inputStr). But let's see how the input/output is done.

Use of (interactive)

The interactive form is a declaration that tells emacs how arguments are passed to the function when it is called interactively. For example, an argument can come from a prompt in the minibuffer, or from universal-argument 【Ctrl+u】. It also tells emacs how to interpret the input: as a string, a number, a buffer name, a file name, etc.

When a function has (interactive) (usually placed right after the doc string), it means the function is a command (i.e. it can be called by execute-extended-command 【Alt+x】).

When a function has (interactive "r"), emacs will take the {beginning, ending} positions of the region and feed them to the function as the first 2 arguments. The "r" is called the “interactive code”. See: (info "(elisp) Interactive Codes").

Normally, the argument to interactive is a string, but it can also be any other lisp expression. When it is a lisp expression, the return value of the expression must be a list, and the items of that list are fed to the function as arguments.
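To make the difference concrete, here is a small sketch of two commands that receive the same arguments, one through the "r" interactive code and one through a lisp expression. Both command names are hypothetical and exist only for this comparison.

```elisp
;; Two hypothetical commands comparing the two forms of `interactive'.

(defun my-count-chars-r (from to)
  "Return the number of characters between positions FROM and TO.
Interactively, the \"r\" code supplies the region bounds."
  (interactive "r")
  (let ((n (- to from)))
    (message "%d characters" n)
    n))

(defun my-count-chars-expr (from to)
  "Same as `my-count-chars-r', but the bounds come from a lisp expression.
The expression must evaluate to a list with one item per argument."
  (interactive (list (region-beginning) (region-end)))
  (let ((n (- to from)))
    (message "%d characters" n)
    n))
```

The lisp expression form is more work for this simple case, but it can run any code to decide what the arguments should be, which the string form cannot.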

So, in our case of remove-vowel, our argument to interactive is a lisp expression that returns a list of 3 items. Like this:

(defun remove-vowel (ξstring &optional ξfrom ξto)
 "…"
 (interactive
    (if (use-region-p)
        (list nil (region-beginning) (region-end))
      (let ((bds (bounds-of-thing-at-point 'paragraph)) )
        (list nil (car bds) (cdr bds)) ) ) )
…
)

If there's a text selection (region is active), it sets “ξstring” to nil and {ξfrom, ξto} to region {begin, end} positions.

If there's no text selection (region is not active), it sets “ξstring” to nil and {ξfrom, ξto} to paragraph's {begin, end} positions.

In both cases, the “ξstring” is set to nil, so the function will work on the region text.

(See: Using thing-at-point; What's Region, Active Region, transient-mark-mode?)
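The expression inside interactive is ordinary lisp, so it can be pulled out into a helper and evaluated on its own to see the list it produces. The sketch below does that under a hypothetical name. It also adds one guard that the code above does not have: it assumes bounds-of-thing-at-point may return nil when no paragraph is found at point, and signals an error in that case.

```elisp
(require 'thingatpt)  ; provides `bounds-of-thing-at-point'

;; Hypothetical helper: builds the argument list (nil FROM TO).
(defun my-region-or-paragraph-args ()
  "Return a list (nil FROM TO) for the active region or current paragraph."
  (if (use-region-p)
      (list nil (region-beginning) (region-end))
    (let ((bds (bounds-of-thing-at-point 'paragraph)))
      ;; Assumption: BDS can be nil, for example in an empty buffer.
      (unless bds
        (user-error "No paragraph at point"))
      (list nil (car bds) (cdr bds)))))

;; Evaluate it in a scratch buffer to inspect the list it builds.
(with-temp-buffer
  (insert "one two\nthree")
  (goto-char 3)
  (my-region-or-paragraph-args))
```

A command could then use (interactive (my-region-or-paragraph-args)) in place of the inline expression.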

Rest of Code

The above takes care of interactive use of the function.

Now, remember that our function takes 3 arguments: {ξstring, ξfrom, ξto}. The {ξfrom, ξto} are optional. When “ξstring” is given (i.e. not nil), the function will take that as input and return a string. Otherwise, it takes {ξfrom, ξto} as region positions and transforms the text in the buffer.

For clarity, first we set “workOnStringP”:

(setq workOnStringP (if ξstring t nil))

then we set the “inputStr” like this:

(setq inputStr (if workOnStringP ξstring (buffer-substring-no-properties ξfrom ξto)))

Now, it works on the string, like this:

(setq outputStr
 (let ((case-fold-search t))
  (replace-regexp-in-string "a\\|e\\|i\\|o\\|u" "" inputStr) ) )

Then, it either returns the outputStr or changes the region in the buffer, depending on whether “workOnStringP” is true, like this:

(if workOnStringP
        outputStr
      (save-excursion
        (delete-region ξfrom ξto)
        (goto-char ξfrom)
        (insert outputStr) ))
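To see both call styles side by side, here is a condensed restatement of the function followed by one call of each kind. It is a sketch, not a replacement for the code above: the name my-remove-vowel is hypothetical, the argument names are plain ASCII instead of the ξ names, and the regex is written as the character class "[aeiou]", which i assume matches the same letters as the alternation.

```elisp
(require 'thingatpt)  ; provides `bounds-of-thing-at-point'

;; Hypothetical condensed version of the dual-mode function.
(defun my-remove-vowel (string &optional from to)
  "Remove the letters a e i o u.
If STRING is non-nil, return a changed copy of it.
If STRING is nil, change the buffer text between FROM and TO."
  (interactive
   (if (use-region-p)
       (list nil (region-beginning) (region-end))
     (let ((bds (bounds-of-thing-at-point 'paragraph)))
       (list nil (car bds) (cdr bds)))))
  (let* ((input (or string (buffer-substring-no-properties from to)))
         (output (let ((case-fold-search t))
                   (replace-regexp-in-string "[aeiou]" "" input))))
    (if string
        output                      ; string mode: just return the result
      (save-excursion               ; region mode: replace text in place
        (delete-region from to)
        (goto-char from)
        (insert output)))))

;; String mode: pass a string, get a string back.
(my-remove-vowel "something")       ; returns "smthng"

;; Region mode: pass nil and two positions, the buffer is changed.
(with-temp-buffer
  (insert "something else")
  (my-remove-vowel nil (point-min) (point-max))
  (buffer-string))                  ; returns "smthng ls"
```

Note that an empty string "" is non-nil in elisp, so passing "" still selects string mode and returns "".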

The weird ξ you see in my elisp code is the Greek letter xi. I use a Unicode char in symbol names for easy distinction from builtin symbols. You can just ignore it. 〔►see Programing Style: Variable Naming: English Words Considered Harmful〕
