This article shows you how to write an elisp text-transformation function that can be used in 2 ways: ① change text in a buffer region. ② take a string argument and return a string.
Emacs lisp level: advanced.
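For a function that transforms text, find a way to code it so that one function can both change the text in a buffer region and, when given a string, return the changed string.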
For example, suppose you have a command remove-vowel that works on a region, but you also want a version “remove-vowel-in-string” which just takes a string input and returns a string. The string version is very convenient in lisp code. But i don't want to keep 2 functions. I want just one single function.
Been coding elisp for 5 years now, perhaps about 2 hours a day. I have ≈30 commands that do text transformation on text under cursor. For example: changing a URL into an HTML link ◆ changing the filename under cursor into an HTML image link ◆ asciify-string ◆ transform date format ◆ changing a region into a standard citation format ◆ compact-css-region ◆ change source code text to syntax colored HTML, … etc.
In the past year, i've found that i often need 2 versions of a function. One version works in a buffer, while the other simply works on a string. The string version is very convenient and simple when used in elisp code.
This is becoming a problem, because for every text processing function i seem to need to write and maintain 2 versions. For example, let's say i have a function named remove-vowel that changes “something” to “smthng”. Typically, i'd write a “remove-vowel-in-string” that takes a string as argument and outputs a string. Then i write another version, remove-vowel, that is an interface wrapper and calls “remove-vowel-in-string” to do the actual work.
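To make the problem concrete, here is a rough sketch of that two-function pattern. The names my-remove-vowel-in-string and my-remove-vowel-region are made up for illustration, and the bodies are only my guess at what such a pair typically looks like.

```elisp
;; Sketch of the two-function pattern. Both names are hypothetical.

(defun my-remove-vowel-in-string (string)
  "Return a copy of STRING with the letters a e i o u removed."
  (let ((case-fold-search t))
    (replace-regexp-in-string "[aeiou]" "" string)))

(defun my-remove-vowel-region (from to)
  "Remove vowels from the buffer text between positions FROM and TO.
This is only a wrapper: the real work is done by
`my-remove-vowel-in-string'."
  (interactive "r")
  (let ((new-text (my-remove-vowel-in-string
                   (buffer-substring-no-properties from to))))
    (save-excursion
      (delete-region from to)
      (goto-char from)
      (insert new-text))))
```

Every new transformation needs another pair like this, which is the maintenance cost described above.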
Having 2 versions of every function is becoming annoying. So, today i thought about it and came up with a solution.
The solution is this: The function would take 1 argument, and 2 more optional arguments, like this:
(defun remove-vowel (ξstring &optional ξfrom ξto) …)
When remove-vowel is called interactively, simply feed the function {nil, ξfrom, ξto}.
This way, the function can be used as a string manipulation function, or it can be used as a buffer text changing function, with no penalties or inefficiencies i can think of. Here's how it's done using remove-vowel as example:
(defun remove-vowel (ξstring &optional ξfrom ξto)
  "Remove the following letters: {a e i o u}.
When called interactively, work on current paragraph or text selection.
When called in lisp code, if ξstring is non-nil, returns a changed string.
If ξstring is nil, change the text in the region between positions ξfrom ξto."
  (interactive
   (if (region-active-p)
       (list nil (region-beginning) (region-end))
     (let ((bds (bounds-of-thing-at-point 'paragraph)))
       (list nil (car bds) (cdr bds)))))
  (let (workOnStringP inputStr outputStr)
    (setq workOnStringP (if ξstring t nil))
    (setq inputStr (if workOnStringP ξstring (buffer-substring-no-properties ξfrom ξto)))
    (setq outputStr
          (let ((case-fold-search t))
            (replace-regexp-in-string "a\\|e\\|i\\|o\\|u" "" inputStr)))
    (if workOnStringP
        outputStr
      (save-excursion
        (delete-region ξfrom ξto)
        (goto-char ξfrom)
        (insert outputStr)))))
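Here's a small sketch of the two ways of calling such a function from lisp code. So that the block can be evaluated on its own, it starts with a condensed copy of the function under the made-up name my-remove-vowel, with plain argument names and a character class in place of the alternation. Treat it as an illustration, not as a replacement for the code above.

```elisp
;; Condensed copy of the function, under a hypothetical name.
(defun my-remove-vowel (string &optional from to)
  "Remove the letters a e i o u.
If STRING is non-nil, return a changed string.
If STRING is nil, change the buffer text between FROM and TO."
  (interactive
   (if (region-active-p)
       (list nil (region-beginning) (region-end))
     (let ((bds (bounds-of-thing-at-point 'paragraph)))
       (list nil (car bds) (cdr bds)))))
  (let* ((input (or string (buffer-substring-no-properties from to)))
         (output (let ((case-fold-search t))
                   (replace-regexp-in-string "[aeiou]" "" input))))
    (if string
        output
      (save-excursion
        (delete-region from to)
        (goto-char from)
        (insert output)))))

;; Use 1: string in, string out. No buffer is touched.
(my-remove-vowel "something")   ; returns "smthng"

;; Use 2: STRING is nil, so the text between two positions is changed.
(with-temp-buffer
  (insert "hello world")
  (my-remove-vowel nil (point-min) (point-max))
  (buffer-string))              ; returns "hll wrld"
```

Use 3 is interactive: calling the command with 【Alt+x】 runs the interactive form, which supplies nil and the two positions.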
The meat of this function is just
(replace-regexp-in-string "a\\|e\\|i\\|o\\|u" "" inputStr).
But let's see how the input/output is done.
The interactive is a declaration that lets emacs know how arguments are passed to the function when it is used interactively. For example, it can be user input from a prompt in minibuffer, or from universal-argument 【Ctrl+u】. Or, how to interpret the input, as a string, number, a buffer name, file name, etc.
When a function has (interactive) (usually placed right after the doc string), it means the function is a command (i.e. it can be called by execute-extended-command 【Alt+x】).
When a function has (interactive "r"), then emacs will take the {beginning, ending} cursor positions of a region and feed them to the function as the first 2 arguments. The "r" is called the “interactive code”.
See: (info "(elisp) Interactive Codes").
Normally, the argument to interactive is a string, but it can also be any other lisp expression. When it is a lisp expression, the return value of the expression must be a list, and the items are fed to the function as arguments.
So, in our case of remove-vowel, our argument to interactive is a lisp expression that returns a list of 3 items. Like this:
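As a small illustration of the two forms, here are two made-up commands, my-region-length and my-region-length-2. They are not from any library. They receive the same arguments, once through the "r" code and once through a lisp expression that builds the argument list.

```elisp
;; Hypothetical commands, for illustration only.

;; String form: the "r" code passes the region's start and end positions.
(defun my-region-length (from to)
  "Return the number of characters between FROM and TO."
  (interactive "r")
  (- to from))

;; Expression form: the expression must return a list, and the
;; items of that list become the arguments, in order.
(defun my-region-length-2 (from to)
  "Same as `my-region-length', with the argument list built by lisp code."
  (interactive (list (region-beginning) (region-end)))
  (- to from))
```

The expression form costs a few more characters, but it can run any code to decide the arguments, which is what the remove-vowel function relies on.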
(defun remove-vowel (ξstring &optional ξfrom ξto)
  "…"
  (interactive
   (if (region-active-p)
       (list nil (region-beginning) (region-end))
     (let ((bds (bounds-of-thing-at-point 'paragraph)))
       (list nil (car bds) (cdr bds)))))
  … )
If there's a text selection (region is active), it sets “ξstring” to nil and {ξfrom, ξto} to region {begin, end} positions.
If there's no text selection (region is not active), it sets “ξstring” to nil and {ξfrom, ξto} to paragraph's {begin, end} positions.
In both cases, the “ξstring” is set to nil, so the function will work on the region text.
(See: Using thing-at-point ◇ What's Region, Active Region, transient-mark-mode?)
The above takes care of interactive use of the function.
Now, remember that our function takes 3 arguments: {ξstring, ξfrom, ξto}. The {ξfrom, ξto} are optional. When “ξstring” is given (i.e. not nil), the function will take that as input and return a string. Otherwise, it takes {ξfrom, ξto} as region positions and transforms the text in the buffer.
For clarity, first we set “workOnStringP”:
(setq workOnStringP (if ξstring t nil))
then we set the “inputStr” like this:
(setq inputStr (if workOnStringP ξstring (buffer-substring-no-properties ξfrom ξto)))
Now, it works on the string, like this:
(setq outputStr
      (let ((case-fold-search t))
        (replace-regexp-in-string "a\\|e\\|i\\|o\\|u" "" inputStr)))
Then, it either returns the outputStr or changes the region in the buffer, depending on whether “workOnStringP” is true, like this:
(if workOnStringP
    outputStr
  (save-excursion
    (delete-region ξfrom ξto)
    (goto-char ξfrom)
    (insert outputStr)))
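Since the input/output part is the same for every such command, one possible next step is to pull it into a helper. The sketch below is my own assumption of how that might look. The names my-transform-string-or-region and my-upcase-text are hypothetical, and the builtin upcase stands in for a real transformation.

```elisp
;; Hypothetical helper that factors out the string-or-region logic.
(defun my-transform-string-or-region (func string &optional from to)
  "Apply FUNC, a function that takes a string and returns a string.
If STRING is non-nil, return FUNC applied to STRING.
Otherwise, replace the buffer text between FROM and TO with the result."
  (if string
      (funcall func string)
    (let ((output (funcall func (buffer-substring-no-properties from to))))
      (save-excursion
        (delete-region from to)
        (goto-char from)
        (insert output)))))

;; A command built on the helper. Only the transformation differs.
(defun my-upcase-text (string &optional from to)
  "Upcase STRING, or the buffer text between FROM and TO if STRING is nil."
  (interactive
   (if (region-active-p)
       (list nil (region-beginning) (region-end))
     (let ((bds (bounds-of-thing-at-point 'paragraph)))
       (list nil (car bds) (cdr bds)))))
  (my-transform-string-or-region #'upcase string from to))
```

With this, (my-upcase-text "abc") returns a string, while (my-upcase-text nil 1 4) changes the first 3 characters of the current buffer.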
The weird ξ you see in my elisp code is the Greek letter xi. I use a Unicode char in symbol names for easy distinction from builtin symbols. You can just ignore it. 〔☛ Programing Style: Variable Naming: English Words Considered Harmful〕