
Emacs Lisp: Multi-Pair String Replacement with Report


This page shows you how to write an Emacs Lisp command that does multi-pair find/replace on the current buffer and also prints a report of the changed items.

Problem Description

Write a command “title-bracket-to-html-tag” that changes angle-bracketed text into HTML tags. For example:

• 〈The Rise of “Worse is Better”〉 (1991) …
• 《The Unix-Hater's Handbook》 (1994) …

becomes:

• <cite>The Rise of “Worse is Better”</cite> (1991) …
• <cite class="book">The Unix-Hater's Handbook</cite> (1994) …

The command should also generate a report of all changes made, in a separate buffer.

(The single angle bracket is for article titles; the double angle bracket is for book titles. 〔☛ Intro to Chinese Punctuation〕)

Solution

Here's a solution.

(defun title-bracket-to-html-tag ()
  "Replace all 〈…〉 with <cite>…</cite> in current buffer.
Also replace 《…》 with <cite class=\"book\">…</cite>.
Generate a report of the replaced strings in a separate buffer."
  (interactive)
  (let ((changedItems '()))

    (save-excursion
      ;; book titles: 《…》
      (goto-char (point-min))
      (while (search-forward-regexp "《\\([^》]+?\\)》" nil t)
        ;; record the title before replace-match changes the buffer
        (setq changedItems (cons (match-string 1) changedItems))
        (replace-match "<cite class=\"book\">\\1</cite>" t))

      ;; article titles: 〈…〉
      (goto-char (point-min))
      (while (search-forward-regexp "〈\\([^〉]+?\\)〉" nil t)
        (setq changedItems (cons (match-string 1) changedItems))
        (replace-match "<cite>\\1</cite>" t)))

    ;; print the report, most recent replacement first;
    ;; mapc, not mapcar, because only the side effect matters
    (with-output-to-temp-buffer "*changed items*"
      (mapc
       (lambda (myTitle)
         (princ myTitle)
         (princ "\n"))
       changedItems))))

Here's an outline of the algorithm:

  1. Search forward by regex for 《…》.
  2. If found, push the matched title into a list (for the report of changed items later).
  3. Replace the match with a cite tag.
  4. Repeat the above until no more title brackets are found.
  5. Do the same for 〈…〉.
  6. When no more are found, print the list to a separate buffer.
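
The two loops in the solution differ only in their regex and replacement string, so the outline above can also be driven by a list of pairs. The following is a minimal sketch under that assumption; the names my-replace-pairs-with-report and my-title-pairs are hypothetical and not part of Emacs or of the command above. Unlike the command, it returns the list in the order the replacements were made instead of printing it.

```elisp
;; Sketch only: `my-replace-pairs-with-report' and `my-title-pairs'
;; are hypothetical names used for illustration.
(defun my-replace-pairs-with-report (pairs)
  "Apply each (REGEX . REPLACEMENT) in PAIRS to the current buffer.
Each REGEX must contain one capture group.
Return the captured strings in the order they were replaced."
  (let ((changed-items '()))
    (save-excursion
      (dolist (pair pairs)
        (goto-char (point-min))
        (while (re-search-forward (car pair) nil t)
          ;; Record the captured text before `replace-match' edits the buffer.
          (push (match-string 1) changed-items)
          (replace-match (cdr pair) t))))
    (nreverse changed-items)))

(defvar my-title-pairs
  '(("《\\([^》]+\\)》" . "<cite class=\"book\">\\1</cite>")
    ("〈\\([^〉]+\\)〉" . "<cite>\\1</cite>"))
  "Hypothetical pair list covering the two bracket styles.")

;; Usage: run on a scratch buffer and return the report and the new text.
(with-temp-buffer
  (insert "〈A〉 and 《B》")
  (list (my-replace-pairs-with-report my-title-pairs)
        (buffer-string)))
;; ⇒ (("B" "A") "<cite>A</cite> and <cite class=\"book\">B</cite>")
```

The book pair runs first, which is why "B" comes before "A" in the returned list.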

All the functions in this code are very basic and are frequently used for text-processing tasks. You should master them. You can use this function as a template for writing your own.

The code is easy to understand. If you find it difficult, have a look at Emacs Lisp Basics and Emacs Lisp Idioms.

Showing the changed items is important, because your text may have a mismatched bracket. The output lets you verify correctness at a glance.
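
To illustrate what a mismatched bracket looks like in the report: when a closing 》 is missing, the regex keeps matching until the next 》, so the captured title swallows a stray opening bracket. The sketch below flags such captures. The helper name my-suspicious-title-p is hypothetical, and the check (a bracket character or a newline inside the title) is only one possible heuristic.

```elisp
;; Sketch only: `my-suspicious-title-p' is a hypothetical helper.
(defun my-suspicious-title-p (title)
  "Return t if TITLE contains a title bracket or a newline.
Such a capture usually means a closing bracket was missing."
  (and (string-match-p "[〈〉《》\n]" title) t))

;; The first title has no closing 》, so the match runs on
;; to the end of the second title.
(with-temp-buffer
  (insert "《Book One (1994) and 《Book Two》")
  (goto-char (point-min))
  (when (re-search-forward "《\\([^》]+\\)》" nil t)
    (my-suspicious-title-p (match-string 1))))
;; ⇒ t
```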

Example 2: Remove Wikipedia Citation Mark

Wikipedia articles contain many citation marks, such as [1], [12], …. If you quote Wikipedia in your blog, those citation marks don't make sense and are distracting.

Here's a command to remove them.

The code is very similar: it removes each occurrence of [‹n›], adds the removed text to a list, and then reports the list.

(defun remove-square-brackets ()
  "Delete any text of the form “[‹n›]”.

Work on text selection or current line.
Print all removed text in the *changed items* buffer.

For example, if text is on the line:
 「… was officially announced as Blu-ray Disc[11][12], and …」
then, after the call the line becomes:
 「… was officially announced as Blu-ray Disc, and …」."
  (interactive)

  (let (bdr p1 p2 inputStr resultStr changedItems)
    ;; bdr holds the text and its begin and end positions
    (setq bdr (get-selection-or-unit 'line))
    (setq inputStr (elt bdr 0) p1 (elt bdr 1) p2 (elt bdr 2))

    (setq changedItems '())

    ;; do the replacement in a temp buffer, then swap the result in
    (setq resultStr
          (with-temp-buffer
            (insert inputStr)

            (goto-char (point-min))
            (while (search-forward-regexp "\\(\\[[0-9]+?\\]\\)" nil t)
              (setq changedItems (cons (match-string 1) changedItems))
              (replace-match "" t))
            (buffer-string)))

    (delete-region p1 p2)
    (insert resultStr)

    ;; mapc, not mapcar, because only the side effect matters
    (with-output-to-temp-buffer "*changed items*"
      (mapc
       (lambda (myTitle)
         (princ myTitle)
         (princ "\n"))
       changedItems))))

In the above, I used a convenient custom function, get-selection-or-unit. You can replace it with thing-at-point 〔☛ Emacs Lisp: Using thing-at-point〕, or get the package at: Emacs Lisp: get-selection-or-unit.
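
If you would rather not install the package, a stand-in for the (get-selection-or-unit 'line) call can be sketched with built-in functions only. The name my-selection-or-line is hypothetical, and this is not the real get-selection-or-unit: it handles only the active region or the current line, and it assumes the three-element result that the command above reads with elt. It uses line-beginning-position and line-end-position instead of thing-at-point.

```elisp
;; Sketch only: `my-selection-or-line' is a hypothetical stand-in,
;; not the real `get-selection-or-unit'.
(defun my-selection-or-line ()
  "Return a vector [TEXT BEGIN END] for the active region or current line."
  (let ((p1 (if (use-region-p) (region-beginning) (line-beginning-position)))
        (p2 (if (use-region-p) (region-end) (line-end-position))))
    (vector (buffer-substring-no-properties p1 p2) p1 p2)))

;; Usage: with no active region, the current line is returned.
(with-temp-buffer
  (insert "first line\nsecond [1] line")
  (my-selection-or-line))
;; ⇒ ["second [1] line" 12 27]
```

In remove-square-brackets, the call (get-selection-or-unit 'line) would then become (my-selection-or-line).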

For many more examples of using multi-pair find/replace, see: Emacs Lisp Multi-Pair Find/Replace Applications.
