Elisp: Transform HTML Tags from “span” to “b”

By Xah Lee.

This page shows a simple practical elisp script for HTML tag transformation.

Problem

I want to transform the HTML tag

<span class="w">…</span>

to

<b>…</b>

in over a hundred files, and print a report of the changes.

Solution

Here's an outline of the steps.

  1. Open the file. Use a regex to search for the span markup.
  2. Make the replacement.
  3. Add the replacement to a list, for later report.
  4. Repeat the above until no more found.
  5. Use a dir traverse function to apply the above to every file. [see Elisp: Walk Directory]
  6. When done, print the list of changes.
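For step 5, the script below uses find-lisp-find-files. As an alternative, here's a minimal sketch using the built-in directory-files-recursively (available since Emacs 25.1). The function name my-list-html-files is made up for illustration.

```elisp
;; sketch: list HTML files under a dir, recursively.
;; my-list-html-files is a hypothetical helper name, not part of the script below.
(defun my-list-html-files (dir)
  "Return a list of paths of .html files under DIR, including subdirs."
  ;; \\' matches end of string, so only names ending in .html are kept
  (directory-files-recursively dir "\\.html\\'"))

;; sample call, using the dir from this page:
;; (my-list-html-files "~/web/vocabulary/")
```

The returned list can be passed to mapc the same way as the result of find-lisp-find-files.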

Here's the code:

;; -*- coding: utf-8 -*-
;; 2011-07-18
;; replace <span class="w">…</span> to <b>…</b>
;;
;; do this for all files in a dir.

(setq inputDir "~/web/vocabulary/" ) ; dir should end with a slash

(setq changedItems '())

(defun my-process-file (fPath)
  "Process the file at FPATH …"
  (let (myBuff myWord)
    (setq myBuff (find-file fPath))

    (widen) (goto-char 1) ;; in case buffer already open

    (while (re-search-forward "<span class=\"w\">\\([^<]+?\\)</span>" nil t)
      (setq myWord (match-string 1))
      (when (< (length myWord) 15) ; a little double check in case of possible mismatched tag
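        ;; args t t: keep letter case as is, and insert the text literally,
        ;; so a backslash in myWord is not treated as a special sequence
        (replace-match (concat "<b>" myWord "</b>" )  t t)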
        (setq changedItems (cons (substring-no-properties myWord) changedItems ) )
        ) )

    ;; close buffer if there's no change. Else leave it open.
    (when (not (buffer-modified-p myBuff)) (kill-buffer myBuff) )
    ) )

(require 'find-lisp)

(setq make-backup-files t)
(setq case-fold-search nil)
(setq case-replace nil)

(let (outputBuffer)
  (setq outputBuffer "*xah span.w to b replace output*" )
  (with-output-to-temp-buffer outputBuffer
    (mapc 'my-process-file (find-lisp-find-files inputDir "\\.html$"))
    (print changedItems)
    (princ "Done deal!")
    )
  )

Here's the output: elisp_batch_html_tag_transform_bold_output.txt.
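The call (print changedItems) prints the whole list as one Lisp expression. If you prefer one item per line, here's a sketch of a variation. The name my-print-changed-items is made up for illustration. It assumes the list holds strings, newest first, as built by cons in the script.

```elisp
;; sketch: print a list of strings, one per line, then the count.
;; my-print-changed-items is a hypothetical helper name.
(defun my-print-changed-items (items)
  "Print each string in ITEMS on its own line, oldest first, then the total count."
  ;; the script conses new items to the front, so reverse to get the order of change
  (dolist (w (reverse items))
    (princ w)
    (terpri))
  (princ (format "%d changes\n" (length items))))

;; sample call, inside with-output-to-temp-buffer in place of (print changedItems):
;; (my-print-changed-items changedItems)
```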

There are over 1k changes. The output is extremely useful because I can take just a few seconds to glance at it and know there are no errors. Errors are possible because when using regex to parse HTML, a missing tag or even an unexpected nested tag can mean disaster.
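To see how the regex behaves on a nested tag before touching real files, you can run the same search-and-replace on a string in a temp buffer. The following is a sketch only. The name my-span-w-to-b-string is made up for illustration, and it leaves out the length check done in the script.

```elisp
;; sketch: apply the span-to-b replacement to a string, using a temp buffer.
;; my-span-w-to-b-string is a hypothetical helper name, not part of the script above.
(defun my-span-w-to-b-string (str)
  "Return STR with <span class=\"w\">…</span> replaced by <b>…</b>."
  (with-temp-buffer
    (insert str)
    (goto-char (point-min))
    (let ((case-fold-search nil))
      (while (re-search-forward "<span class=\"w\">\\([^<]+?\\)</span>" nil t)
        ;; \\1 stands for the text captured by the group. t means keep case as is
        (replace-match "<b>\\1</b>" t)))
    (buffer-string)))

;; plain case, the tag is replaced
(my-span-w-to-b-string "a <span class=\"w\">word</span> b")
;; → "a <b>word</b> b"

;; nested tag: [^<] cannot cross the inner <i>, so there is no match and no change
(my-span-w-to-b-string "<span class=\"w\">some <i>word</i></span>")
;; → "<span class=\"w\">some <i>word</i></span>"
```

A span with a nested tag is left as is, so it won't show up in the report. It has to be found by other means.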

The code is simple. If you don't understand it, see:

Elisp Script Examples

  1. Write grep in Elisp
  2. Find String Inside HTML Tag
  3. Validate Matching Brackets
  4. Generate Links Report
  5. Generate Sitemap
  6. Archive Website For Reader Download
  7. Process File line-by-line
  8. Text-Soup Automation
  9. Split HTML Annotation
  10. Fixing Dead Links
  11. Elisp vs Perl: Validate Local File Links
  12. Transform Page Tag
  13. Transform HTML FAQ Tags
  14. Transform HTML Tags
  15. “figure” to “figcaption”
  16. “span.w” to “b”
