Elisp: Transform HTML Tags from “span” to “b”

By Xah Lee.

This page shows a simple, practical elisp script for HTML tag transformation.

Problem

I want to transform the HTML tag

<span class="w">…</span>

to

<b>…</b>

in over a hundred files, and also print a report of the changes.

Solution

Here's an outline of the steps.

  1. Open the file. Use a regex to search for the span markup.
  2. Make the replacement.
  3. Add the replacement to a list, for a later report.
  4. Repeat the above until no more matches are found.
  5. Use a directory traversal function to apply the above to every file. [see Elisp: Walk Directory]
  6. When done, print the list of changes.
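
Steps 1 to 4 can be tried on a plain string before touching any file. The following is only a minimal sketch: the function name my-span-to-b-in-string is hypothetical (it is not part of the script on this page), and it works in a temp buffer instead of a visited file.

```elisp
;; Hypothetical helper, for illustration only.
;; Runs steps 1 to 4 on a string instead of a file.
(defun my-span-to-b-in-string (html)
  "Replace <span class=\"w\">…</span> with <b>…</b> in HTML.
Return a list (NEW-TEXT CHANGES), where CHANGES holds the replaced
words in the order they were found."
  (let ((case-fold-search nil)
        (changes '()))
    (with-temp-buffer
      (insert html)
      (goto-char (point-min))
      ;; step 1 and 4: search repeatedly until nothing more is found
      (while (re-search-forward "<span class=\"w\">\\([^<]+\\)</span>" nil t)
        (let ((word (match-string 1)))
          ;; step 2: FIXEDCASE and LITERAL are both t, so the word goes in verbatim
          (replace-match (concat "<b>" word "</b>") t t)
          ;; step 3: remember the change for the report
          (push word changes)))
      (list (buffer-string) (nreverse changes)))))

(my-span-to-b-in-string
 "<p><span class=\"w\">abate</span> and <span class=\"w\">zeal</span></p>")
;; returns ("<p><b>abate</b> and <b>zeal</b></p>" ("abate" "zeal"))
```

Working on a string like this makes it easy to check the regex on a few samples first.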

Here's the code:

;; -*- coding: utf-8 -*-
;; 2011-07-18
;; replace <span class="w">…</span> with <b>…</b>
;;
;; do this for all files in a dir.

(setq inputDir "~/web/vocabulary/" ) ; dir should end with a slash

(setq changedItems '())

(defun my-process-file (fPath)
  "Process the file at FPATH …"
  (let (myBuff myWord)
    (setq myBuff (find-file fPath))

    (widen) (goto-char (point-min)) ;; in case buffer already open

    (while (re-search-forward "<span class=\"w\">\\([^<]+?\\)</span>" nil t)
      (setq myWord (match-string 1))
      (when (< (length myWord) 15) ; a little double check in case of possible mismatched tag
        (replace-match (concat "<b>" myWord "</b>" ) t t) ; FIXEDCASE and LITERAL, so a backslash in myWord is not treated specially
        (setq changedItems (cons (substring-no-properties myWord) changedItems ) )
        ) )

    ;; close buffer if there's no change. Else leave it open.
    (when (not (buffer-modified-p myBuff)) (kill-buffer myBuff) )
    ) )

(require 'find-lisp)

(setq make-backup-files t)
(setq case-fold-search nil)
(setq case-replace nil)

(let (outputBuffer)
  (setq outputBuffer "*xah span.w to b replace output*" )
  (with-output-to-temp-buffer outputBuffer
    (mapc 'my-process-file (find-lisp-find-files inputDir "\\.html$"))
    (print changedItems)
    (princ "Done deal!")
    )
  )
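
The script above walks the directory with find-lisp-find-files. On Emacs 25.1 or later, the built-in directory-files-recursively should be able to do the same job without requiring find-lisp. A minimal sketch, where the name my-list-html-files is hypothetical:

```elisp
;; Hypothetical helper, for illustration only.
;; Assumes Emacs 25.1 or later, where directory-files-recursively exists.
(defun my-list-html-files (dir)
  "Return the files under DIR, including subdirectories, whose names end in .html."
  ;; \\' matches only at the very end of the file name
  (directory-files-recursively dir "\\.html\\'"))

;; Possible use with the function from the script above:
;; (mapc 'my-process-file (my-list-html-files inputDir))
```

The order of the returned list may differ from what find-lisp-find-files gives, which only affects the order of items in the report.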

Here's the output: elisp_batch_html_tag_transform_bold_output.txt.

There are over 1k changes. The output is extremely useful because I can take a few seconds to glance at it and know there are no errors. Errors are possible because whenever you use regex to parse HTML, a missing tag or even an unexpected nested tag can mean disaster.
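
To see how the regex in the script behaves on malformed input, here is a small sketch. The function name my-count-span-w-matches is hypothetical. It counts matches of the same pattern on a string. Because the pattern allows no < between the opening and closing tag, a nested tag or a missing closing tag gives no match, so that text is left alone instead of being mangled.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-count-span-w-matches (html)
  "Return how many times the script's span regex matches in HTML."
  (let ((case-fold-search nil)
        (count 0))
    (with-temp-buffer
      (insert html)
      (goto-char (point-min))
      (while (re-search-forward "<span class=\"w\">\\([^<]+?\\)</span>" nil t)
        (setq count (1+ count))))
    count))

(my-count-span-w-matches "<span class=\"w\">abate</span>")        ; returns 1
(my-count-span-w-matches "<span class=\"w\">ab<i>a</i>te</span>") ; returns 0, nested tag
(my-count-span-w-matches "<span class=\"w\">abate")               ; returns 0, missing close tag
```

Skipped spans do not show up in the report, so a count that is lower than expected is a hint to look for such cases by hand.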

The code is simple.
