Emacs Lisp: Add “alt” Attribute to Image Tags


This page shows an example of using emacs's regex to update HTML image tags in all files in a directory.

Problem Description

For all HTML image tags of the form:

<img src="paraboloid.png" alt="" width="832" height="513">

Add a value to the “alt” attribute. The value should be the image file name, but without the file extension. For example: alt="paraboloid".

This needs to be done for about 100 files inside a directory and its subdirectories.

Solution

List and Mark Files in Subdirectories

Call find-dired, then give the dir name, then give -name "*html". The result is all HTML files in that dir and subdir.

Now, mark the files you want, by typing 【% m】 (dired-mark-files-regexp). Then give the pattern \.html. This marks all HTML files.
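If you want to check from Lisp which files would be affected, the same listing can be sketched with directory-files-recursively (available since Emacs 25). The function name my-list-html-files and the example path are hypothetical, made up for illustration; they are not part of the workflow above.

```elisp
(defun my-list-html-files (dir)
  "Return a list of all files under DIR, recursively, whose names end in .html.
Hypothetical helper for illustration; needs Emacs 25 or later."
  (directory-files-recursively dir "\\.html\\'"))

;; Example call. The path is an assumption; use your own directory.
;; (my-list-html-files "~/web/")
```

This only lists the files. The marking and replacing are still done in dired as described.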

Dired Query Replace by Regexp

To do regexp replace on a bunch of files, call dired-do-query-replace-regexp. 〔☛ Emacs: Interactively Find/Replace String Patterns on Multiple Files〕

The Search Pattern

The next job is to give the regex search pattern. This is simple:

<img src="\([^"]+?\)" alt="" width="\([0-9]+\)" height="\([0-9]+\)">

This regex captures the file name, the width and height.
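Before running the pattern over many files, it can be tried against the sample tag with string-match. This is a small sketch; the variable names my-img-regex and my-sample-tag are made up for illustration. Note that inside a Lisp string every backslash and double quote must be escaped, so the pattern looks noisier than what you type at the prompt.

```elisp
(defvar my-img-regex
  "<img src=\"\\([^\"]+?\\)\" alt=\"\" width=\"\\([0-9]+\\)\" height=\"\\([0-9]+\\)\">"
  "The search pattern from this page, written as a Lisp string.")

(defvar my-sample-tag
  "<img src=\"paraboloid.png\" alt=\"\" width=\"832\" height=\"513\">"
  "The sample tag from the problem description.")

;; Return the three captured groups if the pattern matches, else nil.
(when (string-match my-img-regex my-sample-tag)
  (list (match-string 1 my-sample-tag)   ; file name
        (match-string 2 my-sample-tag)   ; width
        (match-string 3 my-sample-tag))) ; height
```

Evaluating the last form on the sample tag returns the file name, width, and height as three strings.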

〔☛ Text Pattern Matching in Emacs (emacs regex tutorial)〕

Using Elisp Expression for Replacement String

Since emacs 22, you can give an elisp expression for the replacement, by using this syntax for the replacement string: \,‹elisp code›, where ‹elisp code› is a lisp expression.

The heart of this task is to write the elisp function that gives us the replacement string, where the alt part is the transformed version of the file name. This is surprisingly simple too. Here's the lisp expression we need:

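(concat
 "<img src=\""
 (match-string 1)
 "\" alt=\""
 (replace-regexp-in-string "\\.png\\'" ""
    (replace-regexp-in-string "_" " " (match-string 1)))
 "\" width=\""
 (match-string 2)
 "\" height=\""
 (match-string 3)
 "\">")

The match-string simply gives us the matched values. The interesting part is the replace-regexp-in-string we used to generate the value for alt. First, we replace each “_” with a space, then we delete the “.png”. The pattern for the extension is written as "\\.png\\'" so that the dot matches only a literal dot and the match is anchored at the end of the file name; a plain ".png" would match any character followed by “png”, anywhere in the name. That's all there is to it.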
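The same transformation can be tried on a plain string, without touching any files. The following is a minimal sketch, not the interactive workflow itself: the names my-alt-text and my-add-alt are hypothetical, and it passes a function as the replacement argument of replace-regexp-in-string instead of using the \, syntax. It escapes the dot and anchors the extension pattern at the end of the string, so only a trailing “.png” is removed.

```elisp
(defun my-alt-text (file-name)
  "Return alt text for FILE-NAME.
Underscores become spaces and a trailing .png is dropped."
  (replace-regexp-in-string
   "\\.png\\'" ""
   (replace-regexp-in-string "_" " " file-name)))

(defun my-add-alt (html)
  "Return HTML with the empty alt of each matching img tag filled in."
  (replace-regexp-in-string
   "<img src=\"\\([^\"]+?\\)\" alt=\"\" width=\"\\([0-9]+\\)\" height=\"\\([0-9]+\\)\">"
   (lambda (tag)
     ;; The match data here refers to TAG, so the groups are read from it.
     ;; `save-match-data' keeps the nested regex calls in `my-alt-text'
     ;; from disturbing the match data of the outer replacement.
     (save-match-data
       (let ((src (match-string 1 tag))
             (width (match-string 2 tag))
             (height (match-string 3 tag)))
         (format "<img src=\"%s\" alt=\"%s\" width=\"%s\" height=\"%s\">"
                 src (my-alt-text src) width height))))
   html
   t    ; FIXEDCASE: do not alter the case of the replacement
   t))  ; LITERAL: insert the returned string as is
```

Calling my-add-alt on a string containing the sample tag from the problem description gives back the same string with alt="paraboloid" filled in. Trying the logic this way is a cheap check before answering the prompts of the interactive replace.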

Finally, we call dired-do-query-replace-regexp in the dired buffer (hotkey is Q). 〔☛ Emacs: Interactively Find/Replace String Patterns on Multiple Files〕

Without emacs, the above operation might take an hour or two by hand, and is tedious and error prone. Even with expertise in Perl or Python scripting, a script lacks the interactive see-and-do: you cannot inspect each change before it is made. With emacs, the whole operation takes less than 5 minutes.

Advantage of Interactive Regex Replace on Multiple Files

Suppose you are given a task where hundreds of valid HTML files in a directory need to be converted to valid XHTML. XHTML has a slightly different syntax. For example, all tags such as <p> and <li> now need to be closed. Tags like <img>, <hr>, <br> need to be written as <img … />, <hr/>, <br/>. Tag names are now case sensitive, so you need to lowercase them. Image tags must now be wrapped inside a container tag, such as <div>. The DTD also needs to be changed, and there are many style-oriented tags that need to be transformed.

This task seems daunting. You could try to write a Perl script that does it all in one shot, but it would probably take you days to code correctly, and if your script has a parsing or regex error, it will delete parts of your files without you knowing it. You could take a trial-and-error approach, applying one regex replacement at a time. Still, your script runs in batch: if you make a mistake, you have to revert all your files. With mastery of emacs, you can do the above transform using regex find/replace one by one, interactively and safely, cutting the time needed roughly tenfold.
