This page is an Emacs Lisp tutorial on writing a command that transforms the URL under the cursor into a link wrapped in an anchor tag <a …>…</a>.
I want a command so that, when I press a key, the URL under the cursor, such as:
http://some.example.com/xyz.html
becomes this:
<a class="sorc" href="http://some.example.com/xyz.html" title="accessed:2011-05-16">Source some.example.com</a>
with today's date automatically inserted into the “accessed” part.
I also want a command to transform the above link into a defunct format, like this:
<span class="sorcdd" title="accessed:2011-05-16; defunct:2011-05-16; http://some.example.com/xyz.html">Source some.example.com</span>
with today's date added to the “defunct” part.
When writing blogs, you often need to cite links. The links may point to other blogs, news sites, or some random site. Many such URLs are ephemeral. They exist today, but may become dead links a few months later. Typically, if the URL doesn't have a dedicated domain, it is more likely to go bad sooner.
I write many blogs, so I have hundreds of links. When you update your pages years later, you find dead links like 〔http://someRandomBlog.org/importantToday.html〕 and may not remember what the link was about. No author, no title, no idea when the link went dead. Sometimes the domain name has changed owners, so the linked page may have become a porn site.
One partial solution is to add access date together with the link, like this:
<p>I found a fantastic <a href="http://some.example.com/xyz.html">emacs blog</a> (accessed on 2010-12-03) today!</p>
With an access date, at least you know when the link was good. If the link goes bad, you or your readers can at least try to view it through a web archive site such as the Wayback Machine.
However, this requires manual insertion of the date. Also, the “accessed on” info in your content is very distracting.
It would be better if the access date were somehow embedded in the link, in a uniform format. Neither HTML4 nor HTML5 has a dedicated attribute for an access date. I decided to put the access date into the “title” attribute, like this:
<a class="sorc" href="http://example.com/" title="accessed:2011-05-16">Source example.com</a>
This is not an ideal solution, because the “title” attribute is supposed to hold a title, not a date stamp. But in practice, I decided it's OK for me to adopt this solution.
If I later find that a link is dead, I can press a key, and Emacs will change the link to this format:
<span class="sorcdd" title="accessed:2011-05-16; defunct:2011-05-16; http://example.com/">Source example.com</span>
Notice that the value of the “class” attribute has changed from “sorc” to “sorcdd” (“dd” for “dead”). With proper CSS, the link text will be shown crossed out.
A uniform format is good, because if HTML6 or some HTML microformat later gets a way to add an access date to links, I can write a script that reliably changes all external links to the new format.
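To illustrate why the uniform format helps a later script, here is a minimal sketch that pulls the URL, access date, and anchor text out of one link with a single regex. The function name `my-sorc-link-parts` is hypothetical, and the sketch assumes the attributes appear in exactly the order shown above.

```elisp
(defun my-sorc-link-parts (link-string)
  "Return a list (URL ACCESS-DATE ANCHOR-TEXT) parsed from LINK-STRING.
LINK-STRING is assumed to be one link in the \"sorc\" format.
Return nil if it does not match.  Hypothetical helper, for illustration."
  (when (string-match
         (concat "<a class=\"sorc\" href=\"\\([^\"]+\\)\""
                 " title=\"accessed:\\([-0-9]+\\)\">"
                 "\\([^<]+\\)</a>")
         link-string)
    (list (match-string 1 link-string)    ; the URL
          (match-string 2 link-string)    ; the access date
          (match-string 3 link-string)))) ; the anchor text
```

Because every link is generated by the same command, one fixed pattern is enough; hand-written links with varying attribute order would need a real HTML parser.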
Here's the code:
(defun source-linkify ()
  "Make the URL at cursor point into an HTML link.
If there's a text selection, use the text selection as input.

Example: http://example.com/xyz.htm
becomes
<a class=\"sorc\" href=\"http://example.com/xyz.htm\" title=\"accessed:2008-12-25\">Source example.com</a>"
  (interactive)
  (let (url resultLinkStr bds p1 p2 domainName)
    ;; get the boundary of the URL or the text selection
    (if (region-active-p)
        (setq bds (cons (region-beginning) (region-end)))
      (setq bds (bounds-of-thing-at-point 'url)))
    ;; get the URL text
    (setq p1 (car bds))
    (setq p2 (cdr bds))
    (setq url (buffer-substring-no-properties p1 p2))
    ;; get the domain name: the text after "://" up to the next "/", "?", "#",
    ;; or the end of the URL (so a URL with no path still matches)
    (string-match "://\\([^/?#]+\\)" url)
    (setq domainName (match-string 1 url))
    ;; escape ampersands for use inside the href attribute value
    (setq url (replace-regexp-in-string "&" "&amp;" url))
    (setq resultLinkStr
          (concat "<a class=\"sorc\" href=\"" url "\""
                  " title=\"accessed:" (format-time-string "%Y-%m-%d") "\""
                  ">" "Source " domainName "</a>"))
    ;; delete the URL and insert the link
    (delete-region p1 p2)
    (insert resultLinkStr)))
The code is easy to understand. If you find it difficult, try reading the page “Emacs Lisp: Writing a Wrap-URL Function”, which has more explanation.
You can assign a hotkey for this command.
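For example, a global binding might look like the following sketch. The F8 key is an arbitrary choice, not something the command requires; pick any key you don't already use.

```elisp
;; Bind source-linkify to the F8 key in all buffers.
;; The key is an assumption for illustration; any free key works.
(global-set-key (kbd "<f8>") 'source-linkify)
```

Put the line in your Emacs init file after the definition of the command.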
The following is the code to turn a link into a dead link format.
(defun defunct-link ()
  "Make the HTML link under cursor into a defunct form.
Example: if the cursor is inside this tag
<a class=\"sorc\" href=\"http://example.com/\" title=\"accessed:2008-12-26\">…</a>
(and inside the opening tag), it becomes:
<span class=\"sorcdd\" title=\"accessed:2008-12-26; defunct:2008-12-26; http://example.com/\">…</span>"
  (interactive)
  (let (p1 p2 wholeLinkStr newLinkStr ξurl titleStr anchorText)
    (save-excursion
      ;; get the boundary of the whole link, from "<a " to "</a>"
      (forward-char 3)
      (search-backward "<a ")
      (setq p1 (point))
      (search-forward "</a>")
      (setq p2 (point))
      ;; get wholeLinkStr
      (setq wholeLinkStr (buffer-substring-no-properties p1 p2))
      ;; generate the replacement text
      (with-temp-buffer
        (insert wholeLinkStr)
        (goto-char 1)
        (search-forward-regexp "href=\"\\([^\"]+?\\)\"")
        (setq ξurl (match-string 1))
        (search-forward-regexp "title=\"\\([^\"]+?\\)\"")
        (setq titleStr (match-string 1))
        (search-forward-regexp ">\\([^<]+?\\)</a>")
        (setq anchorText (match-string 1))
        (setq newLinkStr
              (concat "<span class=\"sorcdd\" "
                      "title=\"" titleStr "; defunct:" (format-time-string "%Y-%m-%d") "; " ξurl
                      "\">" anchorText "</span>"))))
    ;; point is back where it started, inside the old link; deleting the
    ;; region leaves it at p1, so the new text is inserted in place
    (delete-region p1 p2)
    (insert newLinkStr)))
Here's the CSS for the dead link:
span.sorcdd {text-decoration:line-through}
Elisp is fantastic!
If you want a link format that preserves citation info, such as the author and article title, see: Emacs Lisp: Writing a make-citation Command.
Addendum: HTML5 has Custom Data Attribute. It can be used to embed access date of a link. See: HTML5 Custom Data Attribute.
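As a rough sketch of what a migration to a custom data attribute could look like, the function below rewrites the title-based date stamp into a `data-*` attribute. The attribute name `data-accessed` and the function name `my-sorc-title-to-data-attribute` are assumptions made up for this example, not an established convention.

```elisp
(defun my-sorc-title-to-data-attribute (html)
  "Return HTML with each title=\"accessed:DATE\" rewritten as data-accessed=\"DATE\".
The attribute name data-accessed is hypothetical, chosen for illustration."
  (replace-regexp-in-string
   "title=\"accessed:\\([-0-9]+\\)\""   ; group 1 captures the date
   "data-accessed=\"\\1\""              ; \1 re-inserts the captured date
   html
   t))                                  ; FIXEDCASE: do not alter letter case
```

This only works reliably because every link was written in the same format to begin with, which is the point of keeping the format uniform.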