Elisp: Writing a make-citation Command

By Xah Lee.

This page shows you how to write an Emacs Lisp command that transforms the text block under the cursor into a specific citation format.

Problem

Write an elisp command so that, when it is called with the cursor somewhere in a text block like this:

Defective C++
By Yossi Kreinin
2007
http://yosefk.com/c++fqa/defective.html

It becomes this:

<cite>Defective C++</cite> (2007) By Yossi Kreinin. @ <a class="sorc" href="http://yosefk.com/c++fqa/defective.html" title="accessed:2011-08-15">Source yosefk.com</a>

Detail

I write many blogs. When I make a link, I like to also include the article's title, author, and date. This helps with the link rot problem: when a link is dead, the reader at least still knows the {title, author, date}. For example, here's a typical link:

<a href="http://yosefk.com/c++fqa/defective.html">http://yosefk.com/c++fqa/defective.html</a>

I would like it to be like this:

<cite>Defective C++</cite> (2007) By Yossi Kreinin. @ <a class="sorc" href="http://yosefk.com/c++fqa/defective.html" title="accessed:2011-08-15">Source yosefk.com</a>

With the proper Cascading Style Sheets (CSS) rules, it is rendered in browsers like this:

Defective C++ By Yossi Kreinin. At http://yosefk.com/c++fqa/defective.html

It is quite tedious to get the {title, author, date} from a site. But once I have gathered this info manually, I can automate the formatting part. So, I start with this text:

Defective C++
By Yossi Kreinin
2007
http://yosefk.com/c++fqa/defective.html

Then, at the press of a key, the text is transformed into the desired HTML format.

If you are curious, the CSS used is this:

cite {color:#822222}
cite:before,cite:after {color:black;font-style:normal}
cite:before {content:"〈"}
cite:after {content:"〉"}
cite.book:before {content:"《"}
cite.book:after {content:"》"}

[see CSS Tutorial] The angle brackets are an Asian convention for book and article titles. [see Intro to Chinese Punctuation]

Solution

Here's the outline of steps:

  1. Get the input text. (get their boundary positions)
  2. Split the input text by line break.
  3. Process each line into proper format.
  4. Delete the input text.
  5. Insert the new text.

Here's the code:

(defun make-citation ()
  "Reformat current text block or selection into a canonical citation format.

For example, place cursor somewhere in the following block:

Circus Maximalist
By PAUL GRAY
Monday, Sep. 12, 1994
http://www.time.com/time/magazine/article/0,9171,981408,00.html

After execution, the lines will become

<cite>Circus Maximalist</cite> (1994-09-12) By Paul Gray. @ <a href=\"http://www.time.com/time/magazine/article/0,9171,981408,00.html\">Source www.time.com</a>

If there's a text selection, use it for input, otherwise the input is a text block between empty lines."
  (interactive)
  (let (bds p1 p2 $inputText myList $title $author $date $url )

    (setq bds (get-selection-or-unit 'block))
    (setq $inputText (elt bds 0) )
    (setq p1 (elt bds 1) )
    (setq p2 (elt bds 2) )

    (setq $inputText (replace-regexp-in-string "^[[:space:]]*" "" $inputText)) ; remove white space in front

    (setq myList (split-string $inputText "[[:space:]]*\n[[:space:]]*" t) )

    (setq $title (elt myList 0))
    (setq $author (elt myList 1))
    (setq $date (elt myList 2))
    (setq $url (elt myList 3))

    (setq $author (replace-regexp-in-string "\\. " " " $author)) ; remove period in initials
    (setq $author (replace-regexp-in-string "\\`By +" "" $author)) ; remove leading "By"
    (setq $author (upcase-initials (downcase $author)))
    (setq $date (fix-timestamp-string $date))

    (setq $url (with-temp-buffer (insert $url) (source-linkify) (buffer-string)))

    (delete-region p1 p2 )
    (insert (concat "<cite>" $title "</cite>") " " "(" $date ")"  " By " $author ". @ " $url)
    ))

The code is pretty simple. Grabbing the text is done by:

(setq bds (get-selection-or-unit 'block))
(setq $inputText (elt bds 0) )
(setq p1 (elt bds 1) )
(setq p2 (elt bds 2) )

get-selection-or-unit is a custom function of mine that serves as a replacement for elisp's thing-at-point. It returns a vector [‹text› ‹begin boundary› ‹end boundary›]. (For details, see: Elisp: Using thing-at-point.)
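If you don't have get-selection-or-unit installed, something like the following can stand in for the 'block case. This is only a minimal sketch, not the actual function: the name my-get-selection-or-block is made up here, and it assumes a text block is delimited by blank lines (which may contain spaces or tabs).

```elisp
;; Hypothetical stand-in for (get-selection-or-unit 'block).
(defun my-get-selection-or-block ()
  "Return a vector [TEXT BEGIN END].
Use the active region if there is one, else the block of lines
around point that is delimited by blank lines."
  (let (p1 p2)
    (if (use-region-p)
        (setq p1 (region-beginning) p2 (region-end))
      (save-excursion
        ;; The block begins right after the previous blank line,
        ;; or at the beginning of the buffer.
        (setq p1 (if (re-search-backward "\n[ \t]*\n" nil t)
                     (match-end 0)
                   (point-min))))
      (save-excursion
        ;; The block ends right before the next blank line,
        ;; or at the end of the buffer.
        (setq p2 (if (re-search-forward "\n[ \t]*\n" nil t)
                     (match-beginning 0)
                   (point-max)))))
    (vector (buffer-substring-no-properties p1 p2) p1 p2)))
```

The returned positions can be passed directly to delete-region later, which is why the function returns them together with the text.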

Then, we split the text into lines:

(setq myList (split-string $inputText "[[:space:]]*\n[[:space:]]*" t) )

(setq $title (elt myList 0))
(setq $author (elt myList 1))
(setq $date (elt myList 2))
(setq $url (elt myList 3))
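One thing to watch for: elt returns nil when a list is shorter than the index, so a block with a missing line would only fail later, inside the string functions. The sketch below wraps the same split-string call in a small helper that checks the line count first. The helper name my-citation-split-fields is hypothetical and not part of the original command.

```elisp
;; Hypothetical helper: same split as above, plus a sanity check.
(defun my-citation-split-fields (input-text)
  "Split INPUT-TEXT into a list (TITLE AUTHOR DATE URL).
Signal an error unless there are exactly four non-empty lines."
  (let ((fields (split-string input-text "[[:space:]]*\n[[:space:]]*" t)))
    (unless (= (length fields) 4)
      (user-error "Expected 4 lines (title, author, date, URL), got %d"
                  (length fields)))
    fields))

;; Whitespace around the line breaks is swallowed by the separator regex,
;; and the trailing newline does not produce an empty fifth element.
(my-citation-split-fields
 "Defective C++\n  By Yossi Kreinin  \n2007\nhttp://yosefk.com/c++fqa/defective.html\n")
;; returns ("Defective C++" "By Yossi Kreinin" "2007" "http://yosefk.com/c++fqa/defective.html")
```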

Then, we process each line:

(setq $author (replace-regexp-in-string "\\. " " " $author)) ; remove period in initials
(setq $author (replace-regexp-in-string "\\`By +" "" $author)) ; remove leading "By"
(setq $author (upcase-initials (downcase $author))) ; some site has author name in all caps
(setq $date (fix-timestamp-string $date)) ; transform the date format to yyyy-mm-dd
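To try the author cleanup on its own, the three steps can be put into one function. This is a sketch with a made-up name, my-citation-clean-author. It anchors the "By" pattern at the start of the string with \\` because, with the default case-insensitive matching, an unanchored "By +" would also eat the "by " inside a name such as "Kirby Smith".

```elisp
;; Hypothetical helper that bundles the author cleanup steps.
(defun my-citation-clean-author (author)
  "Return AUTHOR without a leading \"By\", without periods after initials,
and with each word capitalized."
  (let ((name author))
    ;; "J. K. ROWLING" -> "J K ROWLING"
    (setq name (replace-regexp-in-string "\\. " " " name))
    ;; Remove "By " only at the very start of the string.
    (setq name (replace-regexp-in-string "\\`By +" "" name))
    ;; Some sites print the name in all caps.
    (upcase-initials (downcase name))))

(my-citation-clean-author "By PAUL GRAY")      ; returns "Paul Gray"
(my-citation-clean-author "By J. K. ROWLING")  ; returns "J K Rowling"
(my-citation-clean-author "By Kirby Smith")    ; returns "Kirby Smith"
```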

The fix-timestamp-string function transforms a date string in various formats into the canonical form yyyy-mm-dd (ISO 8601). For example, "Monday, Sep. 12, 1994" becomes "1994-09-12".

See: Elisp: Writing a Date Time String Parsing Function.
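To give a rough idea of what such a function does, here is a much reduced sketch. It is not fix-timestamp-string: the name my-fix-date-string is made up, and it only understands English dates shaped like "Sep. 12, 1994" or "September 12, 1994" (optionally preceded by a weekday). Anything else, such as a bare year, is returned unchanged.

```elisp
;; Hypothetical, heavily simplified date normalizer.
(defun my-fix-date-string (date-string)
  "Return DATE-STRING as yyyy-mm-dd if it looks like \"Sep. 12, 1994\".
Return DATE-STRING unchanged otherwise."
  (let ((months '(("jan" . 1) ("feb" . 2) ("mar" . 3) ("apr" . 4)
                  ("may" . 5) ("jun" . 6) ("jul" . 7) ("aug" . 8)
                  ("sep" . 9) ("oct" . 10) ("nov" . 11) ("dec" . 12))))
    (if (string-match
         ;; month name (first 3 letters captured), day, comma, 4-digit year
         "\\([A-Za-z]\\{3\\}\\)[A-Za-z]*\\.? +\\([0-9]\\{1,2\\}\\), *\\([0-9]\\{4\\}\\)"
         date-string)
        (let* ((month-name (downcase (match-string 1 date-string)))
               (day (string-to-number (match-string 2 date-string)))
               (year (match-string 3 date-string))
               (month (cdr (assoc month-name months))))
          (if month
              (format "%s-%02d-%02d" year month day)
            date-string))
      date-string)))

(my-fix-date-string "Monday, Sep. 12, 1994") ; returns "1994-09-12"
(my-fix-date-string "2007")                  ; returns "2007"
```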

Now, we change the URL into a link:

(setq $url (with-temp-buffer (insert $url) (source-linkify) (buffer-string)))

source-linkify is a command I wrote that changes a URL into a link in a special format for my own blogs. For example, it changes this:

http://yosefk.com/c++fqa/defective.html

into this:

<a class="sorc" href="http://yosefk.com/c++fqa/defective.html" title="accessed:2011-08-16">Source yosefk.com</a>

For detail of source-linkify, see: Elisp: Change URL into HTML Link.
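If you want the same link format without source-linkify, the following sketch builds it as a string. It is an assumption-laden stand-in, not the real command: my-source-link is a made-up name, it is a plain function rather than a command that edits the buffer, and it does not escape characters such as & in the URL. The host name comes from url-generic-parse-url in the built-in url-parse library.

```elisp
(require 'url-parse)

;; Hypothetical stand-in for the string that source-linkify produces.
(defun my-source-link (url &optional access-date)
  "Return an HTML link for URL in the \"sorc\" format shown above.
ACCESS-DATE is a yyyy-mm-dd string. It defaults to today's date."
  (let ((host (url-host (url-generic-parse-url url)))
        (date (or access-date (format-time-string "%Y-%m-%d"))))
    (format "<a class=\"sorc\" href=\"%s\" title=\"accessed:%s\">Source %s</a>"
            url date host)))

(my-source-link "http://yosefk.com/c++fqa/defective.html" "2011-08-16")
;; returns
;; <a class="sorc" href="http://yosefk.com/c++fqa/defective.html" title="accessed:2011-08-16">Source yosefk.com</a>
```

Because it returns a string, this version would not need the with-temp-buffer wrapper used above.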

Finally, the code deletes the input text and inserts the new text:

(delete-region p1 p2 )
(insert (concat "<cite>" $title "</cite>") " " "(" $date ")"  " By " $author ". @ " $url)
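One possible variation is to build the final string in a separate function, so the output format can be checked without touching any buffer. This is just a sketch. The name my-citation-format is hypothetical, and the link argument is assumed to be the already formatted <a> element.

```elisp
;; Hypothetical helper: build the citation string from its four parts.
(defun my-citation-format (title date author link)
  "Return the citation HTML built from TITLE, DATE, AUTHOR and LINK.
LINK is the already formatted <a> element."
  (format "<cite>%s</cite> (%s) By %s. @ %s" title date author link))

(my-citation-format
 "Defective C++" "2007" "Yossi Kreinin"
 "<a class=\"sorc\" href=\"http://yosefk.com/c++fqa/defective.html\" title=\"accessed:2011-08-15\">Source yosefk.com</a>")
;; returns
;; <cite>Defective C++</cite> (2007) By Yossi Kreinin. @ <a class="sorc" href="http://yosefk.com/c++fqa/defective.html" title="accessed:2011-08-15">Source yosefk.com</a>
```

The command itself would then end with (delete-region p1 p2) followed by (insert (my-citation-format $title $date $author $url)).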

The weird $ you see at the start of variable names in my elisp code is just part of the symbol name. I use it for easy distinction from builtin symbols. You can just ignore it. [see Programing Style: Variable Naming: English Words Considered Harmful]

For a full version, see xah-html-make-citation in Emacs: Xah HTML Mode.
