Elisp: Chinese Character Reference Linkify

By Xah Lee.

A command that turns the Chinese character before the cursor into a block of HTML reference links.

Problem

For example, the character 中

becomes:

<div class="chinese84873">
 <span class="w">中</span>
 <span class="en">
  <a href="http://translate.google.com/#zh-CN|en|中">Translate</a> ◇
  <a href="http://en.wiktionary.org/wiki/中">Wiktionary</a> ◇
  <a href="http://www.chineseetymology.org/CharacterEtymology.aspx?submitButton1=Etymology&amp;characterInput=中">history</a>
</span>
</div>

Appears in browser like this:

中 Translate ◇ Wiktionary ◇ history

This is useful for writing blog posts about languages and linguistics. For example, see: Poem: The Bell Tolls for Thee; 鐘為汝鳴.

Solution

Here's the code:

(defun xah-words-chinese-linkify ()
  "Make the Chinese character before cursor into Chinese dictionary reference links.
URL `http://ergoemacs.org/emacs/elisp_chinese_char_linkify.html'
Version 2015-05-01"
  (interactive)
  (let (
        ($template
         "<div class=\"chinese-etymology-96656\"><b class=\"w\">�</b> <span class=\"en\"><a href=\"http://translate.google.com/#zh-CN|en|�\">Translate</a> ◇ <a href=\"http://en.wiktionary.org/wiki/�\">Wiktionary</a> ◇ <a href=\"http://www.chineseetymology.org/CharacterEtymology.aspx?submitButton1=Etymology&amp;characterInput=�\">history</a></span></div>"
         )
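        ;; the character just before the cursor, as a one-character string
        ($char (buffer-substring-no-properties (- (point) 1) (point))))
    (delete-char -1)
    ;; fill every placeholder in the template with the character.
    ;; the trailing t t keep case as is and insert $char literally,
    ;; so a backslash before the cursor does not signal an error
    (insert (replace-regexp-in-string "�" $char $template t t))))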

This is truly a time saver.
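The command above mixes building the HTML with editing the buffer. As a minimal sketch, the same idea can be split into a pure function plus a thin command, which makes it easy to try in a scratch buffer. The names `my-chinese-links-html` and `my-chinese-linkify-before-point`, the shortened template, and the `□` placeholder are all made up for illustration and are not part of the original code.

```elisp
;; Hypothetical helper: builds the HTML, touches no buffer.
(defun my-chinese-links-html (char)
  "Return an HTML snippet with a dictionary link for CHAR, a one-character string."
  (let ((template
         (concat "<b class=\"w\">□</b> "
                 "<a href=\"http://en.wiktionary.org/wiki/□\">Wiktionary</a>")))
    ;; Replace each □ placeholder with CHAR.
    ;; The final t t mean: keep case as is, and treat CHAR literally.
    (replace-regexp-in-string "□" char template t t)))

;; Hypothetical command: same buffer editing as the original.
(defun my-chinese-linkify-before-point ()
  "Replace the character before point with its link snippet."
  (interactive)
  (let ((char (buffer-substring-no-properties (1- (point)) (point))))
    (delete-char -1)
    (insert (my-chinese-links-html char))))

;; Try it in a temporary buffer instead of a real file.
(with-temp-buffer
  (insert "中")
  (my-chinese-linkify-before-point)
  (buffer-string))
```

Evaluating the last form returns the snippet as a string, with 中 in both the visible text and the link target.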

URL Encoding of Chinese

Note: technically, Chinese characters in a URL should be percent-encoded (URL encoded).

For example:

http://en.wiktionary.org/wiki/中

should be:

http://en.wiktionary.org/wiki/%E4%B8%AD

(A Chinese character should become bytes in hex from the char's UTF-8 encoding. The char 中's UTF-8 encoding is 3 bytes of the following hexadecimal: E4 B8 AD.)

However, I think percent encoding is an abomination. [see Problems of Symbol Congestion in Computer Languages; ASCII Jam vs Unicode] I decided not to botch the Chinese chars in my URLs. This does not cause practical problems.

If you do want to percent-encode the URLs, see: Elisp: URL Percent Decode/Encode
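For a quick check of the encoding described above, here is a small sketch. It uses the built-in `url-hexify-string` from Emacs's url-util library, then spells out the same result by hand to show where the three bytes come from. The helper name `my-percent-encode-char` is made up for illustration. It encodes every byte it is given, so it is meant for a single non-ASCII character, not a whole URL.

```elisp
(require 'url-util)  ; provides `url-hexify-string'

;; Built-in route: percent-encode the UTF-8 bytes of the string.
(url-hexify-string "中")

;; By hand (hypothetical helper): encode to UTF-8, then write each byte as %XX.
(defun my-percent-encode-char (char)
  "Return CHAR, a string, with every UTF-8 byte written as %XX."
  (mapconcat (lambda (byte) (format "%%%02X" byte))
             (encode-coding-string char 'utf-8)
             ""))

;; Build the encoded form of the Wiktionary URL from the example.
(concat "http://en.wiktionary.org/wiki/" (my-percent-encode-char "中"))
```

Both routes give %E4%B8%AD for 中, so the last form returns the encoded Wiktionary URL shown earlier.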
