Emacs: Unicode Tutorial


This page is a tutorial on using Unicode in emacs. For example: how to type math symbols, how to switch input methods, how to find a character's Unicode name or code point, how to set file encoding, and more.

Screenshot: Carbon emacs 22 window showing Unicode chars. You can download this text here: unicode.txt (not exactly the same).

This page covers emacs 23 or later. You should use emacs 23 (released in 2009) or later, because emacs 23 has a major update of its character engine. 〔☛ New Features in Emacs 23〕

Typing Unicode Characters

How to type this character é ?

Here's a table on how to type these chars:

Character    Key Press
é            Ctrl+x 8 ' e
à            Ctrl+x 8 ` a
î            Ctrl+x 8 ^ i
ñ            Ctrl+x 8 ~ n
ü            Ctrl+x 8 " u

To see all characters you can type this way, press 【Ctrl+x 8 Ctrl+h】. Example: ¿ ¡ ¢ £ ¥ ¤ § ¶ ® © ª «» × ÷ ¬ ° ± µ ÀÁÂÃÄÅÆ Ç ÈÉÊË ÌÍÎÏ ÐÑ ÒÓÔÕÖ ØÙÚÛÜÝÞß àáâãäåæç èéêë ìíîï ðñòóôõö øùúûüýþÿ.

If you need to type these chars often, call set-input-method and give “latin-9-prefix”. That will allow you to type these chars without typing 【Ctrl+x 8】 first.

(Emacs's “latin-9-prefix” corresponds to the char set ISO 8859-15, also known as Latin-9.)
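If you do not want to call set-input-method in every session, one option is to set the default input method in your emacs init file. This is a minimal sketch, assuming you want “latin-9-prefix” to be the method that 【Ctrl+\】 (toggle-input-method) switches on:

```elisp
;; Assumption: latin-9-prefix is the input method you use most often.
;; With this set, Ctrl+\ (toggle-input-method) turns it on and off
;; without asking for a name.
(setq default-input-method "latin-9-prefix")
```

The input method stays off until you press 【Ctrl+\】, so normal typing is not affected.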

How to insert a Unicode character by name?

Call insert-char 【Ctrl+x 8 Enter ↵】, then type the name of the Unicode character. For example, try inserting “λ”. Its name is “GREEK SMALL LETTER LAMDA”.

You can use asterisk * to match chars. For example, call insert-char, then type *arrow then Tab ↹, then emacs will show all chars with “arrow” in their names.

Note: before emacs 24.3, you need to call ucs-insert instead of insert-char. ucs-insert is obsolete since 24.3.

How to insert a Unicode character by its hexadecimal value?

Call insert-char 【Ctrl+x 8 Enter ↵】, then type the hex value of the Unicode character. For example, try inserting “λ”. Its hex value is “3bb”.

How to insert a Unicode character by its decimal value?

Call eval-expression, then type (insert-char 955).
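The same insertion can be written in several ways in emacs lisp. The following sketch shows three forms that all insert “λ” at the cursor position. The explicit count argument 1 is included on the assumption that you may run older emacs versions, where insert-char requires a count.

```elisp
;; Each line inserts one λ (U+03BB) at point in the current buffer.
(insert-char 955 1)     ; code point in decimal, inserted 1 time
(insert-char #x3bb 1)   ; same code point written in hexadecimal
(insert "\u03bb")       ; Unicode escape inside a string literal
```

To try one, place the cursor after its closing parenthesis and call eval-last-sexp 【Ctrl+x Ctrl+e】.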

How to open a Unicode character palette?

You can put frequently used Unicode chars into a file and save it, and define a keystroke to open this file, so that you can copy & paste the chars you want. Here's how you can define a keystroke to open a file. Put the following in your emacs init file.

; open my Unicode template with F8 key
(global-set-key (kbd "<f8>")
  (lambda () (interactive) (find-file "~/emacs.d/my_unicode_template.txt")))

Here's an example of a template: unicode.txt.

How to set a keystroke to insert a Unicode char?

For example, put the following code in your emacs init file.

(global-set-key (kbd "<f9> a") (lambda () (interactive) (insert "α"))) ; F9 followed by a
(global-set-key (kbd "<f9> b") (lambda () (interactive) (insert "β")))

Alternatively, you can use key-translation-map. For details, see: Emacs: Remapping Keys Using key-translation-map. For an OS-wide solution, see: How to Create a APL or Math Symbols Keyboard Layout.
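As a rough illustration of the key-translation-map approach, the sketch below makes 【F9】 followed by a letter produce a Greek character. The key choices are only examples. Because the translation happens when emacs reads the key, it should also work in places such as the minibuffer and during isearch.

```elisp
;; Translate a key sequence into a character at key-reading time.
;; The key choices here are examples only; use these instead of,
;; not together with, global-set-key bindings on the same keys.
(define-key key-translation-map (kbd "<f9> a") (kbd "α"))
(define-key key-translation-map (kbd "<f9> b") (kbd "β"))
```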

How to use abbrev to input Unicode chars?

Put the following in your emacs init file:

(define-abbrev-table 'global-abbrev-table '(
    ("alpha" "α")
    ("beta" "β")
    ("gamma" "γ")
    ("theta" "θ")
    ("inf" "∞")

    ("ar1" "→")
    ("ar2" "⇒")
    ))

(abbrev-mode 1) ; turn on abbrev mode

Select the code above, press 【Alt+x】, and call eval-region.

Now type alpha followed by a space. It will become “α ”.

See: Using Emacs Abbrev Mode for Abbreviation.

xmsi Math Symbols Input Mode

If you type math symbols often, buy my Emacs Unicode Math Symbols Input Mode (xmsi-mode).

Typing Chinese or Non-Latin Languages

How to type Chinese?

Call set-input-method 【Ctrl+x Enter ↵ Ctrl+\】, then give the value chinese-py. (“chinese-py” is a basic Chinese pinyin input method.)

Then, you can start typing Chinese. For details, see: Emacs Chinese Input for Studying Chinese.

To switch back, call toggle-input-method 【Ctrl+\】.

How to find out what's the current input method?

Call describe-variable 【F1 v】, then type current-input-method.
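If you check this often, a small command can show the value in the echo area. This is a sketch only. The command name my-show-input-method is made up for illustration and is not part of emacs.

```elisp
;; Hypothetical helper: the name my-show-input-method is not built in.
(defun my-show-input-method ()
  "Show the current input method in the echo area."
  (interactive)
  ;; current-input-method is nil when no input method is active.
  (message "Current input method: %s"
           (or current-input-method "none")))
```

After evaluating it, press 【Alt+x】 and call my-show-input-method.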

Finding Info About a Character

I have this character α on the screen. How to find out its Unicode hex value or name?

You can find out a char's info by placing your cursor on the character, then calling describe-char.

Following is the output of describe-char on char “α” in Emacs 24:

position: 9878 of 14133 (70%), column: 82
            character: α (displayed as α) (codepoint 945, #o1661, #x3b1)
    preferred charset: unicode-bmp (Unicode Basic Multilingual Plane (U+0000..U+FFFF))
code point in charset: 0x03B1
               syntax: w        which means: word
             category: .:Base, G:2-byte Greek, L:Left-to-right (strong), c:Chinese, g:Greek, h:Korean, j:Japanese
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #xCE #xB1
            file code: #xCE #xB1 (encoded by coding system utf-8-unix)
              display: by this font (glyph code)
    xft:-unknown-DejaVu Sans Mono-normal-normal-normal-*-13-*-*-*-m-0-iso10646-1 (#x2F7)

Character code properties: customize what to show
  name: GREEK SMALL LETTER ALPHA
  general-category: Ll (Letter, Lowercase)
  decomposition: (945) ('α')

There are text properties here:
  face                 xhm-curly“”-quoted-text-face
  fontified            t
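When you only need the code point and the name, the full describe-char output can be more than necessary. The sketch below shows just those two fields in the echo area. The command name my-char-summary is made up for illustration, and the name lookup depends on the Unicode data that ships with your emacs version.

```elisp
;; Hypothetical helper: the name my-char-summary is not built in.
(defun my-char-summary ()
  "Show the code point and Unicode name of the character at point."
  (interactive)
  (let ((ch (char-after)))           ; nil at the end of the buffer
    (if ch
        (message "%c  U+%04X  %s"
                 ch ch
                 ;; The name is nil if emacs has no data for this char.
                 (or (get-char-code-property ch 'name) "(no name data)"))
      (message "No character at point"))))
```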

Unicode version 6 (released 2010-10) added about 1k more symbols. Emacs 23.2 does not have info on these new symbols (for example, 😸 GRINNING CAT FACE WITH SMILING EYES). 〔☛ Unicode 6 Emoticons〕

You can get this info by downloading the Unicode data file and telling emacs where it is. Download it from http://www.unicode.org/Public/UNIDATA/UnicodeData.txt, then place the following code in your “.emacs”.

;; set Unicode data file location. (used by what-cursor-position and describe-char)
(let ((x "~/emacs.d/UnicodeData.txt"))
  (when (file-exists-p x)
    (setq describe-char-unicodedata-file x)))

Select the above code, then call eval-region. Then, you will have full Unicode char info when calling describe-char.

See also: xub Unicode Browser mode for Emacs.
