Emacs: Unicode Tutorial

This page is a tutorial on using Unicode in emacs, ⁖ how to type math symbols, how to switch input methods, how to find a character's Unicode name or code point, how to set file encoding, and more.

[Screenshot: Carbon emacs 22 window showing Unicode chars. You can download this text here: unicode.txt (not exactly the same).]

This page covers emacs 23 or later. You should use emacs 23 (released in 2009) or later, because emacs 23 has a major update of its character engine. 〔➤ Emacs 23.1 New Features (released 2009-07)〕

How to set default file encoding?

Put this in your emacs init:

(set-language-environment "UTF-8")
(set-default-coding-systems 'utf-8)

UTF-8 is becoming the standard for file encoding. I recommend it highly.
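If you also want emacs to guess UTF-8 first when opening files, or to convert one buffer to UTF-8, something like the following may help. This is a minimal sketch, not the only way to do it. The command name my-save-as-utf-8 is made up for this example.

```elisp
;; Sketch: prefer UTF-8 when emacs has to guess a file's encoding.
(prefer-coding-system 'utf-8)

;; Hypothetical helper (the name is invented for this example):
;; mark the current buffer to be saved as UTF-8 with Unix line endings.
(defun my-save-as-utf-8 ()
  "Set the current buffer's file encoding to utf-8-unix."
  (interactive)
  (set-buffer-file-coding-system 'utf-8-unix))
```

After calling my-save-as-utf-8, save the buffer and the file is written in the new encoding.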

Typing Unicode Characters

How to type this character é ?

Here's a table on how to type these chars:

Character   Key Press
é           Ctrl+x 8 ' e
à           Ctrl+x 8 ` a
î           Ctrl+x 8 ^ i
ñ           Ctrl+x 8 ~ n
ü           Ctrl+x 8 " u

To see all characters you can type this way, press 【Ctrl+x 8 Ctrl+h】. Example: ¿ ¡ ¢ £ ¥ ¤ § ¶ ® © ª «» × ÷ ¬ ° ± µ ÀÁÂÃÄÅÆ Ç ÈÉÊË ÌÍÎÏ ÐÑ ÒÓÔÕÖ ØÙÚÛÜÝÞß àáâãäåæç èéêë ìíîï ðñòóôõö øùúûüýþÿ.

If you need to type these chars often, call set-input-method and give “latin-9-prefix”. That will allow you to type these chars without typing 【Ctrl+x 8】 first.

(Emacs's “latin-9-prefix” corresponds to the char set ISO 8859-15, also known as Latin-9.)
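If you want that input method ready every time, you can set it in your emacs init. The following is a sketch, assuming you want “latin-9-prefix” as your default and want it switched on in text buffers. Drop the hook if you prefer to toggle it by hand.

```elisp
;; Make Ctrl+\ (toggle-input-method) pick latin-9-prefix without asking.
(setq default-input-method "latin-9-prefix")

;; Optional: turn it on automatically in text-mode buffers.
(add-hook 'text-mode-hook
          (lambda () (set-input-method "latin-9-prefix")))
```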

An even better way is to install Emacs: xah-math-input.el. With that, to type é, just type e' followed by an activation key.

How to insert a Unicode character by name?

Call insert-char 【Ctrl+x 8 Enter ↵】, then type the name of the Unicode character. For example, try inserting →. Its name is “RIGHTWARDS ARROW”.

You can use asterisk * to match chars. For example, call insert-char, then type *arrow then Tab ↹, then emacs will show all chars with “arrow” in their names.

Note: before emacs 24.3, you need to call ucs-insert instead of insert-char. ucs-insert is obsolete since 24.3.
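The same lookup by name can be done from emacs lisp. Here is a minimal sketch, assuming emacs 26 or later (where char-from-name is available). The function name my-insert-char-by-name is made up for this example.

```elisp
;; Assumes emacs 26 or later, where char-from-name exists.
;; The function name is hypothetical, chosen for this example.
(defun my-insert-char-by-name (name)
  "Insert the Unicode character called NAME, e.g. \"RIGHTWARDS ARROW\"."
  (let ((char (char-from-name name)))
    (if char
        (insert char)
      (error "No Unicode character named %s" name))))

;; Usage, for example with eval-expression:
;; (my-insert-char-by-name "RIGHTWARDS ARROW")
```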

How to insert a Unicode character by its hexadecimal value?

Call insert-char 【Ctrl+x 8 Enter ↵】, then type the hex value of the Unicode character. For example, try inserting →. Its hex value is “2192”.

How to insert a Unicode character by its decimal value?

Call eval-expression, then type (insert-char 8594) for →.

A more convenient way is to install Emacs: xah-math-input.el.
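The hex and decimal values above are the same number written two ways. If you need to convert between them, emacs lisp can do it. A short sketch you can try with eval-expression, one line at a time:

```elisp
;; #x2192 (hex) and 8594 (decimal) are the same integer in emacs lisp.
(= #x2192 8594)               ; t
(string-to-number "2192" 16)  ; 8594
(format "%X" 8594)            ; "2192"
(format "%c" #x2192)          ; "→"
```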

How to open a Unicode character palette?

You can put frequently used Unicode chars into a file and save it, and define a keystroke to open this file, so that you can copy & paste the chars you want. Here's how you can define a keystroke to open a file. Put the following in your emacs init file.

; open my Unicode template with F8 key
(global-set-key (kbd "<f8>")
  (lambda () (interactive) (find-file "~/emacs.d/my_unicode_template.txt")))

Or, you can define an abbrev, such as “sym”, so that it expands into a list of Unicode characters. 〔➤ Using Emacs Abbrev Mode for Abbreviation〕

Here's an example of a template: unicode.txt.

How to set a keystroke to insert a Unicode char?

For example, put the following code in your emacs init file.

(global-set-key (kbd "<f9> a") (lambda () (interactive) (insert "α"))) ; F9 followed by a
(global-set-key (kbd "<f9> b") (lambda () (interactive) (insert "β")))

Alternatively, you can use key-translation-map. For detail, see: Emacs: Remapping Keys Using key-translation-map. For OS-wide, see: How to Create a APL or Math Symbols Keyboard Layout.
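Here is roughly what the key-translation-map approach looks like. This is a sketch only. The keys (F9 followed by g or l) are arbitrary choices for this example, and it assumes F9 is not already bound to a command of its own.

```elisp
;; Sketch only: the key choices here are arbitrary examples.
;; key-translation-map rewrites the key sequence as it is read, so the
;; result behaves as if you had typed the character itself.
(define-key key-translation-map (kbd "<f9> g") (kbd "γ"))
(define-key key-translation-map (kbd "<f9> l") (kbd "λ"))
```

One difference from global-set-key with insert: because the key is translated before any command runs, the character also goes through in places such as isearch.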

How to use abbrev to input Unicode chars?

Put the following in your emacs init file:

(define-abbrev-table 'global-abbrev-table '(
    ("alpha" "α")
    ("beta" "β")
    ("gamma" "γ")
    ("theta" "θ")
    ("inf" "∞")

    ("ar1" "→")
    ("ar2" "⇒")
    ))

(abbrev-mode 1) ; turn on abbrev mode

Select the code above and call eval-region 【Alt+x】.

Now, type alpha followed by a space; it will become “α ”.

See: Using Emacs Abbrev Mode for Abbreviation.
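Two small additions you may want in your init file. This is a sketch, and the abbrev “lambda” is just an example. Note that abbrev-mode is a per-buffer minor mode, so setting its default is one way to get it in every buffer.

```elisp
;; abbrev-mode is a per-buffer minor mode, so (abbrev-mode 1) only affects
;; the buffer that is current when it runs. This makes it the default
;; for all buffers.
(setq-default abbrev-mode t)

;; Add a single abbrev to the global table.
;; The abbrev "lambda" is just an example.
(define-abbrev global-abbrev-table "lambda" "λ")
```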

xmsi Math Symbols Input Mode

If you type math symbols often, use Emacs: xah-math-input.el.

Typing Chinese or Non-Latin Languages

How to type Chinese?

Call set-input-method 【Ctrl+x Enter ↵ Ctrl+\】, then give value chinese-py. (“chinese-py” is a basic Chinese pinyin input method.)

For detail, see: Emacs Chinese Input for Studying Chinese.

To switch back, call toggle-input-method 【Ctrl+\】.

How to find out what's the current input method?

Call describe-variable 【F1 v】 then type current-input-method.
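If you check or switch input methods often, a small command and a key can save typing. The following is a sketch. The command name my-show-input-method and the F7 key are made up for this example.

```elisp
;; Hypothetical command name; shows the active input method in the echo area.
(defun my-show-input-method ()
  "Report the input method active in the current buffer, if any."
  (interactive)
  (message "Input method: %s" (or current-input-method "none")))

;; Arbitrary key choice: F7 switches the current buffer to chinese-py.
(global-set-key (kbd "<f7>")
  (lambda () (interactive) (set-input-method "chinese-py")))
```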

Finding Info About a Character

I have this character α on the screen. How to find out its Unicode's hex value or name?

You can find out a char's info by placing your cursor on the character, then calling describe-char.

Following is the output of describe-char on char “α” in Emacs 24:

position: 9878 of 14133 (70%), column: 82
            character: α (displayed as α) (codepoint 945, #o1661, #x3b1)
    preferred charset: unicode-bmp (Unicode Basic Multilingual Plane (U+0000..U+FFFF))
code point in charset: 0x03B1
               syntax: w        which means: word
             category: .:Base, G:2-byte Greek, L:Left-to-right (strong), c:Chinese, g:Greek, h:Korean, j:Japanese
             to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
          buffer code: #xCE #xB1
            file code: #xCE #xB1 (encoded by coding system utf-8-unix)
              display: by this font (glyph code)
    xft:-unknown-DejaVu Sans Mono-normal-normal-normal-*-13-*-*-*-m-0-iso10646-1 (#x2F7)

Character code properties: customize what to show
  general-category: Ll (Letter, Lowercase)
  decomposition: (945) ('α')

There are text properties here:
  face                 xah-html-curly“”-quoted-text-face
  fontified            t
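If you only want the code point and name, without the full report, a short command can print them in the echo area. This is a minimal sketch using get-char-code-property. The command name my-char-info-at-point is made up for this example.

```elisp
;; Hypothetical command name, chosen for this example.
(defun my-char-info-at-point ()
  "Show the code point and Unicode name of the character after point."
  (interactive)
  (let ((char (char-after)))
    (if char
        (message "%c  U+%04X  %s"
                 char char
                 (or (get-char-code-property char 'name) "(no name found)"))
      (message "No character at point"))))
```

With the cursor on α, this should show something like “α  U+03B1  GREEK SMALL LETTER ALPHA”.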

Unicode version 6 (released in 2010) added about 1k more symbols. Emacs 23.2 does not have info on these new symbols (⁖ 😸 GRINNING CAT FACE WITH SMILING EYES). 〔➤ Unicode 6 Emoticons〕

You can get this info by downloading the Unicode data file and telling emacs where it is. Download it at: http://www.unicode.org/Public/UNIDATA/UnicodeData.txt, save it where the code below expects it (“~/emacs.d/UnicodeData.txt”, or change the path to match), then place the following code in your “.emacs”.

;; set Unicode data file location. (used by what-cursor-position and describe-char)
(let ((x "~/emacs.d/UnicodeData.txt"))
  (when (file-exists-p x)
    (setq describe-char-unicodedata-file x)))

Select the above code, then call eval-region. Then, you will have full Unicode char info when calling describe-char.

See also: xub Unicode Browser mode for Emacs.

How to get emacs to display missing emoticon?

See: Emacs: How to List & Set Font.

Emacs File/Character Encoding/Decoding FAQ

Emacs Lisp: Unicode Representation in String

More About Unicode

Unicode Characters Search ☢ ☯ ☭ ∑ ∞ ♀ ♂ ♥
