2.3.3.2 General Escape Syntax

In addition to the specific escape sequences for important control characters, Emacs provides several types of escape syntax that you can use to specify non-ASCII text characters.

You can specify characters by their Unicode values. ‘?\unnnn’ represents the character whose Unicode code point is ‘U+nnnn’, where nnnn is exactly four hexadecimal digits (by convention, Unicode code points are given in hexadecimal). There is a slightly different syntax for specifying characters with code points higher than ‘U+ffff’: ‘?\U00nnnnnn’ represents the character whose code point is ‘U+nnnnnn’. The Unicode Standard only defines code points up to ‘U+10ffff’, so if you specify a code point higher than that, Emacs signals an error.
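As a small illustration of the two Unicode forms, the following sketch evaluates three character literals. The particular characters are chosen only as examples, and a Unicode-based Emacs (version 23 or later) is assumed.

```elisp
;; Each character literal evaluates to its code point, an integer.
(list ?\u00e0        ; U+00E0, a with grave accent (four hex digits)
      ?\u03bb        ; U+03BB, Greek small letter lambda
      ?\U0001F600)   ; U+1F600 is above U+FFFF, so it needs \U and eight hex digits
;; ⇒ (224 955 128512)
```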

This peculiar and inconvenient syntax was adopted for compatibility with other programming languages. Unlike some other languages, Emacs Lisp supports this syntax only in character literals and strings.
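The same escapes can appear inside string literals. The sketch below uses arbitrary example strings to show a string written with ‘\u’ comparing equal to one built from the corresponding characters; a Unicode-based Emacs is assumed.

```elisp
;; \u and \U are recognized inside a string literal as well.
(string= "caf\u00e9" (string ?c ?a ?f ?\u00e9))   ; ⇒ t
;; A \U escape contributes a single character to the string.
(length "\U0001F600")                             ; ⇒ 1
```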

The most general read syntax for a character represents the character code in either octal or hex. To use octal, write a question mark followed by a backslash and the octal character code (up to three octal digits); thus, ‘?\101’ for the character A, ‘?\001’ for the character C-a, and ‘?\002’ for the character C-b. Although this syntax can represent any ASCII character, it is preferred only when the precise octal value is more important than the ASCII representation.

     ?\012 ⇒ 10         ?\n ⇒ 10         ?\C-j ⇒ 10
     ?\101 ⇒ 65         ?A ⇒ 65

To use hex, write a question mark followed by a backslash, ‘x’, and the hexadecimal character code. You can use any number of hex digits, so you can represent any character code in this way. Thus, ‘?\x41’ for the character A, ‘?\x1’ for the character C-a, and ‘?\xe0’ for the Latin-1 character ‘à’ (a with grave accent).
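To make the variable-length hex form concrete, here is a brief sketch in the style of the octal examples above. The characters beyond A and C-a are arbitrary choices for illustration, and a Unicode-based Emacs is assumed.

```elisp
;; Hex escapes accept one or more digits; the literal ends at the
;; first character that is not a hex digit.
(list ?\x41 ?\x1 ?\x3bb ?\x1F600)   ; ⇒ (65 1 955 128512)
;; The hex, octal, and plain forms of A all denote the same integer.
(= ?\x41 ?\101 ?A)                  ; ⇒ t
```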
