In addition to the specific escape sequences for special important control characters, Emacs provides several types of escape syntax that you can use to specify non-ASCII text characters.
You can specify characters by their Unicode values.
?\unnnn represents a character that maps to the Unicode
code point ‘U+nnnn’ (by convention, Unicode code points are
given in hexadecimal). There is a slightly different syntax for
specifying characters with code points higher than ‘U+ffff’:
?\U00nnnnnn represents the character
whose code point is ‘U+nnnnnn’. The Unicode Standard only
defines code points up to ‘U+10ffff’, so if you specify a
code point higher than that, Emacs signals an error.
This peculiar and inconvenient syntax was adopted for compatibility with other programming languages. Unlike some other languages, Emacs Lisp supports this syntax only in character literals and strings.
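As an illustration of the two Unicode escapes described above, the following sketch evaluates one character written with each form and then uses the four-digit form inside a string. It assumes an Emacs in which character codes are Unicode code points (Emacs 23 or later); the particular characters chosen are only examples.

```elisp
;; Assumption: Emacs 23 or later, where a character's code is its
;; Unicode code point.
(list ?\u00e0                              ; U+00E0, four hex digits
      ?\U0001F600                          ; U+1F600, eight hex digits
      (string= "\u00e0" (string ?\u00e0))) ; the same escape in a string
;; ⇒ (224 128512 t)
```

Note that `\u` always takes exactly four hexadecimal digits and `\U` exactly eight, so shorter code points must be padded with leading zeros.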
The most general read syntax for a character represents the
character code in either octal or hex. To use octal, write a question
mark followed by a backslash and the octal character code (up to three
octal digits); thus, ‘?\101’ for the character A,
‘?\001’ for the character C-a, and ‘?\002’ for the
character C-b. Although this syntax can represent any
ASCII character, it is preferred only when the precise octal
value is more important than the ASCII representation.
     ?\012 ⇒ 10         ?\n ⇒ 10         ?\C-j ⇒ 10
     ?\101 ⇒ 65         ?A ⇒ 65
To use hex, write a question mark followed by a backslash, ‘x’,
and the hexadecimal character code. You can use any number of hex
digits, so you can represent any character code in this way.
Thus, ‘?\x41’ for the character A, ‘?\x1’ for the
character C-a, and ‘?\xe0’ for the Latin-1 character
‘a’ with grave accent (à, U+00E0).
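The hex syntax can be checked in the same way as the octal examples earlier in this section. The sketch below is illustrative only; it assumes Emacs 23 or later for the last element, whose code lies beyond the range of four hex digits.

```elisp
;; Hex escapes accept any number of digits, including leading zeros.
(list ?\x41                       ; the character A
      ?\x0041                     ; leading zeros do not change the value
      ?\x1                        ; the character C-a
      (and (= ?\x41 ?\101)        ; hex, octal, and plain syntax
           (= ?\101 ?A))          ; all read as the same integer
      ?\x1F600)                   ; same character as ?\U0001F600
;; ⇒ (65 65 1 t 128512)
```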