# Emacs Lisp: Unicode Representation in String

By Xah Lee.

In an Emacs Lisp string, you can include Unicode characters directly (for example, `"I ♥ 😸"`), or you can represent a Unicode char with one of the following escape syntaxes:

• `"\uxxxx"` → A Unicode char. xxxx must be exactly 4 hexadecimal digits, giving the char's codepoint in hex. Pad it with leading 0 if the codepoint has fewer than 4 hex digits.
• `"\U00xxxxxx"` → A Unicode char. xxxxxx must be exactly 6 hexadecimal digits, giving the char's codepoint in hex. Pad it with leading 0 if the codepoint has fewer than 6 hex digits.

Note: the syntax is a bit ugly. Which one to use depends on whether the codepoint fits in 4 hex digits, that is, whether it is at most #xffff. Codepoints above that need the `"\U00xxxxxx"` form. (Each Unicode char is given an integer id, called its “codepoint”. 〔►see Unicode Basics: What's Character Set, Character Encoding, UTF-8?〕)
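To make the codepoint idea concrete, here is a minimal sketch that reads the integer back out of a string written with each escape form. The helper name `my-first-codepoint` is made up for this illustration; it is not part of Emacs.

```elisp
;; Hypothetical helper (not an Emacs built-in): a char in elisp is
;; just an integer, so `aref' on a string gives the codepoint.
(defun my-first-codepoint (str)
  "Return the codepoint (an integer) of the first character in STR."
  (aref str 0))

(my-first-codepoint "\u0061")     ; 97, which is #x61
(my-first-codepoint "\u2665")     ; 9829, which is #x2665
(my-first-codepoint "\U0001f638") ; 128568, which is #x1f638

;; the 10-character escape still stands for a single char
(length "\U0001f638")             ; 1
```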

```
;; examples of Unicode char representation in string

;; lower case “a”
(search-forward "\u0061")

;; ♥ BLACK HEART SUIT codepoint 9829, #x2665
(search-forward "\u2665")

;; 😸 GRINNING CAT FACE WITH SMILING EYES codepoint 128568, #x1f638
(search-forward "\U0001f638")

;; ♥ 😸
```

In the above example, the letter a's codepoint in hex is just “61”, so you need to pad it with “00”, giving `"\u0061"`.

In the above example, the grinning cat 😸's codepoint in hex has 5 digits (1f638). So, you need to use the `"\U00xxxxxx"` form, and because the codepoint has fewer than 6 digits, you need to pad it with one more “0”, so that “000” follows the `\U`.
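The padding rule can be left to `format`. The following is a rough sketch, assuming you have the codepoint as an integer; `my-unicode-escape` is a hypothetical name chosen for this example, not an Emacs function.

```elisp
;; Hypothetical helper (not an Emacs built-in).
;; %04x and %08x zero-pad the hex digits to the required width.
(defun my-unicode-escape (codepoint)
  "Return the escape sequence text for CODEPOINT.
Use the 4-digit form when CODEPOINT is at most #xffff,
else the 8-digit form."
  (if (<= codepoint #xffff)
      (format "\\u%04x" codepoint)
    (format "\\U%08x" codepoint)))

(my-unicode-escape #x61)    ; the 6 chars \u0061
(my-unicode-escape #x2665)  ; the 6 chars \u2665
(my-unicode-escape #x1f638) ; the 10 chars \U0001f638
```

The result is the escape as plain text, the way you would type it in source code. Wrapping it in double quotes and passing it to `read` turns it back into the one-character string.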

Note: you can find a Unicode char's codepoint by placing the cursor on it and calling `describe-char`. 〔►see Emacs: Unicode Tutorial〕
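If you only want the number, without the full `describe-char` report, a small command can do it. This is a sketch, not a replacement for `describe-char`; the name `my-codepoint-at-point` is made up for illustration.

```elisp
;; Hypothetical command (not an Emacs built-in).
;; `char-after' returns the char after the cursor as an integer,
;; or nil at the end of the buffer.
(defun my-codepoint-at-point ()
  "Show the codepoint of the char after cursor, in decimal and hex.
Return the text shown, or nil at end of buffer."
  (interactive)
  (let ((ch (char-after)))
    (when ch
      (message "%d #x%x" ch ch))))
```

With the cursor on ♥, calling it shows `9829 #x2665`, ready to be typed as `"\u2665"`.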

## Why is Encoded Unicode Char Useful?

The encoded representation is useful when you want to represent non-printable or invisible chars, such as {RIGHT-TO-LEFT MARK, ZERO WIDTH NO-BREAK SPACE, NO-BREAK SPACE}, which are hard to see or tell apart when typed literally in source code. Example:

```
(defun replace-BOM-mark-etc ()
  "Query replace some invisible Unicode chars.
The chars to be searched are:
 RIGHT-TO-LEFT MARK 8207 x200f
 ZERO WIDTH NO-BREAK SPACE 65279 xfeff

Search starts at cursor position and goes to the end of buffer."
  (interactive)
  (query-replace-regexp "\u200f\\|\ufeff" ""))
```
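The same escapes work in non-interactive code. Below is a minimal sketch of a variant that deletes the two chars without asking, for use in scripts. The function name `my-delete-invisible-chars` is an assumption of this example, and deleting without confirmation is a design choice you may not want in every buffer.

```elisp
;; Hypothetical function (not an Emacs built-in).
;; The escapes are put in a regex character class.
(defun my-delete-invisible-chars ()
  "Delete RIGHT-TO-LEFT MARK and ZERO WIDTH NO-BREAK SPACE.
Work from cursor position to end of buffer, without asking.
Return the number of chars deleted. Cursor does not move."
  (let ((count 0))
    (save-excursion
      (while (re-search-forward "[\u200f\ufeff]" nil t)
        (replace-match "")
        (setq count (1+ count))))
    count))
```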

See also:

(info "(elisp) General Escape Syntax")
