Emacs: Remove Accent Marks

By Xah Lee.

Here's an Emacs command that removes accent marks, that is, converts some Unicode characters into their ASCII equivalents. (aka Zap Gremlins)

For example, “café” becomes “cafe”. Here is the code:

(defun xah-asciify-text (&optional *begin *end)
  "Change European language characters into equivalent ASCII ones, e.g. “café” ⇒ “cafe”.
When called interactively, work on current line or text selection.

URL `http://ergoemacs.org/emacs/emacs_zap_gremlins.html'
Version 2016-07-12"
  (interactive)
  (let ((-charMap
         [
          ["á\\|à\\|â\\|ä\\|ā\\|ǎ\\|ã\\|å\\|ą" "a"]
          ["é\\|è\\|ê\\|ë\\|ē\\|ě\\|ę" "e"]
          ["í\\|ì\\|î\\|ï\\|ī\\|ǐ" "i"]
          ["ó\\|ò\\|ô\\|ö\\|õ\\|ǒ\\|ø\\|ō" "o"]
          ["ú\\|ù\\|û\\|ü\\|ū"     "u"]
          ["Ý\\|ý\\|ÿ"     "y"]
          ["ç\\|č\\|ć" "c"]
          ["ď\\|ð" "d"]
          ["ľ\\|ĺ\\|ł" "l"]
          ["ñ\\|ň\\|ń" "n"]
          ["þ" "th"]
          ["ß" "ss"]
          ["æ" "ae"]
          ["š\\|ś" "s"]
          ["ť" "t"]
          ["ř\\|ŕ" "r"]
          ["ž\\|ź\\|ż" "z"]
          ])
        -begin -end)
    ;; Pick the range: explicit arguments, else the active region, else the current line.
    (if (null *begin)
        (if (use-region-p)
            (progn
              (setq -begin (region-beginning))
              (setq -end (region-end)))
          (progn
            (setq -begin (line-beginning-position))
            (setq -end (line-end-position))))
      (progn
        (setq -begin *begin)
        (setq -end *end)))
    (let ((case-fold-search t))
      (save-restriction
        (narrow-to-region -begin -end)
        ;; Apply each [REGEXP REPLACEMENT] pair over the narrowed text in turn.
        (mapc
         (lambda (-pair)
           (goto-char (point-min))
           (while (re-search-forward (elt -pair 0) (point-max) t)
             (replace-match (elt -pair 1))))
         -charMap)))))

(defun xah-asciify-string (*string)
  "Return a new string. European language chars are changed to ASCII ones, e.g. “café” ⇒ “cafe”.
See `xah-asciify-text'
Version 2015-06-08"
  (with-temp-buffer
    (insert *string)
    (xah-asciify-text (point-min) (point-max))
    (buffer-string)))

〔►see Accent Marks: Trema, Umlaut, Macron, Circumflex, and All That〕

(Thanks to robert_nagy for adding chars.)

Accumulator vs Parallel Programming

This problem makes a good parallel programming exercise. See: Parallel Programing Exercise: asciify-string.
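One way to read the contrast, offered here as an assumption about what the linked exercise has in mind: the command above loops over the replacement pairs and rewrites the buffer on every pass (accumulator style), while a per-character lookup translates each character independently of its neighbors, so the work could in principle be split up. A minimal sketch of the per-character form follows. The names `my-asciify-table` and `my-asciify-string-by-char` are hypothetical, and the table is deliberately abbreviated.

```elisp
;; Hypothetical, abbreviated table: each character maps to its ASCII replacement.
(defvar my-asciify-table
  '((?á . "a") (?é . "e") (?í . "i") (?ó . "o") (?ú . "u")
    (?ñ . "n") (?ç . "c") (?ß . "ss") (?æ . "ae") (?þ . "th"))
  "Abbreviated char-to-ASCII table, for illustration only.")

(defun my-asciify-string-by-char (string)
  "Return a copy of STRING with chars in `my-asciify-table' replaced.
Each character is looked up on its own, with no state shared between
characters.  Illustrative sketch, not the article's implementation."
  (mapconcat (lambda (ch)
               ;; Use the table entry if there is one, else keep the char.
               (or (cdr (assq ch my-asciify-table))
                   (string ch)))
             string
             ""))

;; Usage:
;; (my-asciify-string-by-char "café straße") ; ⇒ "cafe strasse"
```

Unlike the regexp version, this sketch does not fold case, so uppercase accented letters would need their own table entries.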

Alternative Solution with “iconv” or perl

Yuri Khan and Teemu Likonen suggested using the “iconv” shell command. See man iconv. Here's Teemu's code.

(defun asciify-string (string)
"Convert STRING to ASCII string.
For example:
“passé” becomes “passe”"
;; Code originally by Teemu Likonen
  (with-temp-buffer
    (insert string)
    (call-process-region (point-min) (point-max) "iconv" t t nil "--to-code=ASCII//TRANSLIT")
    (buffer-substring-no-properties (point-min) (point-max))))
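Because the result depends on an external program, it may help to guard the call. The following is a sketch of a more defensive variant, not Teemu's code: the name `my-asciify-string-iconv` is hypothetical, it returns nil when no “iconv” executable is found, and it pins the encoding to UTF-8 in both directions on the assumption that the installed iconv accepts the `-f` and `-t` options and the `//TRANSLIT` suffix. The exact transliteration varies between iconv implementations and locales.

```elisp
(defun my-asciify-string-iconv (string)
  "Return STRING transliterated to ASCII by iconv, or nil if iconv is missing.
Hypothetical wrapper for illustration; assumes iconv supports -f, -t
and the //TRANSLIT suffix."
  (when (executable-find "iconv")
    (with-temp-buffer
      (insert string)
      ;; Send and receive UTF-8 regardless of the user's locale settings.
      (let ((coding-system-for-write 'utf-8)
            (coding-system-for-read 'utf-8))
        ;; DELETE is t, so the input text is replaced by iconv's output.
        (call-process-region (point-min) (point-max) "iconv" t t nil
                             "-f" "UTF-8" "-t" "ASCII//TRANSLIT"))
      (buffer-string))))

;; Usage:
;; (my-asciify-string-iconv "passé")
```

A caller can test the return value for nil and fall back to a pure elisp function when iconv is not installed.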

Julian Bradfield suggested Perl. Here's his one-liner; it decomposes each character with NFKD and then deletes the combining marks, which removes the accents.

perl -e 'use encoding utf8; use Unicode::Normalize; while ( <> ) { $_ = NFKD($_); s/\pM//g; print; }'

Source groups.google.com

Still, it would be nice to have a pure elisp solution, because “iconv” is not available on Windows or Mac OS X as of this writing.
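The normalization idea in the Perl one-liner can also be sketched in pure elisp with the `ucs-normalize` library that ships with Emacs. The function name `my-strip-combining-marks` is hypothetical. The sketch decomposes the string with NFKD and then drops every character whose Unicode general category is a mark (Mn, Mc, Me). Characters with no decomposition, such as ø, ß, æ, þ, and ł, pass through unchanged, so this complements the table above rather than replacing it.

```elisp
(require 'ucs-normalize)

(defun my-strip-combining-marks (string)
  "Return STRING decomposed with NFKD and with combining marks removed.
Hypothetical helper for illustration, not part of the original article."
  (let ((decomposed (ucs-normalize-NFKD-string string)))
    (concat
     (delq nil
           (mapcar (lambda (ch)
                     ;; Keep the char unless its general category is a mark.
                     (unless (memq (get-char-code-property ch 'general-category)
                                   '(Mn Mc Me))
                       ch))
                   decomposed)))))

;; Usage:
;; (my-strip-combining-marks "café")  ; ⇒ "cafe"
;; (my-strip-combining-marks "passé") ; ⇒ "passe"
```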

Text Transform Topic

  1. Emacs: Toggle Letter Case
  2. Emacs: Change to Title Case
  3. Emacs: Upcase Sentences
  4. Emacs: Cycle Replace Space Hyphen Underscore
  5. Emacs: Remove Accent Marks
  6. Emacs: Escape Quotes Command
  7. Emacs: Quote Lines
  8. Emacs: Change Brackets and Quotes
  9. Emacs: CSS Compressor
  10. Emacs: Replace Greek Letter Names to Unicode
  11. Emacs: Convert Straight/Curly Quotes
  12. Emacs: Convert English/Chinese Punctuations
  13. Emacs: Lines to HTML Table