Emacs: Convert Full-Width/Half-Width Punctuations

By Xah Lee.

This page shows commands to convert to/from Full-Width/Half-Width characters. (全角 半角 转换)

If you type Chinese or Japanese mixed with English, you often end up with mixed Asian and Western punctuation, which is tedious to fix by hand.

[see Unicode Full-Width Characters]

Convert English Chinese Punctuation

(defun xah-convert-english-chinese-punctuation (@begin @end &optional @to-direction)
  "Convert punctuation from/to English/Chinese characters.

When called interactively, do current line or selection. The conversion direction is automatically determined.

If `universal-argument' is called, ask user for change direction.

When called in lisp code, @begin @end are region begin/end positions. @to-direction must be any of the following values: 「\"chinese\"」, 「\"english\"」, 「\"auto\"」.

See also: `xah-remove-punctuation-trailing-redundant-space'.

URL `http://ergoemacs.org/emacs/elisp_convert_chinese_punctuation.html'
Version 2015-10-05"
  (interactive
   (let ($p1 $p2)
     (if (use-region-p)
         (progn
           (setq $p1 (region-beginning))
           (setq $p2 (region-end)))
       (progn
         (setq $p1 (line-beginning-position))
         (setq $p2 (line-end-position))))
     (list
      $p1
      $p2
      (if current-prefix-arg
          (ido-completing-read
           "Change to: "
           '( "english"  "chinese")
           "PREDICATE"
           "REQUIRE-MATCH")
        "auto"
        ))))
  (let (
        ($input-str (buffer-substring-no-properties @begin @end))
        ($replacePairs
         [
          [". " "。"]
          [".\n" "。\n"]
          [", " ","]
          [",\n" ",\n"]
          [": " ":"]
          ["; " ";"]
          ["? " "?"] ; no space after
          ["! " "!"]

          ;; for inside HTML
          [".</" "。</"]
          ["?</" "?</"]
          [":</" ":</"]
          [" " "\u3000"] ; ASCII space to ideographic space (U+3000)
          ]
         ))

    (when (string= @to-direction "auto")
      (setq
       @to-direction
       (if
           (or
            (string-match "\u3000" $input-str) ; ideographic space
            (string-match "。" $input-str)
            (string-match "," $input-str)
            (string-match "?" $input-str)
            (string-match "!" $input-str))
           "english"
         "chinese")))
    (save-excursion
      (save-restriction
        (narrow-to-region @begin @end)
        (mapc
         (lambda ($x)
           (progn
             (goto-char (point-min))
             (while (search-forward (aref $x 0) nil "noerror")
               (replace-match (aref $x 1)))))
         (cond
          ((string= @to-direction "chinese") $replacePairs)
          ((string= @to-direction "english") (mapcar (lambda (x) (vector (elt x 1) (elt x 0))) $replacePairs))
          (t (user-error "Your 3rd argument 「%s」 isn't valid" @to-direction))))))))
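
The automatic direction guess can be looked at in isolation. The following is a minimal sketch, not the command's own code: the name my-guess-punctuation-direction is hypothetical, and it assumes that any full-width punctuation mark or ideographic space in the text means the text should be converted to English.

```elisp
(defun my-guess-punctuation-direction (str)
  "Guess the conversion direction for STR.
Return \"english\" if STR contains full-width punctuation or an
ideographic space, else \"chinese\".
Hypothetical helper, for illustration only."
  ;; One character class covers: U+3000, 。 , ? !
  (if (string-match-p "[\u3000。,?!]" str)
      "english"
    "chinese"))

(my-guess-punctuation-direction "你好,世界。")    ; ⇒ "english"
(my-guess-punctuation-direction "Hello, world. ") ; ⇒ "chinese"
```

A single character class does the same job as several separate string-match calls, and string-match-p leaves the match data untouched.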

Remove Punctuation Trailing Redundant Spaces

Here's a helpful command to remove redundant spaces after punctuation.

(defun xah-remove-punctuation-trailing-redundant-space (@begin @end)
  "Remove redundant whitespace after punctuation.
Works on current line or text selection.

When called in emacs lisp code, @begin @end are the region's begin/end positions.

See also `xah-convert-english-chinese-punctuation'.

URL `http://ergoemacs.org/emacs/elisp_convert_chinese_punctuation.html'
version 2015-08-22"
  (interactive
   (if (use-region-p)
       (list (region-beginning) (region-end))
     (list (line-beginning-position) (line-end-position))))
  (require 'xah-replace-pairs)
  (xah-replace-regexp-pairs-region
   @begin @end
   [
    ;; clean up. Remove extra space.
    [" +," ","]
    [",  +" ", "]
    ["?  +" "? "]
    ["!  +" "! "]
    ["\\.  +" ". "]

    ;; fullwidth punctuations
    [", +" ","]
    ["。 +" "。"]
    [": +" ":"]
    ["? +" "?"]
    ["; +" ";"]
    ["! +" "!"]
    ["、 +" "、"]
    ]
   "FIXEDCASE" "LITERAL"))

These commands are also useful on Twitter, for saving a few characters against its character limit. English punctuation takes 2 characters (the mark plus the space after it), while the Chinese version needs just one, because the spacing is built into the full-width punctuation symbol.
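
The whitespace cleanup can also be tried on a plain string, without the xah-replace-pairs package. This is a minimal sketch under assumptions: the name my-squeeze-space-after-punctuation is hypothetical, it uses the built-in replace-regexp-in-string instead of xah-replace-regexp-pairs-region, and it covers only part of the pair list above.

```elisp
(defun my-squeeze-space-after-punctuation (str)
  "Return STR with redundant spaces after punctuation removed.
Hypothetical helper, for illustration only."
  (let ((pairs
         '(;; ASCII punctuation: 2 or more spaces become exactly 1
           ("\\([,?!.]\\)  +" . "\\1 ")
           ;; full-width punctuation: drop all following spaces
           ("\\([,。:?;!、]\\) +" . "\\1"))))
    (dolist (pair pairs str)
      ;; 4th arg t is FIXEDCASE: do not alter letter case
      (setq str (replace-regexp-in-string (car pair) (cdr pair) str t)))))

(my-squeeze-space-after-punctuation "a,   b")       ; ⇒ "a, b"
(my-squeeze-space-after-punctuation "你好,  世界") ; ⇒ "你好,世界"

;; The character saving mentioned above:
(length "你好, 世界")  ; ⇒ 6
(length "你好,世界")  ; ⇒ 5
```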

Convert Half-Width Full-Width Characters

This command converts English letters, digits, and punctuation between half-width and full-width forms.

[see Unicode Full-Width Characters]
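
Most full-width forms sit at a fixed distance from their ASCII counterparts: U+FF01 through U+FF5E mirror U+0021 through U+007E, offset by #xFEE0, and the full-width space is the ideographic space U+3000. Here is a minimal sketch of the conversion using that arithmetic. It is a different technique from the explicit lookup table used by the command on this page, it does not cover the extra symbols such as ¢ £ ¥, and the names my-ascii-to-fullwidth and my-fullwidth-to-ascii are hypothetical.

```elisp
(defun my-ascii-to-fullwidth (str)
  "Return STR with printable ASCII chars replaced by full-width forms.
Hypothetical helper, for illustration only."
  (apply #'string
         (mapcar
          (lambda (ch)
            (cond
             ((= ch ?\s) #x3000)              ; space to ideographic space
             ((and (>= ch #x21) (<= ch #x7E)) ; ! through ~
              (+ ch #xFEE0))
             (t ch)))                         ; leave everything else alone
          str)))

(defun my-fullwidth-to-ascii (str)
  "Return STR with full-width forms replaced by ASCII chars.
Hypothetical helper, for illustration only."
  (apply #'string
         (mapcar
          (lambda (ch)
            (cond
             ((= ch #x3000) ?\s)
             ((and (>= ch #xFF01) (<= ch #xFF5E))
              (- ch #xFEE0))
             (t ch)))
          str)))

(my-ascii-to-fullwidth "Abc 123!")          ; ⇒ "Abc\u3000123!"
(my-fullwidth-to-ascii "Abc\u3000123!") ; ⇒ "Abc 123!"
```

The table approach is longer but lets you choose exactly which characters to convert. The arithmetic version converts the whole printable ASCII range at once.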

(defun xah-convert-fullwidth-chars (@begin @end &optional @to-direction)
  "Convert ASCII chars to/from Unicode fullwidth version.
Works on current line or text selection.

The conversion direction is determined like this: if the command has been repeated, then toggle. Else, always do to-Unicode direction.

If `universal-argument' is called first:

 no C-u → Automatic.
 C-u → to ASCII
 C-u 1 → to ASCII
 C-u 2 → to Unicode

When called in lisp code, @begin @end are region begin/end positions. @to-direction must be any of the following values: 「\"unicode\"」, 「\"ascii\"」, 「\"auto\"」.

See also: `xah-remove-punctuation-trailing-redundant-space'.

URL `http://ergoemacs.org/emacs/elisp_convert_chinese_punctuation.html'
Version 2018-08-02"
  (interactive
   (let ($p1 $p2)
     (if (use-region-p)
         (progn
           (setq $p1 (region-beginning))
           (setq $p2 (region-end)))
       (progn
         (setq $p1 (line-beginning-position))
         (setq $p2 (line-end-position))))
     (list $p1 $p2
           (cond
            ((equal current-prefix-arg nil) "auto")
            ((equal current-prefix-arg '(4)) "ascii")
            ((equal current-prefix-arg 1) "ascii")
            ((equal current-prefix-arg 2) "unicode")
            (t "unicode")))))
  (require 'xah-replace-pairs) ; provides xah-replace-pairs-region
  (let* (
         ($ascii-unicode-map
          [
           ["0" "0"] ["1" "1"] ["2" "2"] ["3" "3"] ["4" "4"] ["5" "5"] ["6" "6"] ["7" "7"] ["8" "8"] ["9" "9"]
           ["A" "A"] ["B" "B"] ["C" "C"] ["D" "D"] ["E" "E"] ["F" "F"] ["G" "G"] ["H" "H"] ["I" "I"] ["J" "J"] ["K" "K"] ["L" "L"] ["M" "M"] ["N" "N"] ["O" "O"] ["P" "P"] ["Q" "Q"] ["R" "R"] ["S" "S"] ["T" "T"] ["U" "U"] ["V" "V"] ["W" "W"] ["X" "X"] ["Y" "Y"] ["Z" "Z"]
           ["a" "a"] ["b" "b"] ["c" "c"] ["d" "d"] ["e" "e"] ["f" "f"] ["g" "g"] ["h" "h"] ["i" "i"] ["j" "j"] ["k" "k"] ["l" "l"] ["m" "m"] ["n" "n"] ["o" "o"] ["p" "p"] ["q" "q"] ["r" "r"] ["s" "s"] ["t" "t"] ["u" "u"] ["v" "v"] ["w" "w"] ["x" "x"] ["y" "y"] ["z" "z"]
           ["," ","] ["." "."] [":" ":"] [";" ";"] ["!" "!"] ["?" "?"] ["\"" """] ["'" "'"] ["`" "`"] ["^" "^"] ["~" "~"] ["¯" " ̄"] ["_" "_"]
           [" " "\u3000"] ; ASCII space to ideographic space (U+3000)
           ["&" "&"] ["@" "@"] ["#" "#"] ["%" "%"] ["+" "+"] ["-" "-"] ["*" "*"] ["=" "="] ["<" "<"] [">" ">"] ["(" "("] [")" ")"] ["[" "["] ["]" "]"] ["{" "{"] ["}" "}"] ["(" "⦅"] [")" "⦆"] ["|" "|"] ["¦" "¦"] ["/" "/"] ["\\" "\"] ["¬" "¬"] ["$" "$"] ["£" "£"] ["¢" "¢"] ["₩" "₩"] ["¥" "¥"]
           ]
          )
         ($reverse-map
          (mapcar
           (lambda (x) (vector (elt x 1) (elt x 0)))
           $ascii-unicode-map))

         ($stateBefore
          (if (get 'xah-convert-fullwidth-chars 'state)
              (get 'xah-convert-fullwidth-chars 'state)
            (progn
              (put 'xah-convert-fullwidth-chars 'state 0)
              0
              )))
         ($stateAfter (if (eq $stateBefore 0) 1 0 )))

  ;"0\\|1\\|2\\|3\\|4\\|5\\|6\\|7\\|8\\|9\\|A\\|B\\|C\\|D\\|E\\|F\\|G\\|H\\|I\\|J\\|K\\|L\\|M\\|N\\|O\\|P\\|Q\\|R\\|S\\|T\\|U\\|V\\|W\\|X\\|Y\\|Z\\|a\\|b\\|c\\|d\\|e\\|f\\|g\\|h\\|i\\|j\\|k\\|l\\|m\\|n\\|o\\|p\\|q\\|r\\|s\\|t\\|u\\|v\\|w\\|x\\|y\\|z"

    ;; (message "before %s" $stateBefore)
    ;; (message "after %s" $stateAfter)
    ;; (message "@to-direction %s" @to-direction)
    ;; (message "real-this-command  %s" real-this-command)
    ;; (message "real-last-command %s" real-last-command)
    ;; (message "this-command  %s" this-command)
    ;; (message "last-command %s" last-command)

    (let ((case-fold-search nil))
      (xah-replace-pairs-region
       @begin @end
       (cond
        ((string= @to-direction "unicode") $ascii-unicode-map)
        ((string= @to-direction "ascii") $reverse-map)
        ((string= @to-direction "auto")
         (if (eq $stateBefore 0)
             $reverse-map
           $ascii-unicode-map )

         ;; 2018-08-02 this doesn't work when using smex
         ;; (if (eq last-command this-command)
         ;;     (progn
         ;;       (message "%s" "repeated")
         ;;       (if (eq $stateBefore 0)
         ;;           $reverse-map
         ;;         $ascii-unicode-map ))
         ;;   (progn
         ;;     (message "%s" "not repeated")
         ;;     $ascii-unicode-map))

         ;;

         )
        (t (user-error "Your 3rd argument 「%s」 isn't valid" @to-direction)))
       t t ))
    (put 'xah-convert-fullwidth-chars 'state $stateAfter)))
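
The toggle relies on a symbol property to remember state between calls. Here is a minimal sketch of that pattern on its own, assuming the same convention as above (state 0 means the next conversion goes to ASCII). The name my-next-direction is hypothetical.

```elisp
(defun my-next-direction ()
  "Return \"ascii\" and \"unicode\" alternately on successive calls.
State is kept in the `state' property of this function's symbol.
Hypothetical helper, for illustration only."
  ;; A missing property reads as nil, so treat that as state 0.
  (let ((state (or (get 'my-next-direction 'state) 0)))
    ;; Flip the stored state for the next call.
    (put 'my-next-direction 'state (if (eq state 0) 1 0))
    (if (eq state 0) "ascii" "unicode")))

(my-next-direction) ; first call in a fresh session ⇒ "ascii"
(my-next-direction) ; ⇒ "unicode"
```

Because the property lives on the symbol, the state survives between invocations without a global variable, and it does not depend on last-command.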
