Emacs: Convert English/Chinese Punctuations

By Xah Lee.

This page shows Emacs Lisp code to convert punctuation between its English and Chinese forms.

If you type Chinese or Japanese mixed with English, you often end up with mixed Asian and Western punctuation, which is laborious to fix by hand. The code below helps fix it.

The command converts punctuation between the English form and the Asian full-width form. 〔►see Unicode Full-Width Characters〕 Here is the code:

(defun xah-convert-english-chinese-punctuation (*begin *end &optional *to-direction)
  "Convert punctuation from/to English/Chinese characters.

When called interactively, do current line or selection. The conversion direction is automatically determined.

If `universal-argument' is called, ask user for change direction.

When called in lisp code, *begin *end are region begin/end positions. *to-direction must be any of the following values: 「\"chinese\"」, 「\"english\"」, 「\"auto\"」.

See also: `xah-remove-punctuation-trailing-redundant-space'.

URL `http://ergoemacs.org/emacs/elisp_convert_chinese_punctuation.html'
Version 2015-10-05"
  (interactive
   (let (-p1 -p2)
     (if (use-region-p)
         (progn
           (setq -p1 (region-beginning))
           (setq -p2 (region-end)))
       (progn
         (setq -p1 (line-beginning-position))
         (setq -p2 (line-end-position))))
     (list
      -p1
      -p2
      (if current-prefix-arg
          (ido-completing-read
           "Change to: "
           '( "english"  "chinese")
           "PREDICATE"
           "REQUIRE-MATCH")
        "auto"
        ))))
  (let (
        (-input-str (buffer-substring-no-properties *begin *end))
        (-replacePairs
         [
          [". " "。"]
          [".\n" "。\n"]
          [", " ","]
          [",\n" ",\n"]
          [": " ":"]
          ["; " ";"]
          ["? " "?"] ; no space after
          ["! " "!"]

          ;; for inside HTML
          [".</" "。</"]
          ["?</" "?</"]
          [":</" ":</"]
          [" " " "]
          ]
         ))

    (when (string= *to-direction "auto")
      (setq
       *to-direction
       (if
           (or
            (string-match "\u3000" -input-str) ; ideographic (full-width) space
            (string-match "。" -input-str)
            (string-match "," -input-str)
            (string-match "?" -input-str)
            (string-match "!" -input-str))
           "english"
         "chinese")))
    (save-excursion
      (save-restriction
        (narrow-to-region *begin *end)
        (mapc
         (lambda (-x)
           (progn
             (goto-char (point-min))
             (while (search-forward (aref -x 0) nil "noerror")
               (replace-match (aref -x 1)))))
         (cond
          ((string= *to-direction "chinese") -replacePairs)
          ((string= *to-direction "english") (mapcar (lambda (x) (vector (elt x 1) (elt x 0))) -replacePairs))
          (t (user-error "Your 3rd argument 「%s」 isn't valid" *to-direction))))))))
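
To try the idea without touching a buffer, here is a minimal string-based sketch of the same approach: a table of English/Chinese pairs, applied in order, and swapped to go the other way. The names `my-punct-pairs`, `my-punct-guess-direction`, and `my-punct-convert` are hypothetical helpers made up for illustration. They are not part of the command above, and the table is only a subset of it.

```elisp
;; Hypothetical helpers for illustration; not part of the command above.
;; Each entry is (ENGLISH . CHINESE). Only a subset of the full table.
(defvar my-punct-pairs
  '((". " . "。")
    (", " . ",")
    ("? " . "?")
    ("! " . "!"))
  "Sample English/Chinese punctuation pairs.")

(defun my-punct-guess-direction (str)
  "Return \"english\" if STR contains full-width punctuation, else \"chinese\".
The return value names the form STR should be converted to."
  (if (string-match-p "[。,?!]" str) "english" "chinese"))

(defun my-punct-convert (str &optional direction)
  "Return a copy of STR with punctuation converted.
DIRECTION is \"chinese\", \"english\", or nil to guess from STR."
  (let ((dir (or direction (my-punct-guess-direction str))))
    (dolist (pair my-punct-pairs str)
      (let ((from (if (string= dir "chinese") (car pair) (cdr pair)))
            (to   (if (string= dir "chinese") (cdr pair) (car pair))))
        ;; Both t flags: keep case as-is, treat the replacement literally.
        (setq str (replace-regexp-in-string
                   (regexp-quote from) to str t t))))))

(my-punct-convert "你好, 世界. ")   ; ⇒ "你好,世界。"
(my-punct-convert "你好,世界。")     ; ⇒ "你好, 世界. "
```

Swapping `car` and `cdr` per pair plays the same role as the `mapcar` that reverses `-replacePairs` in the command. The order of the pairs matters only when one pattern is a prefix of another.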

Remove Punctuation Trailing Redundant Spaces

Here's a helpful command to remove redundant spaces after punctuation.

(defun xah-remove-punctuation-trailing-redundant-space (*begin *end)
  "Remove redundant whitespace after punctuation.
Works on current line or text selection.

When called in emacs lisp code, the *begin *end are cursor positions for region.

See also `xah-convert-english-chinese-punctuation'.

URL `http://ergoemacs.org/emacs/elisp_convert_chinese_punctuation.html'
version 2015-08-22"
  (interactive
   (if (use-region-p)
       (list (region-beginning) (region-end))
     (list (line-beginning-position) (line-end-position))))
  (require 'xah-replace-pairs)
  (xah-replace-regexp-pairs-region
   *begin *end
   [
    ;; clean up. Remove extra space.
    [" +," ","]
    [",  +" ", "]
    ["?  +" "? "]
    ["!  +" "! "]
    ["\\.  +" ". "]

    ;; fullwidth punctuations
    [", +" ","]
    ["。 +" "。"]
    [": +" ":"]
    ["? +" "?"]
    ["; +" ";"]
    ["! +" "!"]
    ["、 +" "、"]
    ]
   "FIXEDCASE" "LITERAL"))

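The command above depends on the `xah-replace-pairs` package. As a rough sketch of what the rules do, the following works on a plain string using only the built-in `replace-regexp-in-string`. The function name `my-squeeze-punctuation-space` is hypothetical, and it condenses the per-mark rules into character classes, so treat it as an approximation of the rule table rather than a drop-in replacement.

```elisp
;; Hypothetical helper for illustration; an approximation of the rule table.
(defun my-squeeze-punctuation-space (str)
  "Return STR with redundant spaces around punctuation removed."
  (let ((rules
         '((" +," . ",")                      ; no space before a comma
           ("\\([,.?!]\\)  +" . "\\1 ")       ; ASCII mark: collapse to one space
           ("\\([,。:?;!、]\\) +" . "\\1")))) ; full-width mark: no space
    (dolist (rule rules str)
      ;; FIXEDCASE is t. LITERAL is left nil so \\1 inserts the matched mark.
      (setq str (replace-regexp-in-string (car rule) (cdr rule) str t)))))

(my-squeeze-punctuation-space "one ,  two.   three")  ; ⇒ "one, two. three"
(my-squeeze-punctuation-space "你好,  世界。 ")        ; ⇒ "你好,世界。"
```
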
These commands are useful for Twitter too, for saving a few characters under Twitter's character limit. English punctuation takes 2 characters each (the mark plus the space after it), while the Chinese version needs just one, because the space is built into the full-width punctuation symbol.
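
The saving is easy to check with `length`. The two sample strings below are made up for illustration. Note that `length` counts characters as Emacs sees them. How a particular service weighs full-width characters against its limit is not something this sketch checks.

```elisp
;; Sample strings are illustrative only.
(let ((english "好, 好. ")   ; each mark is followed by a space
      (chinese "好,好。"))   ; full-width marks, no spaces
  (list (length english) (length chinese)))
;; ⇒ (6 4)
```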

Text Transform Topic

  1. Emacs: Toggle Letter Case
  2. Emacs: Change to Title Case
  3. Emacs: Upcase Sentences
  4. Emacs: Cycle Replace Space Hyphen Underscore
  5. Emacs: Remove Accent Marks
  6. Emacs: Escape Quotes Command
  7. Emacs: Quote Lines
  8. Emacs: Change Brackets () {} [] in Region
  9. Emacs: CSS Compressor
  10. Emacs: Replace Greek Letter Names to Unicode
  11. Emacs: Convert Straight/Curly Quotes
  12. Emacs: Convert English/Chinese Punctuations
  13. Emacs: Lines to HTML Table