Elisp: Problems of thing-at-point

By Xah Lee.

This page discusses some problems with the function thing-at-point.

For a tutorial, see: Elisp: Using thing-at-point

Behavior Depends on Syntax Table

When you call (thing-at-point 'word), what string you get exactly depends on the syntax table of the current buffer.

For example, if you always want your “symbol” to mean any alphanumeric plus hyphen -, you can't rely on thing-at-point to give you the right thing, because it may include low line _, or may not include hyphen, or may include apostrophe ', depending on the current major mode's syntax table.

You might think that depending on the syntax table is great, because it provides an abstraction layer that lets each language define its own syntactic units. But in practice, computer languages, and other modes such as dired, irc, etc., do not have concepts that fit neatly into {“symbol”, “word”}.

This problem also applies to the “things” 'sentence, 'paragraph, and all others.

Here's a test.

(defun f ()
  "Print the symbol at point."
  (interactive)
  (message "%s" (thing-at-point 'symbol)))

Eval the code. 〔►see How to Evaluate Emacs Lisp Code〕

Then, put the following in a buffer:

aa_bb-cc

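Put your cursor between the two “b”, then call “f” in buffers with different major modes. Depending on the mode's syntax table, the result may or may not extend across the low line or the hyphen.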

This is so as of emacs 24.4.1.
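The dependence on the syntax table can also be seen without switching major modes. The following is a minimal sketch, not part of the original test: the helper name my-symbol-at and the two syntax tables are made up for illustration. It runs thing-at-point on the same text twice, once with hyphen as punctuation and once with hyphen as a symbol constituent.

```elisp
(require 'thingatpt)

(defun my-symbol-at (text pos table)
  "Return the symbol at position POS of TEXT, using syntax TABLE.
Hypothetical helper, for illustration only."
  (with-temp-buffer
    (set-syntax-table table)
    (insert text)
    (goto-char pos)
    (thing-at-point 'symbol)))

(defvar my-hyphen-punct-table
  (let ((st (make-syntax-table)))
    (modify-syntax-entry ?_ "_" st) ; low line is a symbol constituent
    (modify-syntax-entry ?- "." st) ; hyphen is punctuation
    st)
  "Made-up syntax table in which a hyphen ends a symbol.")

(defvar my-hyphen-symbol-table
  (let ((st (make-syntax-table)))
    (modify-syntax-entry ?_ "_" st) ; low line is a symbol constituent
    (modify-syntax-entry ?- "_" st) ; hyphen is a symbol constituent too
    st)
  "Made-up syntax table in which a hyphen is part of a symbol.")

;; position 5 is between the two “b” in aa_bb-cc
(my-symbol-at "aa_bb-cc" 5 my-hyphen-punct-table)  ; "aa_bb"
(my-symbol-at "aa_bb-cc" 5 my-hyphen-symbol-table) ; "aa_bb-cc"
```

Same text, same cursor position, different result. A major mode changes the answer in the same way, because each mode installs its own syntax table.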

Inconsistent Behavior for 「'line」

When you call (thing-at-point 'line), it returns the line including its newline character. However, if the line is the last line of the buffer and has no newline, then no newline is included.

This means you have to write extra code to check the newline char.

Here's a test.

(defun f ()
  "print current line."
  (interactive)
  (message "[%s]" (thing-at-point 'line)))

Then, put the following in a buffer:

this line
last line

Make sure there's no newline char at the last line.

Then call “f” on each line.

This is so as of emacs 24.4.1.
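The extra check might look like the following. This is only a sketch: my-line-at-point and my-line-demo are hypothetical names, not functions from Emacs or from this page. The helper strips one trailing newline if present, and the demo reproduces the two-line buffer above in a temporary buffer.

```elisp
(require 'thingatpt)

(defun my-line-at-point ()
  "Return the text of the current line without a trailing newline.
Hypothetical helper wrapping (thing-at-point 'line)."
  (let ((line (thing-at-point 'line)))
    (when line
      (substring-no-properties
       ;; remove a single newline at the very end of the string, if any
       (replace-regexp-in-string "\n\\'" "" line)))))

(defun my-line-demo ()
  "Return raw first line, raw last line, and both after cleanup."
  (with-temp-buffer
    (insert "this line\nlast line") ; no newline after the last line
    (let (raw1 raw2 clean1 clean2)
      (goto-char (point-min))
      (search-forward "this")       ; cursor is now inside the first line
      (setq raw1 (thing-at-point 'line)
            clean1 (my-line-at-point))
      (search-forward "last")       ; cursor is now inside the last line
      (setq raw2 (thing-at-point 'line)
            clean2 (my-line-at-point))
      (list raw1 raw2 clean1 clean2))))

(my-line-demo)
;; ("this line\n" "last line" "this line" "last line")
```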

If you want to get the text of the line, it is better to use:

(buffer-substring-no-properties
 (line-beginning-position)
 (line-end-position))

〔►see Elisp: Functions on Line〕

thing-at-point 'url problem (fixed as of emacs 24.4)

What thing-at-point returns is not necessarily the exact text under cursor. When the URL you want to grab does not start with “http”, thing-at-point prepends “http://” to it.

This is FIXED as of emacs 24.4.1.

For example, if the text under cursor is

example_org/emacs/elisp.html

it'll return

http://example_org/emacs/elisp.html

This is very annoying.

Sometimes i just want to grab a sequence of chars that may be a file path or URL, in an HTML file, from text such as href="my_cat.html" or href="http://example/my_cat.html". You do not know which in advance, but after you get the text you can test it by checking for “http” or other things. But if you use thing-at-point with 'filename or 'url, it does things to the string that you didn't expect.
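One way to grab the text exactly as it appears is to skip outward from the cursor until a delimiter is hit. The following is a rough sketch under assumptions: the function names are hypothetical, and the delimiter set (whitespace, quotes, angle brackets) is a guess that suits HTML attributes, not a general rule.

```elisp
(defun my-raw-path-at-point ()
  "Return the run of non-delimiter chars around point, unmodified.
Hypothetical helper. Delimiters are space, tab, newline, quotes, < and >."
  (let ((not-delims "^ \t\n\"'<>") ; leading ^ means: any char NOT in this set
        p1 p2)
    (save-excursion
      (skip-chars-backward not-delims)
      (setq p1 (point))
      (skip-chars-forward not-delims)
      (setq p2 (point)))
    (buffer-substring-no-properties p1 p2)))

(defun my-looks-like-url-p (str)
  "Return t if STR starts with http:// or https://. Hypothetical helper."
  (and (string-match "\\`https?://" str) t))

(defun my-raw-path-demo (html word)
  "Insert HTML in a temp buffer, move past WORD, return the raw path there."
  (with-temp-buffer
    (insert html)
    (goto-char (point-min))
    (search-forward word)
    (my-raw-path-at-point)))

(my-raw-path-demo "<a href=\"my_cat.html\">" "my_")
;; "my_cat.html"
(my-raw-path-demo "<a href=\"http://example/my_cat.html\">" "example")
;; "http://example/my_cat.html"
```

Nothing is added to or removed from the string, so the check with my-looks-like-url-p can be done afterward on the real text.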

(thing-at-point 'url) gets confused if the URL contains parentheses. e.g. http://en.wikipedia.org/wiki/Oz_(programming_language). (this is fixed in emacs 23.2. 〔►see Emacs 23.2 Features (released 2010-05)〕 )

Here's test code.

(defun f ()
  "print `thing-at-point' url"
  (interactive)
  (message "[%s]" (thing-at-point 'url)))

Get Text Selection or Unit at Current Cursor Position

Starting with emacs 23.x, text selection is highlighted by default. (This means transient-mark-mode is on by default. 〔►see Elisp: Region, Active Region〕) This brought a new user interface idiom: when there is a text selection, the command acts on the text selection. Otherwise, the command acts on the current word, line, paragraph, buffer, …, whichever is appropriate for the command. This is great because users don't have to think about whether to call the “-region” version of the command. 〔►see Emacs 23.1 New Features (released 2009-07)〕

When you write a command to do this, the code typically looks like this:

;; get current selection or word
(let (bds p1 p2 inputStr resultStr)

  ;; get boundary: the active region if there is one, else the word at point
  (if (use-region-p)
      (setq bds (cons (region-beginning) (region-end)))
    (setq bds (bounds-of-thing-at-point 'word)))
  (setq p1 (car bds))
  (setq p2 (cdr bds))

  ;; grab the string
  (setq inputStr (buffer-substring-no-properties p1 p2))

  ;; do something with inputStr here.
  ;; placeholder so the template runs: the text is reinserted unchanged
  (setq resultStr inputStr)

  (delete-region p1 p2) ; delete the original text
  (insert resultStr))   ; insert new string

It takes about 6 lines to get the boundary and the string. If you are grabbing a line, then you need a few more lines to check for the newline at the end.
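The boundary part of this pattern can be pulled into a small helper. The sketch below is only an illustration with hypothetical names (my-region-or-thing-bounds, my-upcase-region-or-word). Upcasing is an arbitrary stand-in for “do something with the string”.

```elisp
(require 'thingatpt)

(defun my-region-or-thing-bounds (thing)
  "Return (BEG . END) of the active region, else of THING at point.
Return nil if there is neither. Hypothetical helper."
  (if (use-region-p)
      (cons (region-beginning) (region-end))
    (bounds-of-thing-at-point thing)))

(defun my-upcase-region-or-word ()
  "Upcase the text selection, or the word at point if nothing is selected.
Hypothetical example command."
  (interactive)
  (let ((bds (my-region-or-thing-bounds 'word)))
    (when bds ; do nothing when there is no selection and no word at point
      (let* ((p1 (car bds))
             (p2 (cdr bds))
             (input (buffer-substring-no-properties p1 p2)))
        (delete-region p1 p2)
        (goto-char p1)
        (insert (upcase input))))))

(defun my-upcase-demo ()
  "Run the command with the cursor inside the first word of a temp buffer."
  (with-temp-buffer
    (insert "hello world")
    (goto-char 3)
    (my-upcase-region-or-word)
    (buffer-string)))

(my-upcase-demo)
;; "HELLO world"
```

The nil check matters because bounds-of-thing-at-point returns nil when there is no such thing at the cursor.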

Alternative Solution: “xah-get-thing.el”

Because i need to grab the text so often, i got tired of repeatedly writing these 10 or so lines. I wrote a function that does this. See: Emacs: xah-get-thing.el.
