Elisp: Create Sitemap

By Xah Lee.

This page shows how to use Emacs Lisp to create a sitemap.


Write an elisp script to generate a sitemap. That is: create a file in sitemap format that lists all files in a directory.


A sitemap is an XML file that lists the URLs of all files in a website for web crawlers to crawl. (See: www.sitemaps.org )

A sitemap file looks like this:

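<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2005-01-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>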

  1. The file can have many <url>…</url> items.
  2. Each <url> container represents a file, plus other info about it.
  3. The <loc> is the URL of the file.
  4. The <lastmod>, <changefreq>, <priority> are optional.
  5. A sitemap file can list at most 50k URLs.

The purpose of a sitemap file is to let web crawlers easily find all the files that exist on your site.
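As a rough sketch of the format, a single entry can be built as a string. The function name my-sitemap-url-entry and the example.com URL are made up for illustration. Only <loc> is required, so the lastmod argument is optional here.

```elisp
;; Hypothetical helper, not part of the tutorial code.
;; Note: the URL is inserted as-is. The sitemap protocol expects
;; characters such as & to be entity-escaped.
(defun my-sitemap-url-entry (url &optional lastmod)
  "Return a sitemap <url> element for URL as a string.
LASTMOD, if non-nil, is a date string such as \"2018-09-04\"."
  (concat "<url><loc>" url "</loc>"
          (if lastmod (concat "<lastmod>" lastmod "</lastmod>") "")
          "</url>\n"))

(my-sitemap-url-entry "http://example.com/a.html")
;; → "<url><loc>http://example.com/a.html</loc></url>\n"
```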


The general plan is very simple. Here's one way to do it.

  1. Create a new file, insert XML header tags.
  2. Traverse the web root dir. For each file, determine whether it should be listed in the sitemap.
  3. If so, generate the proper URL tag and insert it into the new file.
  4. When done visiting files, insert the XML footer tags. Save the file.
  5. Optionally, gzip the file. Done.

First, define some parameters for the program.

;; full path to web's doc root. Must end in a slash.
(setq webroot "/Users/xah/web/")

;; file name of sitemap file, relative to webroot, without “.xml” suffix.
(setq sitemapFileName "sitemap")

;; gzip it or not. t for true, nil for false.
(setq gzip-it-p t)

If a sitemap file already exists, you probably want to delete it or back it up before creating a new one, because the sitemap needs to be regenerated regularly as you add new files to the site.

;; rename file to backup~ if already exists
(let (f1 f2)
  (setq f1 (concat webroot sitemapFileName ".xml"))
  (setq f2 (concat f1 ".gz"))
  (when (file-exists-p f1)
    (rename-file f1 (concat f1 "~") "OK-IF-ALREADY-EXISTS"))
  (when (file-exists-p f2)
    (rename-file f2 (concat f2 "~") "OK-IF-ALREADY-EXISTS")))

The next step is to open a buffer sitemapBuf and insert the sitemap header tags. Then, for each file in the web dir, insert its URL into sitemapBuf. Finally, add the ending tag and save. Here's the code:

;; filePath is the full path to the sitemap file
;; sitemapBuf is the buffer of the sitemap file

(let (filePath sitemapBuf)
  (setq filePath (concat webroot sitemapFileName ".xml"))

  ;; open file and save a handle to the buffer
  (setq sitemapBuf (find-file filePath))

  ;; insert header tags
  (insert "<?xml version=\"1.0\" encoding=\"UTF-8\"?>
<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">
")

  ;; for each file in my site, insert its url
  (require 'find-lisp)
  (mapc
   (lambda (x) (my-process-file x sitemapBuf))
   (find-lisp-find-files webroot "\\.html$"))

  ;; insert ending tag
  (insert "</urlset>")

  ;; some post processing to add some optional tags
  (goto-char 1)
  (when (search-forward "http://ergoemacs.org/emacs/blog.html</loc>" nil "noerror")
    (insert "<changefreq>daily</changefreq>"))

  ;; save the file, so that gzip has something on disk to compress
  (save-buffer)

  ;; gzip it
  (when gzip-it-p
    (shell-command (concat "gzip " (shell-quote-argument filePath)))))

In the above, we first build the full path of the sitemap file to be created and save it as a string in “filePath”. Then we open the file, effectively creating a new buffer. The buffer object is saved in the variable sitemapBuf. (Note: “buffer” is an elisp data type, or an instance of that data type. Informally, the word also refers to the buffer's content or the file it visits. See (info "(elisp) Buffers") )

The find-lisp-find-files line returns a list of the full paths of all HTML files under webroot.
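To see what that call returns, here is a small sketch that builds a throwaway directory and searches it. The variable my-demo-html-files and the file names are made up for the demo. Only find-lisp-find-files comes from the tutorial.

```elisp
(require 'find-lisp)

(defvar my-demo-html-files nil
  "Hypothetical variable holding the result of the demo search.")

(let* ((dir (file-name-as-directory (make-temp-file "sitemap-demo" t)))
       (sub (concat dir "emacs/")))
  (make-directory sub)
  ;; create two HTML files and one non-HTML file, all empty
  (write-region "" nil (concat dir "index.html"))
  (write-region "" nil (concat sub "emacs.html"))
  (write-region "" nil (concat dir "notes.txt"))
  ;; search recursively; only names matching the regexp are returned
  (setq my-demo-html-files (find-lisp-find-files dir "\\.html$"))
  ;; clean up the throwaway directory
  (delete-directory dir t))

;; my-demo-html-files now holds 2 full paths, both ending in .html
```

The file in the subdirectory is found too, and notes.txt is left out because it does not match the regexp.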

mapc maps a function to each element of the list. The lambda line is the function that will be applied to each full path.

So, for example, if an element is /Users/xah/web/emacs/emacs.html, then the lambda function will get that as its argument, and execute (my-process-file "/Users/xah/web/emacs/emacs.html" sitemapBuf).
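Here is a minimal sketch of mapc on its own. It is called for side effects, so the lambda here pushes onto a list rather than returning anything useful. The variable my-demo-seen and the two paths are made up for illustration.

```elisp
(defvar my-demo-seen nil
  "Hypothetical list collecting the file names visited by the demo.")

(setq my-demo-seen nil)

;; mapc calls the lambda once per element, in order
(mapc
 (lambda (x) (push (file-name-nondirectory x) my-demo-seen))
 '("/web/a.html" "/web/emacs/b.html"))

;; push builds the list back to front, so reverse it
(setq my-demo-seen (nreverse my-demo-seen))
;; my-demo-seen is now ("a.html" "b.html")
```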

“my-process-file” is a function that takes the full path of a file and a buffer. It opens the file to see whether the file should be added to the sitemap. If so, it adds an entry to the sitemapBuf buffer.

“my-process-file” is defined this way:

;; domainName is assumed to be set along with the other parameters,
;; for example (setq domainName "ergoemacs.org")
(defun my-process-file (fPath destBuff)
  "Process the file at fullpath FPATH.
Write result to buffer DESTBUFF."
  (when (not (string-match "/xx" fPath)) ; dir/file starting with xx are temp files
    (with-temp-buffer
      (insert-file-contents fPath nil nil nil t)
      (goto-char 1)
      (when (not (search-forward "<meta http-equiv=\"refresh\"" nil "noerror"))
        (with-current-buffer destBuff
          (insert "<url><loc>")
          (insert (concat "http://" domainName "/" (substring fPath (length webroot))))
          (insert "</loc></url>\n") )) ) ) )

It takes 2 arguments. fPath is the full path of an HTML file, and destBuff is the buffer holding the sitemap file.

First it checks whether the file path contains “/xx”. On my website, files and dirs whose names start with “xx” are temp files. So, if a file or dir name starts with “xx”, skip it.
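A quick sketch of what that test does and does not match. The predicate name my-temp-path-p and the sample paths are made up. It uses string-match-p, a variant of string-match that leaves the match data alone.

```elisp
(defun my-temp-path-p (path)
  "Return non-nil if PATH has a file or dir name starting with xx."
  (string-match-p "/xx" path))

(my-temp-path-p "/Users/xah/web/emacs/xx_draft.html") ; non-nil, file name starts with xx
(my-temp-path-p "/Users/xah/web/xxtemp/index.html")   ; non-nil, dir name starts with xx
(my-temp-path-p "/Users/xah/web/emacs/boxx.html")     ; nil, xx is not at the start of a name
```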

Otherwise, open the file and check whether it contains an HTML meta redirect tag. Google's webmaster guide says Google doesn't like a sitemap URL that points to a file that redirects with an HTML meta tag. So, if the HTML file is a redirect, don't generate a sitemap URL for it.
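The redirect check can be tried in isolation on a string instead of a file. This is only a sketch. The name my-html-redirect-p and the sample HTML are made up, and the search text is the same one the tutorial code looks for.

```elisp
(defun my-html-redirect-p (html)
  "Return non-nil if the string HTML contains a meta refresh tag."
  (with-temp-buffer
    (insert html)
    (goto-char (point-min))
    ;; search-forward returns nil instead of signaling an error
    ;; when the third argument is non-nil
    (search-forward "<meta http-equiv=\"refresh\"" nil "noerror")))

(my-html-redirect-p
 "<head><meta http-equiv=\"refresh\" content=\"0; url=new.html\"></head>")
;; non-nil, this page is a redirect

(my-html-redirect-p "<head><title>Emacs</title></head>")
;; nil, a normal page
```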

Finally, the code calls (with-current-buffer destBuff …) to insert the proper URL tag into the sitemap buffer.
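The URL itself comes from cutting the web root off the front of the file path. A small sketch of that step, with a made-up function name my-path-to-url and example.com standing in for the real domain:

```elisp
(defun my-path-to-url (path root domain)
  "Turn file PATH under directory ROOT into a URL on DOMAIN.
ROOT must end in a slash, as webroot does in the tutorial."
  ;; substring drops the first (length root) characters of PATH
  (concat "http://" domain "/" (substring path (length root))))

(my-path-to-url "/Users/xah/web/emacs/emacs.html" "/Users/xah/web/" "example.com")
;; → "http://example.com/emacs/emacs.html"
```

This is why webroot must end in a slash. Without it, the leftover path would begin with its own slash and the URL would contain a double slash.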

The function (with-current-buffer ‹a buffer› ‹code›) will temporarily make ‹a buffer› the current buffer and execute ‹code›. When the execution is done, the current buffer returns to whatever it was.
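A minimal sketch of that behavior. The buffer name "my-demo-dest" and the variable my-demo-result are made up. Text inserted inside with-current-buffer lands in the destination buffer, and afterward the code is back in the buffer it started from.

```elisp
(defvar my-demo-result nil
  "Hypothetical variable holding the result of the demo.")

(let ((dest (generate-new-buffer "my-demo-dest")))
  (with-temp-buffer
    (insert "source text")
    ;; switch to dest only for the duration of the body
    (with-current-buffer dest
      (insert "<url>"))
    ;; back in the temp buffer: its content is untouched
    (setq my-demo-result
          (list (buffer-string)
                (with-current-buffer dest (buffer-string)))))
  (kill-buffer dest))

;; my-demo-result is now ("source text" "<url>")
```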

Complete Code

Here's the full code. It is slightly different from the tutorial above. It's optimized, and covers 5k files in 3 seconds.

You can run it either in a buffer with Alt+x eval-buffer, or in a shell with emacs --script sitemap.el.

;; -*- coding: utf-8; lexical-binding: t; -*-
;; 2018-09-04

(require 'seq)

(setq xah-web-root-path "/Users/xah/web/" )

(defvar xahsite-external-docs nil "A vector of dir paths.")
;; Dirs to leave out of the sitemap. Each element is matched as a regexp.
;; The entries are not shown on this page; an empty vector excludes nothing.
(setq  xahsite-external-docs [])

(defun xahsite-generate-sitemap (@domain-name)
  "Generate a sitemap.xml file of xahsite at doc root.
@domain-name must match an existing one.
Version 2018-09-04"
  (interactive
   (list (ido-completing-read "choose:" '( "ergoemacs.org" "wordyenglish.com" "xaharts.org" "xahlee.info" "xahlee.org" "xahmusic.org" "xahsl.org" ))))
  (let (
        ($sitemapFileName "sitemap.xml" )
        ($websiteDocRootPath (concat xah-web-root-path (replace-regexp-in-string "\\." "_" @domain-name "FIXEDCASE" "LITERAL") "/")))
    ;; (print (concat "begin: " (format-time-string "%Y-%m-%dT%T")))
    (let (
          ($filePath (concat $websiteDocRootPath $sitemapFileName ))
          ($sitemapBuffer (generate-new-buffer "sitemapbuff"))
          $pageMoved-p)
      (with-current-buffer $sitemapBuffer
        (set-buffer-file-coding-system 'unix)
        (insert "<?xml version=\"1.0\" encoding=\"UTF-8\"?>
<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">
"))
      (mapc
       (lambda ($f)
         (setq $pageMoved-p nil)
         (when (not (or
                     (string-match "/xx" $f) ; dir/file starting with xx are not public
                     (string-match "403error.html" $f)
                     (string-match "404error.html" $f)))
           (with-temp-buffer
             ;; read only the first 100 bytes to look for the page-moved marker
             (insert-file-contents $f nil 0 100)
             (when (search-forward "page_moved_64598" nil t)
               (setq $pageMoved-p t)))
           (when (not $pageMoved-p)
             (with-current-buffer $sitemapBuffer
               (insert "<url><loc>"
                       "http://" @domain-name "/" (substring $f (length $websiteDocRootPath))
                       "</loc></url>\n")))))
       ;; skip files whose path matches any of the external doc dirs
       (seq-filter
        (lambda (path)
          (not (seq-some
                (lambda (x) (string-match x path))
                xahsite-external-docs)))
        (directory-files-recursively $websiteDocRootPath "\\.html$" )))
      (with-current-buffer $sitemapBuffer
        (insert "</urlset>")
        (write-region (point-min) (point-max) $filePath nil 3)
        (kill-buffer ))
      (find-file $filePath))
    ;; (print (concat "done: " (format-time-string "%Y-%m-%dT%T")))
    ))

(defun xahsite-generate-sitemap-all ()
  "Generate the sitemap for every domain."
  (require 'find-lisp)
  (xahsite-generate-sitemap "ergoemacs.org" )
  (xahsite-generate-sitemap "wordyenglish.com" )
  (xahsite-generate-sitemap "xaharts.org" )
  (xahsite-generate-sitemap "xahlee.info" )
  (xahsite-generate-sitemap "xahlee.org" )
  (xahsite-generate-sitemap "xahmusic.org" )
  (xahsite-generate-sitemap "xahsl.org"  ))