Emacs Lisp Batch Text processing: Grep Find Replace Variations
This page shows Emacs Lisp scripts that do variations of find/replace on strings. For example, i need a script that reports the position of a given string in 5 thousand files. Another example: replace each HTML page's <title> tag text with its <h1> tag text.
Problem: Report String Position
I need to know if a particular string occurs near the beginning of a file or near the end. I need to know this for about 5k files in a directory.
;; -*- coding: utf-8 -*-
;; 2011-03-21
;; report the line number of a occurrences of string, of a given dir

(setq inputDir "~/web/ergoemacs_org/emacs/" )

;; add a ending slash if not there
(when (not (string= "/" (substring inputDir -1)))
  (setq inputDir (concat inputDir "/")))

(defun my-process-file (fPath)
  "process the file at fullpath fPath …"
  (let (myBuffer (ii 0) searchStr)
    (when (not (string-match "/xx" fPath)) ; skip dir starting with xx
      (setq myBuffer (get-buffer-create " myTemp"))
      (set-buffer myBuffer)
      (insert-file-contents fPath nil nil nil t)
      (setq case-fold-search nil) ; NOTE: remember to set case sensitivity here
      (setq searchStr "<style>" )
      (goto-char 1)
      (while (search-forward searchStr nil t) ; NOTE: for regex, use re-search-forward
        (princ (format "this many: %d %s\n" (line-number-at-pos (point)) fPath)))
      (kill-buffer myBuffer))))

(require 'find-lisp)

(let (outputBuffer)
  (setq outputBuffer "*xah occur output*" )
  (with-output-to-temp-buffer outputBuffer
    (mapc 'my-process-file (find-lisp-find-files inputDir "\\.html$"))
    (princ "Done deal!")))
Modify the “inputDir” and “searchStr” above and test it on your own machine.
For explanation of this code, see: How to Write grep in Emacs Lisp.
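The script above reports line numbers. If what you really want to know is whether a match sits near the beginning or near the end of a file, a relative position may be easier to read. Here's a minimal sketch of that idea. The function name “my-string-relative-positions” is made up for illustration and is not part of the original script. It works on a string, so you'd pass it the file's content, and it assumes a non-empty search string.

```emacs-lisp
;; Hypothetical helper, not part of the original script.
(defun my-string-relative-positions (text search-str)
  "Return a list of relative positions of SEARCH-STR in TEXT.
Each position is a float: 0.0 means the match starts at the very
beginning of TEXT, values close to 1.0 mean it starts near the end.
Return nil if there is no match or SEARCH-STR is empty."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((case-fold-search nil)        ; case sensitive, as in the script above
          (size (max 1 (buffer-size)))  ; avoid dividing by zero on empty text
          (result '()))
      (while (and (> (length search-str) 0) ; an empty string would loop forever
                  (search-forward search-str nil t))
        ;; buffer positions start at 1, so subtract 1 to get a 0-based offset
        (push (/ (float (1- (match-beginning 0))) size) result))
      (nreverse result))))
```

For example, (my-string-relative-positions "<style>abc" "<style>") returns (0.0), because the match starts at the very first character.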
Problem 2: Fix HTML “TITLE” and “H1” Tags
Today, while i was working on my website, i noticed some HTML files are missing an “H1” header tag. And in another directory, i wish to replace all “TITLE” tag content with the text from the “H1” tag.
So, i need a script that fixes these tags' text.
Here's a function that gets a file's “title” tag text.
(defun xah-html-get-html-file-title (fname)
  "Return FNAME <title> tag's text.
Assumes that the file contains the string “<title>…</title>”."
  (with-temp-buffer
    (insert-file-contents fname nil nil nil t)
    (goto-char 1)
    (buffer-substring-no-properties
     (search-forward "<title>")
     (- (search-forward "</title>") 8))))
I also need to get the “H1” tag text. So i quickly did some copy-paste coding:
(defun xah-html-get-html-file-h1 (fname)
  "Return fname <h1> tag's text.
Assumes that the file contains the string “<h1>…</h1>”."
  (with-temp-buffer
    (insert-file-contents fname nil nil nil t)
    (goto-char 1)
    (buffer-substring-no-properties
     (search-forward "<h1>")
     (- (search-forward "</h1>") 5))))
It's not efficient to open the file twice to get the “title” and “h1” texts, but that's ok, because the whole script finishes in a few seconds anyway and this is a one-time job.
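If the double read ever matters, here's a rough sketch of how both texts could be fetched with a single read of the file. The names “my-html-tag-text-in-buffer” and “my-html-get-title-and-h1” are hypothetical, not functions from this page. Unlike the two functions above, this version returns nil for a missing tag instead of signaling an error, which is an assumption about what is convenient when some files lack a “H1” tag. It only matches a bare opening tag with no attributes.

```emacs-lisp
;; Hypothetical helpers; the names are illustrative only.
(defun my-html-tag-text-in-buffer (tag)
  "Return the text between <TAG> and </TAG> in the current buffer, or nil.
Searches from the start of the buffer. Only matches a bare opening
tag such as <title>, not one with attributes."
  (save-excursion
    (goto-char (point-min))
    (let ((case-fold-search nil)
          (open-tag (concat "<" tag ">"))
          (close-tag (concat "</" tag ">"))
          start)
      (when (search-forward open-tag nil t)
        (setq start (point))            ; just after the opening tag
        (when (search-forward close-tag nil t)
          ;; (match-beginning 0) is where the closing tag starts
          (buffer-substring-no-properties start (match-beginning 0)))))))

(defun my-html-get-title-and-h1 (fname)
  "Return a cons (TITLE . H1) for file FNAME, reading the file only once.
Either element is nil when the corresponding tag is missing."
  (with-temp-buffer
    (insert-file-contents fname)
    (cons (my-html-tag-text-in-buffer "title")
          (my-html-tag-text-in-buffer "h1"))))
```

The caller can then take the title with “car” and the h1 text with “cdr”, and skip any file where either one is nil.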
Now, here's the code i wrote quickly to fix the tags:
;; -*- coding: utf-8 -*-
;; 2011-03-20
;; change title to h1 tag's text in “Time Machine” pages
;;
;; for each HTML page in 〔~/web/xahlee_org/p/time_machine/〕
;; if the title tag and h1 tag text differ, make the title use h1's text

(setq inputDir "~/web/xahlee_org/p/time_machine/" ) ; dir must end with a slash

(defun my-process-file (fPath)
  "process the file at fullpath fPath …"
  (let ( titleText h1Text p1 p2)
    ;; use the two helper functions defined above
    (setq h1Text (xah-html-get-html-file-h1 fPath))
    (setq titleText (xah-html-get-html-file-title fPath))
    (if (equal h1Text titleText)
        nil
      (progn
        (find-file fPath )
        (goto-char 1)
        (search-forward "<title>" )
        (setq p1 (point) )
        (search-forward "</title>" )
        (backward-char 8)
        (setq p2 (point) )
        (delete-region p1 p2 )
        (insert h1Text)
        (print fPath) )) ))

(require 'find-lisp)

(let (outputBuffer)
  (setq outputBuffer "*process time machine output*" )
  (with-output-to-temp-buffer outputBuffer
    (mapc 'my-process-file (find-lisp-find-files inputDir "\\.html$"))
    (princ "Done deal!") ) )
In this script, i didn't include code to save the changed files. This way, i can do some manual verification after the script has run. When i want them all saved, i just call ibuffer and type 3 keys 【* u S】 to save them all, then 【D y】 to close them all.
[see Emacs: List Buffers]
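If you'd rather do the save step from code once the manual check is done, something like the following sketch might work. It is an alternative to the ibuffer keys, not what the script on this page does. The function name “my-save-modified-buffers-under” is made up, and it assumes that every modified buffer visiting a file under the given directory should be saved.

```emacs-lisp
;; Hypothetical helper, an alternative to saving by hand in ibuffer.
(defun my-save-modified-buffers-under (dir &optional kill)
  "Save every modified buffer visiting a file under DIR.
If KILL is non-nil, also kill each buffer that was saved.
Return the number of buffers saved."
  (let ((root (file-name-as-directory (expand-file-name dir)))
        (count 0))
    (dolist (buf (buffer-list))
      (let ((file (buffer-file-name buf)))
        ;; only touch modified buffers whose file lives under DIR
        (when (and file
                   (string-prefix-p root (expand-file-name file))
                   (buffer-modified-p buf))
          (with-current-buffer buf
            (save-buffer))
          (setq count (1+ count))
          (when kill
            (kill-buffer buf)))))
    count))
```

For example, (my-save-modified-buffers-under inputDir t) would save and close the buffers that the script changed, leaving unmodified buffers and buffers outside that directory alone.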