Hour 8: Searching for Text in Multiple Files: Searching for a Definition in a Set of Files

Sams Teach Yourself Emacs in 24 Hours

Hour 8: Searching for Text in Multiple Files

It is often necessary to search for text in several buffers. Examples include:

Searching for a definition of a function when programming.

Searching for a tag for a given reference when writing LaTeX documents.

Searching for the location where you wrote that Linux was interesting.

Searching can be split into two categories:

In the first category, you have a set of files in which you might search for things several times. This might include the C files of the program you are working on right now, the LaTeX files of your upcoming book, and so on.

In the other category, you have a set of files that you are unlikely to search again for a while, or at least the set will very likely have changed before you search it again. This can include the standard C include files, all the files in your home directory and below, and so on.

If this split seems unclear to you, read on. When you have read the next two sections, you will know exactly which tool to use in a given situation. Later in this hour, you will study two different mechanisms for making bookmarks in files. Bookmarks make it easy to return to specific locations. The difference between the methods is that in the first one, the bookmarks are saved between Emacs sessions, whereas in the other, they are not.

Searching for a Definition in a Set of Files

When searching for text in several files, you have two choices:

You can spend time initially to index the files, which will speed up later searching.

You can skip the initial indexing. Every search will then be slower than if you had indexed the files.

This section describes the method where the files are indexed.

Creating a TAGS File

The indexing method used by Emacs divides the text into two categories: definitions and ordinary text.

Definitions are indexed and written to a file called the TAGS file (its name is TAGS), whereas ordinary text is not indexed. The meaning of definitions depends on the files involved. In programming languages such as C, Java, and Perl, definition means the definition of functions, global variables, and, possibly, #define macros. In LaTeX files, definition means the definition of things such as \chapter, \section, and \label.

Searching for a definition is very fast, but searching for a nondefinition is no faster than an ordinary search without an index, because nondefinitions are not in the index. The TAGS file still has one advantage for such searches: the set of files needs to be determined only once, namely at the time when the index is created.

To create an index, you must use the command etags, which comes with Emacs. If your file has a format and an extension that are known by etags, the following command should work for you:


etags files

Thus to create an index for all the C files in a directory, simply write etags *.c at your command prompt.

The etags program is usually included in the Emacs installation directory under bin, but in the UNIX world system administrators routinely move binaries to /usr/local/bin.

Note - Whenever you change some of the definitions, you should rerun etags if you want subsequent TAGS searches to know about the change.
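When a project spans several directories, the rerun can be wrapped in a small shell function. The following is only a sketch and not part of the original text: the name rebuild_tags is hypothetical, and it assumes that etags is on your PATH and that only .c and .h files should be indexed. It relies on etags reading the file names from standard input when a single dash is given in place of the file names.

```shell
# rebuild_tags is a hypothetical helper: it rewrites DIR/TAGS from every
# .c and .h file found in DIR and its subdirectories.
# Assumption: etags (the one shipped with Emacs) is on PATH.
rebuild_tags() {
    dir=${1:-.}
    (
        # Run from inside DIR so the names stored in TAGS are relative to it.
        cd "$dir" &&
            find . -type f \( -name '*.c' -o -name '*.h' \) -print |
            etags -    # "-" makes etags read the file names from stdin
    )
}

# Example call (run it again whenever definitions change):
#   rebuild_tags .
```

Because the function changes directory inside a subshell, your own working directory is left untouched.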

In Table 8.1, you can see the file formats that etags recognizes. If you have a file in a known format but with the wrong extension, you can use the -l option. For example, if you give your prolog files the extension .pro instead of .prolog, you can index them with the command etags -l prolog *.pro.

Table 8.1 File Formats Known by etags Program

*File Format*	*Extensions*
asm	.a, .asm, .def, .inc, .ins, .s, .sa, .src
C	.c, .h*, .cs, .hs
C++	.C, .H, .c++, .cc, .cpp, .cxx, .h++, .hh, .hpp, .hxx, .M, .pdb
COBOL	.COB, .cob
erlang	.erl, .hrl
FORTRAN	.F, .f, .f90, .for
Java	.java
LISP	.cl, .clisp, .el, .l, .lisp, .lsp, .ml
Pascal	.p, .pas
Perl	.pl, .pm
PostScript	.ps
proc	.pc, .m, .lm
Prolog	.prolog
Python	.py
Scheme	.SCM, .SM, .oak, .sch, .scheme, .scm, .sm, .ss, .t
TeX	.TeX, .bib, .clo, .cls, .ltx, .sty, .tex
yacc	.y, .ym

If, on the other hand, you have files in a format that is unknown to etags, you can still use it, but you have to supply a regular expression that tells it what should be regarded as definitions. If you find yourself in this situation, consult the manual page for etags, or consult the info pages for Emacs and search for TAGS. Two examples are given here.

To index files written in Tcl, use the following command:


etags -l none --regex='/proc[ \t]+\([^ \t]+\)/\1/' files
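The regular expression in the Tcl command matches the word proc, then spaces or tabs, and records the following run of non-blank characters as the tag name. As a sketch of how you might keep the command at hand, it can be wrapped in a shell function. The name index_tcl and the file name in the comment are made up for illustration, and the function assumes the etags that ships with Emacs.

```shell
# index_tcl is a hypothetical wrapper around the Tcl command shown above.
# -l none turns off the built-in language parsers, so only the regular
# expression decides what counts as a definition.
index_tcl() {
    etags -l none --regex='/proc[ \t]+\([^ \t]+\)/\1/' "$@"
}

# Example call; hello.tcl is a made-up file containing, for instance,
#   proc greet {name} { puts "Hello, $name" }
# After indexing it, M-. greet should be able to jump to that line:
#   index_tcl hello.tcl
```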

To index files written for HTML, use the following command:


etags -l none --regex='/.*/\1/' --regex='/.*\([^<]+\)/\1/' files

Searching Using TAGS

As mentioned in the previous section, Emacs has two commands for searching using TAGS files:

One for searching for definitions

One for an ordinary search over several files

Note - To be completely honest, the search is, in fact, a regular expression search. As mentioned before, however, with ordinary search strings (that is, strings without characters such as *, ?, [, ], and \), this fact will not affect you.

Searching for Definitions

When you program, you often need to find a function definition for the function at point. This task teaches you how to do that using the TAGS facilities. Follow these steps:

1. Create a TAGS file for the files of the given program, as described in the previous section. Place point at the function name for which you would like to see the definition and press M-. (find-tag) (see Figure 8.1).

Figure 8.1
Press M-. to search for a definition. Emacs will ask for the name of the definition and suggest the word located at point as the default.

2. Emacs will now suggest the function at which point is located. If this is the function you are searching for, simply press Enter. Otherwise, type the name of the function and press Enter.

Tip - In this buffer you can use M-p and M-n to travel through the list of names you have searched for earlier.

3. If this is the first time in this Emacs session that you use a command related to the TAGS mechanism, Emacs will ask you for a TAGS file, which defaults to a file named TAGS in the same directory from which you asked for a TAGS search (see Figure 8.2).

Figure 8.2
Emacs asks for a file with the TAGS definitions. As the default, it suggests the file named TAGS in the buffer's current working directory.

Caution - If a TAGS file exists in the same directory as your current file, XEmacs does not ask for a file; instead, it uses the current TAGS file. If this isn't the file you intend to use, you can change it using the command visit-tags-table.

Note - Commands that are not bound to keys, such as visit-tags-table in the preceding Caution, can be run by typing M-x followed by the name of the command.

4. When you tell Emacs which TAGS file to use, it searches for the definition, opens the file that contains the definition in your current window, and jumps to the line with the definition (see Figure 8.3).

Figure 8.3
When you tell Emacs which definition to jump to, it replaces your current buffer with this buffer. That is, it loads the new buffer into your current window.

Searching for definitions using TAGS files is useful when you are programming and you want to see either the definition of a subfunction or exactly which arguments the function takes. After you find the location, you might want to return to the location where you came from. For that purpose the bookmarks described in the section, "Managing Bookmarks," are very useful.

Tip - As mentioned previously, M-. (find-tag) replaces the buffer in the current window with the buffer containing the definition. Two other commands exist that open the file with the definition in another window or another frame. These commands are C-x 4 . (find-tag-other-window) and C-x 5 . (find-tag-other-frame). Note the bindings contain three keys: C-x, either 4 or 5, and finally a period (.).

If you press C-u M-. instead of M-., Emacs searches for another definition that matches your previous search. This can be useful when you are searching for a function that can be defined in several classes (such as in C++ or Java).

Searching for Nondefinitions

If you are searching for text that is not a definition (for example, the use of a function), you can use the command tags-search, which is not bound to any keys by default. (That might be a good candidate for your function keys, right?) This starts a search through all the files listed in the TAGS file. The search starts from your current location in the given file. When Emacs has found an occurrence, it lets you do whatever editing you want to do at this point. Later (or possibly at once), you can continue your search by pressing M-, (tags-loop-continue), that is, Meta + comma.
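Because tags-search visits the files recorded in the TAGS file, it can help to check which files those are before you start. The following is a minimal sketch, assuming the usual etags layout in which each file's section begins with a line holding a single form-feed character, followed by a line of the form name,size. The name tags_files is a hypothetical helper, not something that comes with Emacs.

```shell
# tags_files is a hypothetical helper: it prints the file names recorded
# in a TAGS file (default: TAGS in the current directory).
# Assumption: each file section starts with a form-feed line, and the
# next line holds "name,size".
tags_files() {
    awk '
        prev_ff { sub(/,[^,]*$/, ""); print }   # drop the trailing ",size"
        { prev_ff = ($0 == "\f") }              # remember form-feed lines
    ' "${1:-TAGS}"
}

# Example call:
#   tags_files TAGS
```

If a file you expected is missing from the list, rerun etags with that file included before you search.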

Listing and Completing Definitions

If you have the name of a definition on the tip of your tongue but can't find it, the function tags-apropos might help you. It asks for a regular expression and lists all names of definitions that match this regular expression. If, on the other hand, you are perfectly aware of the name but it is long and might even contain some capitalization, you can use the command complete-tag, which by default is not bound to any keys. Some major modes can, however, bind this to M-TAB. This command searches the TAGS table for a definition of the word located before point and searches for a completion of it using the definitions in the TAGS table.

Search-and-Replace Using TAGS

The main thing that makes the TAGS feature worth learning is its capability to do search-and-replace over several files (without having to start it in each file). The command that does this is tags-query-replace, which you run with M-x tags-query-replace. It is not bound to any key in Emacs's standard setup, so it might be a good idea to bind it to a function key. It works much like the ordinary query-replace function described in Hour 7, "Searching for Text in a Buffer." The only difference is that it does search-and-replace in all the files mentioned in the TAGS file. For each match you have the same options as for an ordinary query-replace. For more information, see the section "Options when Replacing" in Hour 7.

You should be aware of three things when searching and replacing using TAGS:

The function of the ^ key (go back to previous replacements) is limited to the current file. That means that if you replace a number of matches in file A, and advance to file B, you cannot go back to matches in file A. Don't panic--this is not a big issue, because you can always change the buffer, and undo with C-_ (that's Control + underscore).

The function of the ! key (replace the rest of the matches) is limited only to the current buffer. That means that pressing ! will not replace unconditionally in the rest of the buffers, but only in the rest of the current buffer.

If you abort a tags-query-replace operation, you can continue it from your current position by pressing M-, (tags-loop-continue). This is very useful if you, for example, have a file that you want to skip during tags-query-replace.

Note - Search-and-replace using TAGS has nothing to do with the definitions in the files, thus you can use it to do search-and-replace in any file that Emacs can read!

Caution - You can't use tags-query-replace if your files are under version control, such as RCS or CVS, because it doesn't know how to check out files before replacing. If you don't know what version control is, or what RCS, CSSC, or CVS are, don't worry about it.