Filter 1
The sample database shown on Page 434 can be found at http://www.mhhe.com/engcs/compsci/das/data.mhtml

grep is a filter used to search a file for a pattern. It scans a file for occurrences of a pattern and, depending on the options used, displays:
Lines containing the selected pattern.
Its syntax treats the first argument as the pattern and the rest as filenames:
    grep options pattern filename(s)
    $ grep sales emp.lst
When you use a multiword string as the pattern, you must quote it so that the shell passes it to grep as a single argument:
    $ grep "neil obryan" emp.lst
grep Options
The -c (count) option counts the lines that contain the pattern:
    $ grep -c directory emp*.lst
The -n (number) option displays the line number of each line containing the pattern:
    $ grep -n marketing emp.lst
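The quoting rule and the -c and -n options can be tried on a small throwaway file. The sketch below uses made-up records in a hypothetical file named sample.lst (it is not the book's emp.lst), and the variable names are for illustration only.

```shell
# Hypothetical sample data; this is NOT the book's emp.lst.
workdir=$(mktemp -d)
cat > "$workdir/sample.lst" <<'EOF'
1001|neil obryan|director|sales
1002|anna lee|manager|marketing
1003|raj kumar|sales executive|sales
EOF

# Quote a multiword pattern so the shell passes it as one argument.
match=$(grep "neil obryan" "$workdir/sample.lst")

# -c counts matching LINES: line 3 contains "sales" twice but counts once.
count=$(grep -c sales "$workdir/sample.lst")

# -n prefixes each matching line with its line number and a colon.
numbered=$(grep -n marketing "$workdir/sample.lst")

echo "$match"
echo "$count"
echo "$numbered"

rm -rf "$workdir"
```

Without the quotes, grep would treat neil as the pattern and obryan as the first filename to search.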
The -l (list) option displays only the names of files in which the pattern has been found:
    $ grep -l manager *.lst
The -i (ignore) option makes the match case-insensitive.
To look for a pattern that begins with a hyphen, use the -e option so that grep does not mistake the pattern for an option:
    $ grep -e -mtime /var/spool/cron/crontabs/*
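As a rough illustration of -l, -i and -e with a hyphenated pattern, the following sketch builds three hypothetical files (a.lst, b.lst and jobs.txt, all invented for this example) in a temporary directory.

```shell
# Hypothetical files created only for this demonstration.
workdir=$(mktemp -d)
printf '%s\n' 'anna lee|Manager' > "$workdir/a.lst"
printf '%s\n' 'raj kumar|clerk' > "$workdir/b.lst"
printf '%s\n' 'find . -mtime +7 -print' > "$workdir/jobs.txt"

# -l prints only the names of the files that contain the pattern.
files=$(cd "$workdir" && grep -l clerk *.lst)

# -i ignores case, so "manager" also matches "Manager".
files_i=$(cd "$workdir" && grep -il manager *.lst)

# -e marks the next argument as the pattern, even though it starts with "-".
hyphen=$(grep -e -mtime "$workdir/jobs.txt")

echo "$files"
echo "$files_i"
echo "$hyphen"

rm -rf "$workdir"
```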
In Linux, you can repeat the -e option to match multiple patterns:
    $ grep -e woodhouse -e wood emp.lst
Regular Expressions
A regular expression is an ambiguous expression formed with some special and ordinary characters, which is expanded by a command to match more than one string. grep uses a regular expression to match a group of similar patterns.
A regular expression uses a character class, which encloses a group of characters within a pair of square brackets [] and matches any one of them:
    $ grep "wo[od][de]house" emp.lst
Regular expressions use the ^ (caret) to negate the character class, while the shell uses ! (bang). A single nonalphabetic character is represented by [^a-zA-Z].
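A short sketch of multiple -e patterns, a character class and a negated class follows. The file names.lst and its contents are invented for the example.

```shell
# Hypothetical list of names, one per line.
workdir=$(mktemp -d)
cat > "$workdir/names.lst" <<'EOF'
woodhouse
wodehouse
woodcock
wood7
EOF

# Repeating -e matches a line if ANY of the patterns matches.
multi=$(grep -c -e woodhouse -e woodcock "$workdir/names.lst")

# [od] and [de] each match one character from the set:
# this matches both "woodhouse" and "wodehouse".
class=$(grep -c "wo[od][de]house" "$workdir/names.lst")

# [^a-zA-Z] matches one character that is NOT a letter.
nonalpha=$(grep "wood[^a-zA-Z]" "$workdir/names.lst")

echo "$multi"
echo "$class"
echo "$nonalpha"

rm -rf "$workdir"
```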
The * (asterisk) matches zero or more occurrences of the preceding character:
    $ grep "wilco[cx]k*s*" emp.lst
A . (dot) matches any single character; the shell uses the ? character for the same purpose. The dot combined with the * (.*) signifies any number of characters, or none:
    $ grep "p.*woodhouse" emp.lst
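The behaviour of *, . and .* can be checked with a few invented names. The sketch below assumes a hypothetical file names.lst created on the spot.

```shell
# Hypothetical data for trying out *, . and .*
workdir=$(mktemp -d)
cat > "$workdir/names.lst" <<'EOF'
wilcox
wilcocks
wilco
paul woodhouse
EOF

# k* and s* may each match nothing, so "wilcox" and "wilcocks" both match;
# "wilco" does not, because [cx] must match exactly one character.
star=$(grep -c "wilco[cx]k*s*" "$workdir/names.lst")

# The dot stands for exactly one character (here the "o" in "wilcox").
dot=$(grep -c "wilc.x" "$workdir/names.lst")

# .* spans any run of characters between "p" and "woodhouse".
dotstar=$(grep "p.*woodhouse" "$workdir/names.lst")

echo "$star"
echo "$dot"
echo "$dotstar"

rm -rf "$workdir"
```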
A pattern can be matched at the beginning of a line with a ^, and at the end with a $. Because it is the command, not the shell, that interprets these characters, a regular expression should be quoted to prevent the shell from interfering:
    $ grep "^2" emp.lst
The . and * lose their special meanings when placed inside a character class. Outside a class, you need to escape these characters with a backslash to match them literally.
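The anchors and the literal treatment of . and * can be sketched as follows. The file codes.lst and its three records are hypothetical.

```shell
# Hypothetical records: an id, a bar, then a three-character code.
workdir=$(mktemp -d)
cat > "$workdir/codes.lst" <<'EOF'
2345|a.b
1234|axb
3452|a*b
EOF

# ^ anchors the match to the start of the line, $ to the end.
begin=$(grep '^2' "$workdir/codes.lst")
end=$(grep 'xb$' "$workdir/codes.lst")

# An unescaped dot matches any character: all three codes match.
any_char=$(grep -c 'a.b' "$workdir/codes.lst")

# Escaped, the dot matches only a literal ".".
literal_dot=$(grep -c 'a\.b' "$workdir/codes.lst")

# Inside a class, . and * are ordinary characters and need no escaping.
in_class=$(grep -c 'a[.*]b' "$workdir/codes.lst")

echo "$begin"
echo "$end"
echo "$any_char $literal_dot $in_class"

rm -rf "$workdir"
```

Single quotes are used here so that the shell leaves $, * and the backslash untouched.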
egrep extends grep's capabilities. It uses | to delimit multiple patterns:
    $ egrep 'woodhouse|woodcock' emp.lst
fgrep accepts only fixed strings and is faster than the other two. Both commands support the -f (file) option to take such patterns from a file:
    $ fgrep -f pat.lst emp.lst
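The sketch below tries alternation, fixed-string matching and the -f option on invented data. It uses grep -E and grep -F, the POSIX equivalents of egrep and fgrep, because some recent grep releases print an obsolescence warning for the older names. The files names.lst and pat.lst are hypothetical.

```shell
# Hypothetical data; names.lst and pat.lst exist only for this demo.
workdir=$(mktemp -d)
cat > "$workdir/names.lst" <<'EOF'
woodhouse
woodcock
wilcox
w.lcox
EOF
printf '%s\n' wilcox woodcock > "$workdir/pat.lst"

# grep -E (egrep): | separates alternative patterns; quote it from the shell.
alt=$(grep -E -c 'woodhouse|woodcock' "$workdir/names.lst")

# As a regular expression, the dot matches any character: two lines match.
regex=$(grep -c 'w.lcox' "$workdir/names.lst")

# grep -F (fgrep): the pattern is a fixed string, so the dot is literal.
fixed=$(grep -F -c 'w.lcox' "$workdir/names.lst")

# -f reads the patterns, one per line, from a file.
from_file=$(grep -F -c -f "$workdir/pat.lst" "$workdir/names.lst")

echo "$alt $regex $fixed $from_file"

rm -rf "$workdir"
```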