GREP(1V)  —  USER COMMANDS

NAME

grep, egrep, fgrep − search a file for a pattern

SYNOPSIS

grep [ −v ] [ −c ] [ −l ] [ −n ] [ −b ] [ −i ] [ −s ] [ −h ] [ −w ] [ −e ] expression [ file ... ]

egrep [ −v ] [ −c ] [ −l ] [ −n ] [ −b ] [ −i ] [ −s ] [ −h ] [ −e expression ] [ −f file ] [ expression ] [ file ... ]

fgrep [ −v ] [ −x ] [ −c ] [ −l ] [ −n ] [ −b ] [ −i ] [ −s ] [ −h ] [ −e string ] [ −f file ] [ string ] [ file ... ]

SYSTEM V SYNOPSIS

grep [ −v ] [ −c ] [ −l ] [ −n ] [ −b ] [ −i ] [ −s ] expression [ file ... ]

DESCRIPTION

Commands of the grep family search the input files (standard input default) for lines matching a pattern. Normally, each line found is copied to the standard output. Grep patterns are limited regular expressions in the style of ed(1). Egrep patterns are full regular expressions including alternation.  Fgrep patterns are fixed strings — no regular expression metacharacters are supported. 

In general, egrep is the fastest of these programs. 

Take care when using the characters $, ∗, [, ^, |, (, ), and \ in the expression, as these characters are also meaningful to the Shell.  It is safest to enclose the entire expression argument in single quotes '...'. 

When any of the grep utilities is applied to more than one input file, the name of the file is displayed preceding each line which matches the pattern.  The filename is not displayed when processing a single file, so if you actually want the filename to appear, use /dev/null as a second file in the list. 

OPTIONS

−v Invert the search to only display lines that do not match. 

−x Display only those lines which match exactly — that is, only lines which match in their entirety (fgrep only). 

−c Display a count of matching lines rather than displaying the lines which match. 

−l List only the names of files with matching lines (once) separated by newlines. 

−n Precede each line by its relative line number in the file. 

−b Precede each line by the block number on which it was found.  This is sometimes useful in locating disk block numbers by context. 

−i Ignore the case of letters in making comparisons — that is, upper and lower case are considered identical. 

−s Work silently, that is, display nothing except error messages.  This is useful for checking the error status. 

−h Do not display filenames. 

−w Search for the expression as a word, as if it were surrounded by \< and \> (grep only). 

−e expression
Same as a simple expression argument, but useful when the expression begins with a −. 

−f file Take the regular expression (egrep) or a list of strings separated by newlines (fgrep) from file.
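
As a rough illustration of the options above, the following sketch runs a few of them against a small scratch file. The file name and its contents are made up for the example, and the commands assume a grep that behaves as described on this page; another implementation may format its output differently.

```shell
# Hypothetical sample data in a scratch directory.
dir=$(mktemp -d)
printf 'apple\nBanana\ncherry\napple pie\n' > "$dir/fruit.txt"

# -c: display a count of matching lines instead of the lines themselves.
count=$(grep -c 'apple' "$dir/fruit.txt")

# -n: precede each matching line with its line number in the file.
numbered=$(grep -n 'apple' "$dir/fruit.txt")

# -v: invert the search and display only the lines that do not match.
inverted=$(grep -v 'apple' "$dir/fruit.txt")

# -i: treat upper and lower case letters as identical.
nocase=$(grep -i 'banana' "$dir/fruit.txt")

echo "$count"
echo "$numbered"
echo "$inverted"
echo "$nocase"

rm -rf "$dir"
```

Each result is captured in a variable only so that it can be inspected afterwards; at a prompt the commands would normally be run directly.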

SYSTEM V OPTIONS

The System V version of grep does not recognize the −h, −w, or −e options.  The −s option indicates that error messages for nonexistent or unreadable files should be suppressed, not that all messages should be suppressed. 

REGULAR EXPRESSIONS

The following one-character regular expressions match a single character:

c
An ordinary character (not one of the special characters discussed below) is a one-character regular expression that matches that character. 

\c
A backslash (\) followed by any special character is a one-character regular expression that matches the special character itself.  The special characters are:

a.  ., ∗, [, and \ (period, asterisk, left square bracket, and backslash, respectively), which are always special, except when they appear within square brackets ([]).

b.  ^ (caret or circumflex), which is special at the beginning of an entire regular expression, or when it immediately follows the left of a pair of square brackets ([]). 

c.  $ (currency symbol), which is special at the end of an entire regular expression. 

.
A period (.) is a one-character regular expression that matches any character except newline. 

[string]
A non-empty string of characters enclosed in square brackets is a one-character regular expression that matches any one character in that string.  If, however, the first character of the string is a circumflex (^), the one-character regular expression matches any character except newline and the remaining characters in the string.  The ^ has this special meaning only if it occurs first in the string.  The minus (−) may be used to indicate a range of consecutive ASCII characters; for example, [0−9] is equivalent to [0123456789].  The − loses this special meaning if it occurs first (after an initial ^, if any) or last in the string.  The right square bracket (]) does not terminate such a string when it is the first character within it (after an initial ^, if any); e.g., []a−f] matches either a right square bracket (]) or one of the letters a through f inclusive.  The four characters ., ∗, [, and \ stand for themselves within such a string of characters. 
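
The bracket rules above can be tried out with a sketch like the following. The scratch file and its five two-character lines are invented for the example, and the behavior shown assumes a grep that follows the rules as stated here.

```shell
# Hypothetical sample data: one two-character string per line.
dir=$(mktemp -d)
printf 'a1\nb]\nc-\nd.\ne9\n' > "$dir/chars.txt"

# [0-9] is a range: it matches any one digit.
digits=$(grep '[0-9]' "$dir/chars.txt")

# A ] placed first does not close the set, and . stands for itself inside it.
literal=$(grep '[].]' "$dir/chars.txt")

# A - placed last loses its range meaning: this matches a digit or a minus.
dash=$(grep '[0-9-]' "$dir/chars.txt")

# A leading ^ complements the set: the second character is not a digit.
nondigit=$(grep '^.[^0-9]$' "$dir/chars.txt")

echo "$digits"
echo "$literal"
echo "$dash"
echo "$nondigit"

rm -rf "$dir"
```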

The following rules may be used to construct regular expressions:

∗
A one-character regular expression followed by an asterisk (∗) is a regular expression that matches zero or more occurrences of the one-character regular expression.  If there is any choice, the longest leftmost string that permits a match is chosen. 

\(
A regular expression enclosed between the character sequences \( and \) matches whatever the unadorned regular expression matches.  (grep only). 

\n
The expression \n matches the same string of characters as was matched by an expression enclosed between \( and \) earlier in the same regular expression.  Here n is a digit; the sub-expression specified is that beginning with the n-th occurrence of \( counting from the left.  For example, the expression ^\(.∗\)\1$ matches a line consisting of two repeated appearances of the same string. 

concatenation
The concatenation of regular expressions is a regular expression that matches the concatenation of the strings matched by each component of the regular expression.

\<
The sequence \< in a regular expression constrains the one-character regular expression immediately following it only to match something at the beginning of a “word”; that is, either at the beginning of a line, or just before a letter, digit, or underline and after a character not one of these. 

\>
The sequence \> in a regular expression constrains the one-character regular expression immediately following it only to match something at the end of a “word”; that is, either at the end of a line, or just before a character which is neither a letter, digit, nor underline. 

^
A circumflex (^) at the beginning of an entire regular expression constrains that regular expression to match an initial segment of a line. 

$
A currency symbol ($) at the end of an entire regular expression constrains that regular expression to match a final segment of a line. 

The construction ^entire regular expression$ constrains the entire regular expression to match the entire line. 
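
A few of the construction rules above, sketched against an invented scratch file. The sample lines are hypothetical, and the \< and \> sequences in particular are assumed to be supported by the grep in use (the System V version described later does not accept them).

```shell
# Hypothetical sample data.
dir=$(mktemp -d)
printf 'abcabc\nabcabd\nthe cat sat\nconcatenate\ncat\n' > "$dir/lines.txt"

# \(...\) saves a sub-expression and \1 must match the same string again,
# so this selects lines made of one string repeated twice.
doubled=$(grep '^\(.*\)\1$' "$dir/lines.txt")

# \< and \> anchor the match to the start and end of a word,
# so "cat" inside "concatenate" is not selected.
word=$(grep '\<cat\>' "$dir/lines.txt")

# ^ and $ together force the expression to match the entire line.
whole=$(grep '^cat$' "$dir/lines.txt")

echo "$doubled"
echo "$word"
echo "$whole"

rm -rf "$dir"
```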

egrep accepts regular expressions of the same sort grep does, except for \(, \), \n, \<, and \>, with the addition of:

∗
A regular expression (not just a one-character regular expression) followed by an asterisk (∗) is a regular expression that matches zero or more occurrences of that regular expression.  If there is any choice, the longest leftmost string that permits a match is chosen. 

+
A regular expression followed by a plus sign (+) is a regular expression that matches one or more occurrences of that regular expression.  If there is any choice, the longest leftmost string that permits a match is chosen. 

?
A regular expression followed by a question mark (?) is a regular expression that matches zero or one occurrence of that regular expression.  If there is any choice, the longest leftmost string that permits a match is chosen. 

|
Alternation: two regular expressions separated by | or newline match anything that either the first or the second matches. 

()
A regular expression enclosed in parentheses matches whatever the enclosed regular expression matches. 

The order of precedence of operators at the same parenthesis level is [ ] (character classes), then ∗ + ? (closures), then concatenation, then | (alternation) and newline. 
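
The egrep additions can be sketched as follows. This example assumes a current system on which `grep -E` accepts the egrep syntax; that spelling is an assumption that goes beyond this page, which describes egrep as a separate program. The scratch file and its contents are invented.

```shell
# Hypothetical sample data.
dir=$(mktemp -d)
printf 'Sally Smith\nFred Jones\nSally Brown\ncolor\ncolour\nabab\naba\n' \
    > "$dir/sample.txt"

# | is alternation and () groups, so each half picks one of two names.
names=$(grep -E '(Sally|Fred) (Smith|Jones)' "$dir/sample.txt")

# ? makes the preceding expression optional (zero or one occurrence).
spelling=$(grep -E '^colou?r$' "$dir/sample.txt")

# + applies to the whole parenthesized expression: one or more "ab".
repeated=$(grep -E '^(ab)+$' "$dir/sample.txt")

echo "$names"
echo "$spelling"
echo "$repeated"

rm -rf "$dir"
```

Because closures bind more tightly than concatenation, the parentheses in the last pattern are what make + apply to "ab" as a unit rather than to "b" alone.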

SYSTEM V REGULAR EXPRESSIONS

The System V version of grep does not accept \< or \> in a regular expression, and accepts the following additional item in a regular expression:

\{m\}

\{m,\}

\{m,n\}
A regular expression followed by \{m\}, \{m,\}, or \{m,n\} matches a range of occurrences of the regular expression.  The values of m and n must be non-negative integers less than 256; \{m\} matches exactly m occurrences; \{m,\} matches at least m occurrences; \{m,n\} matches any number of occurrences between m and n inclusive.  Whenever a choice exists, the regular expression matches as many occurrences as possible. 
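
The three interval forms might be exercised like this. The sketch assumes a grep that accepts the System V \{m,n\} syntax, and the scratch file of repeated letters is invented for the example.

```shell
# Hypothetical sample data: one to four repetitions of the letter a.
dir=$(mktemp -d)
printf 'a\naa\naaa\naaaa\n' > "$dir/runs.txt"

# \{2\}: exactly two occurrences.
exact=$(grep '^a\{2\}$' "$dir/runs.txt")

# \{3,\}: at least three occurrences.
atleast=$(grep '^a\{3,\}$' "$dir/runs.txt")

# \{2,3\}: between two and three occurrences, inclusive.
between=$(grep '^a\{2,3\}$' "$dir/runs.txt")

echo "$exact"
echo "$atleast"
echo "$between"

rm -rf "$dir"
```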

EXAMPLES

Search a file for a fixed string using fgrep:

tutorial% fgrep  intro  /usr/man/man3/*.3*

Look for character classes using grep:

tutorial% grep  '[1-8]([CJMSNX])'  /usr/man/man1/*.1

Look for alternative patterns using egrep:

tutorial% egrep  '(Sally|Fred) (Smith|Jones|Parker)'  telephone.list

To get the filename displayed when only processing a single file, use /dev/null as the second file in the list:

tutorial% grep  'Sally Parker'  telephone.list  /dev/null

SEE ALSO

vi(1)    visual display-oriented editor based on ex(1)
ex(1)    line-oriented text editor based on ed(1)
ed(1)    primitive line-oriented text editor
sed(1V)  stream editor
awk(1)   pattern scanning and text processing language
sh(1)    Bourne Shell

DIAGNOSTICS

Exit status is 0 if any matches are found, 1 if none, 2 for syntax errors or inaccessible files. 
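
A script can branch on these exit codes. The sketch below records the status of three searches; the file names are hypothetical, and output is discarded by redirection rather than with −s, since the meaning of that option differs between the two versions described on this page.

```shell
# Hypothetical sample data.
dir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$dir/words.txt"

# Each status starts at 0 and is overwritten only if grep reports non-zero.
found=0
grep 'alpha' "$dir/words.txt" > /dev/null 2>&1 || found=$?

missing=0
grep 'gamma' "$dir/words.txt" > /dev/null 2>&1 || missing=$?

failed=0
grep 'alpha' "$dir/no-such-file" > /dev/null 2>&1 || failed=$?

echo "match: $found, no match: $missing, inaccessible file: $failed"

rm -rf "$dir"
```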

BUGS

Lines are limited to 1024 characters by grep; longer lines are truncated.

If there is a line with embedded nulls, grep will only match up to the first null; if it matches, it will print the entire line. 

The combination of −l and −v options does not produce a list of files in which a regular expression is not found.  To get such a list, use the C-Shell construct:

foreach file (*)
    if (`grep "re" $file | wc -l` == 0) echo $file
end
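
For readers who use sh(1) rather than the C-Shell, the same idea might be sketched as follows. This assumes a POSIX-style shell (the `$(...)` and `${file##*/}` forms are not part of the original Bourne Shell), and the directory, file names, and pattern are invented for the example.

```shell
# Hypothetical sample data: one file that matches and one that does not.
dir=$(mktemp -d)
printf 'needle\n' > "$dir/has.txt"
printf 'hay\n' > "$dir/lacks.txt"

nomatch=""
for file in "$dir"/*
do
    # grep -c displays the number of matching lines; 0 means no match.
    if [ "$(grep -c 'needle' "$file")" -eq 0 ]
    then
        # Record the file name without its directory prefix.
        nomatch="$nomatch${file##*/} "
    fi
done

echo $nomatch

rm -rf "$dir"
```

Counting with −c plays the role of the `wc -l` pipeline in the C-Shell construct.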

Ideally there should be only one grep.

Sun Release 3.2  —  Last change: 8 July 1986
