   nawk(1)          (Directory and File Management Utilities)          nawk(1)


   NAME
         nawk - pattern scanning and processing language

   SYNOPSIS
         nawk [-F re] [-v var=value] ['prog'] [file...]
         nawk [-F re] [-v var=value] [-f progfile] [file...]

   DESCRIPTION
         nawk scans each input file for lines that match any of a set of
         patterns specified in prog.  The prog string must be enclosed in
         single quotes (') to protect it from the shell.  For each pattern in
         prog there may be an associated action performed when a line of a
         file matches the pattern.  The set of pattern-action statements may
         appear literally as prog or in a file specified with the -f progfile
         option.  Input files are read in order; if there are no files, the
         standard input is read.  The file name - means the standard input.

         Each input line is matched against the pattern portion of every
         pattern-action statement; the associated action is performed for each
         matched pattern.  Any file of the form var=value is treated as an
         assignment, not a filename, and is executed at the time it would have
         been opened if it were a filename.  The option -v followed by
         var=value is an assignment to be done before prog is executed; any
         number of -v options may be present.
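
         The difference in timing between the two kinds of assignment can be
         seen in a small sketch.  It assumes that a POSIX-style awk is
         installed under the name awk (substitute nawk where that is the
         installed name).  The helper name show_assignments and the variable
         names pre and post are illustrative only.

```shell
# Assumption: "awk" is a POSIX-style awk standing in for nawk.
# show_assignments FILE is a hypothetical helper, not part of nawk.
show_assignments() {
    awk -v pre=1 '
        BEGIN { printf "BEGIN pre=%s post=%s\n", pre, post }  # post is not yet assigned
              { printf "%s pre=%s post=%s\n", $0, pre, post }
    ' post=2 "$1"    # post=2 is assigned when it is reached in the operand list
}

tmp=$(mktemp)
printf 'x\n' > "$tmp"
show_assignments "$tmp"
rm -f "$tmp"
```

         In the BEGIN action only pre has a value; post is assigned just
         before the input file that follows it is opened.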

         An input line is normally made up of fields separated by white space.
         (This default can be changed by using the FS built-in variable or the
         -F re option.)  The fields are denoted $1, $2, ...; $0 refers to the
         entire line.

         A pattern-action statement has the form:

               pattern { action }

         Either pattern or action may be omitted.  If there is no action with
         a pattern, the matching line is printed.  If there is no pattern with
         an action, the action is performed on every input line.  Pattern-
         action statements are separated by newlines or semicolons.

         Patterns are arbitrary Boolean combinations ( !, ||, &&, and
         parentheses) of relational expressions and regular expressions.  A
         relational expression is one of the following:

               expression relop expression
               expression matchop regular_expression
               expression in array-name
               (expression,expression, ...  ) in array-name





   7/91                                                                 Page 1

         where a relop is any of the six relational operators in C, and a
         matchop is either ~ (contains) or !~ (does not contain).  An
         expression is an arithmetic expression, a relational expression, the
         special expression

               var in array

         or a Boolean combination of these.

         Regular expressions are as in egrep(1).  In patterns they must be
         surrounded by slashes.  Isolated regular expressions in a pattern
         apply to the entire line.  Regular expressions may also occur in
         relational expressions.  A pattern may consist of two patterns
         separated by a comma; in this case, the action is performed for all
         lines between an occurrence of the first pattern and the next
         occurrence of the second pattern.

         The special patterns BEGIN and END may be used to capture control
         before the first input line has been read and after the last input
         line has been read respectively.  These keywords do not combine with
         any other patterns.

         A regular expression may be used to separate fields by using the -F
         re option or by assigning the expression to the built-in variable FS.
         The default is to ignore leading blanks and to separate fields by
         blanks and/or tab characters.  However, if FS is assigned a value,
         leading blanks are no longer ignored.
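
         The effect of assigning a field separator on leading blanks can be
         sketched as follows.  The example assumes a POSIX-style awk installed
         as awk; the helper names are hypothetical.  The regular expression
         [ ] is used so that each single blank acts as a separator.

```shell
# Assumption: "awk" is a POSIX-style awk standing in for nawk.
count_fields_default()  { awk '{ print NF }'; }           # default: runs of blanks and tabs
count_fields_space_re() { awk -F '[ ]' '{ print NF }'; }  # every single blank separates

printf '  a b\n' | count_fields_default     # leading blanks ignored: 2 fields
printf '  a b\n' | count_fields_space_re    # leading blanks yield empty fields: 4 fields
```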

         Other built-in variables include:

               ARGC          command line argument count

               ARGV          command line argument array

               ENVIRON       array of environment variables; subscripts are
                             names

               FILENAME      name of the current input file

               FNR           ordinal number of the current record in the
                             current file

               FS            input field separator regular expression (default
                             blank and tab)

               NF            number of fields in the current record

               NR            ordinal number of the current record





               OFMT          output format for numbers (default %.6g)

               OFS           output field separator (default blank)

               ORS           output record separator (default new-line)

               RS            input record separator (default new-line)

               SUBSEP        separates multiple subscripts (default is octal
                             034)
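
         As an illustration of a few of these variables, the following sketch
         prints the record number, the field count, and the first field of
         each record, joined by an output field separator set in a BEGIN
         action.  It assumes a POSIX-style awk installed as awk; the helper
         name number_records is hypothetical.

```shell
# Assumption: "awk" is a POSIX-style awk standing in for nawk.
number_records() {
    # OFS is placed between the comma-separated arguments of print
    awk 'BEGIN { OFS = "-" } { print NR, NF, $1 }'
}

printf 'a b\nc d e\n' | number_records
# prints:
# 1-2-a
# 2-3-c
```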

         An action is a sequence of statements.  A statement may be one of the
         following:

               if ( expression ) statement [ else statement ]
               while ( expression ) statement
               do statement while ( expression )
               for ( expression ; expression ; expression ) statement
               for ( var in array ) statement
               delete array[subscript] #delete an array element
               break
               continue
               { [ statement ] ... }
               expression  # commonly variable = expression
               print [ expression-list ] [ >expression ]
               printf format [ , expression-list ] [ >expression ]
               next        # skip remaining patterns on this input line
               exit [expr] # skip the rest of the input; exit status is expr
               return [expr]

         Statements are terminated by semicolons, new-lines, or right braces.
         An empty expression-list stands for the whole input line.
         Expressions take on string or numeric values as appropriate, and are
         built using the operators +, -, *, /, %, ^ and concatenation
         (indicated by a blank).  The operators ++ -- += -= *= /= %= ^= > >= <
         <= == != ?:  are also available in expressions.  Variables may be
         scalars, array elements (denoted x[i]), or fields.  Variables are
         initialized to the null string or zero.  Array subscripts may be any
         string, not necessarily numeric; this allows for a form of
         associative memory.  Multiple subscripts such as [i,j,k] are
         permitted; the constituents are concatenated, separated by the value
         of SUBSEP.  String constants are quoted (""), with the usual C
         escapes recognized within.
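
         String subscripts and multiple subscripts might be used as in the
         sketch below.  It assumes a POSIX-style awk installed as awk; the
         helper names count_words and split_subscript, and the array names,
         are hypothetical.  Because the order in which for (var in array)
         visits elements is unspecified, the first helper sorts its output.

```shell
# Assumption: "awk" is a POSIX-style awk standing in for nawk.
count_words() {
    awk '
        { for (i = 1; i <= NF; i++) count[$i]++ }     # subscripts are arbitrary strings
        END { for (w in count) print w, count[w] }    # traversal order is unspecified
    ' | LC_ALL=C sort
}

split_subscript() {
    awk 'BEGIN {
        cell[1, 2] = "x"                 # stored under the single key 1 SUBSEP 2
        for (k in cell) {
            split(k, part, SUBSEP)       # recover the constituents of the key
            print part[1], part[2], cell[k]
        }
    }'
}

printf 'a b a\n' | count_words    # a 2, then b 1
split_subscript                   # 1 2 x
```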

         The print statement prints its arguments on the standard output, or
         on a file if >expression is present, or on a pipe if | cmd is
         present.  The arguments are separated by the current output field
         separator and terminated by the output record separator.  The printf
         statement formats its expression list according to the format [see
         printf(3S) in the Programmer's Reference Manual].  The built-in
         function close(expr) closes the file or pipe expr.
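
         Output to a pipe and the use of close can be sketched as follows,
         assuming a POSIX-style awk installed as awk and a sort(1) command on
         the search path.  The helper name sort_first_field is hypothetical.

```shell
# Assumption: "awk" is a POSIX-style awk standing in for nawk.
sort_first_field() {
    awk '
        { print $1 | "sort" }     # every record is written to the same sort process
        END { close("sort") }     # closing the pipe lets sort finish and emit its output
    '
}

printf 'b 2\na 1\n' | sort_first_field
# prints:
# a
# b
```

         The string given to close must be the same string that was used to
         open the pipe.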



         The mathematical functions atan2, cos, exp, log, sin, and sqrt are
         built in.

         Other built-in functions include:

         gsub(for, repl, in)
                   behaves like sub (see below), except that it replaces
                   successive occurrences of the regular expression (like the
                   ed global substitute command).

         index(s, t)
                   returns the position in string s where string t first
                   occurs, or 0 if it does not occur at all.

         int       truncates to an integer value.

         length(s) returns the length of its argument taken as a string, or of
                   the whole line if there is no argument.

         match(s, re)
                   returns the position in string s where the regular
                   expression re occurs, or 0 if it does not occur at all.
                   RSTART is set to the starting position (which is the same
                   as the returned value), and RLENGTH is set to the length of
                   the matched string.

         rand      random number on (0, 1).

         split(s, a, fs)
                   splits the string s into array elements a[1], a[2], ...,
                   a[n], and returns n.  The separation is done with the
                   regular expression fs or with the field separator FS if fs
                   is not given.

         srand     sets the seed for rand

         sprintf(fmt, expr, expr,...)
                   formats the expressions according to the printf(3S) format
                   given by fmt and returns the resulting string.

         sub(for, repl, in)
                   substitutes the string repl in place of the first instance
                   of the regular expression for in string in and returns the
                   number of substitutions.  If in is omitted, nawk
                   substitutes in the current record ($0).

         substr(s, m, n)
                   returns the n-character substring of s that begins at
                   position m.
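
         A brief sketch exercising several of the string functions described
         above follows.  It assumes a POSIX-style awk installed as awk; the
         helper name string_function_demo and all variable names are
         hypothetical.

```shell
# Assumption: "awk" is a POSIX-style awk standing in for nawk.
string_function_demo() {
    awk 'BEGIN {
        s = "hello world"
        n = gsub(/o/, "0", s)                 # replace every match; n is the count
        print n, s                            # 2 hell0 w0rld
        t = "hello"
        n = sub(/l+/, "L", t)                 # replace only the first match
        print n, t                            # 1 heLo
        print index("banana", "nan")          # 3
        if (match("foo123bar", /[0-9]+/))     # sets RSTART and RLENGTH
            print RSTART, RLENGTH             # 4 3
        print substr("foo123bar", RSTART, RLENGTH)   # 123
        n = split("a:b:c", part, ":")
        print n, part[3]                      # 3 c
    }'
}

string_function_demo
```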




         The input/output built-in functions are:

         close(filename)
                   closes the file or pipe named filename.

         cmd | getline
                   pipes the output of cmd into getline; each successive call
                   to getline returns the next line of output from cmd.

         getline   sets $0 to the next input record from the current input
                   file.

         getline <file
                   sets $0 to the next record from file.

         getline x sets variable x instead.

         getline x <file
                   sets x from the next record of file.

         system(cmd)
                   executes cmd and returns its exit status.

         All forms of getline return 1 for successful input, 0 for end of
         file, and -1 for an error.
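
         The return values of getline suggest testing for a result greater
         than zero when reading in a loop, so that an error (-1) does not
         cause an endless loop.  A minimal sketch, assuming a POSIX-style awk
         installed as awk; the command string and the helper name
         read_command_lines are hypothetical:

```shell
# Assumption: "awk" is a POSIX-style awk standing in for nawk.
read_command_lines() {
    awk 'BEGIN {
        cmd = "echo one; echo two"            # hypothetical command to read from
        while ((cmd | getline line) > 0) {    # 1: record read, 0: end of file, -1: error
            n++
            last = line
        }
        close(cmd)                            # release the pipe when done
        print n, last
    }'
}

read_command_lines    # prints: 2 two
```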

         nawk also provides user-defined functions.  Such functions may be
         defined (in the pattern position of a pattern-action statement) as

               function name(args,...) { stmts }

         Function arguments are passed by value if scalar and by reference if
         array name.  Argument names are local to the function; all other
         variable names are global.  Function calls may be nested and
         functions may be recursive.  The return statement may be used to
         return a value.
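
         The argument-passing rules and recursion can be illustrated with a
         short sketch.  It assumes a POSIX-style awk installed as awk; the
         function names bump and fact and the shell helper function_demo are
         hypothetical.

```shell
# Assumption: "awk" is a POSIX-style awk standing in for nawk.
function_demo() {
    awk '
        # x is a private copy of the scalar; arr refers to the caller array
        function bump(x, arr) { x++; arr["n"]++; return x }
        # a recursive function
        function fact(n) { return (n <= 1) ? 1 : n * fact(n - 1) }
        BEGIN {
            v = 1; a["n"] = 1
            r = bump(v, a)
            print v, a["n"], r    # 1 2 2: v is unchanged, the array element is not
            print fact(5)         # 120
        }
    '
}

function_demo
```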

   EXAMPLES
         Print lines longer than 72 characters:

               length > 72

         Print first two fields in opposite order:

               { print $2, $1 }

         Same, with input fields separated by comma and/or blanks and tabs:

               BEGIN { FS = ",[ \t]*|[ \t]+" }
                     { print $2, $1 }



         Add up first column, print sum and average:

                     { s += $1 }
               END   { print "sum is", s, " average is", s/NR }

         Print fields in reverse order:

               { for (i = NF; i > 0; --i) print $i }

         Print all lines between start/stop pairs:

               /start/, /stop/

         Print all lines whose first field is different from previous one:

               $1 != prev { print; prev = $1 }

         Simulate echo(1):

               BEGIN {
                     for (i = 1; i < ARGC; i++)
                           printf "%s%s", ARGV[i], (i < ARGC - 1 ? " " : "")
                     printf "\n"
                     exit
                     }

         Print a file, filling in page numbers starting at 5:

               /Page/      { $2 = n++; }
                     { print }

         Assuming this program is in a file named prog, the following command
         line prints the file input, numbering its pages starting at 5:

               nawk -f prog n=5 input

   SEE ALSO
         egrep(1), grep(1), sed(1).
         lex(1), printf(3S) in the Programmer's Reference Manual.
         The awk chapter in the User's Guide.
         A. V. Aho, B. W. Kernighan, P. J. Weinberger, The AWK Programming
         Language, Addison-Wesley, 1988.

   NOTES
         nawk is a new version of awk that provides capabilities unavailable
         in previous versions.  This version will become the default version
         of awk in the next major UNIX system release.

         Input white space is not preserved on output if fields are involved.





         There are no explicit conversions between numbers and strings.  To
         force an expression to be treated as a number, add 0 to it; to force
         it to be treated as a string, concatenate the null string ("") to it.
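
         The two idioms can be seen to change the outcome of a comparison in
         the following sketch, which assumes a POSIX-style awk installed as
         awk.  The helper name conversion_demo and the variable names are
         hypothetical.

```shell
# Assumption: "awk" is a POSIX-style awk standing in for nawk.
conversion_demo() {
    awk 'BEGIN {
        s = "10"; t = "9"          # string constants
        print (s < t)              # compared as strings: "10" sorts before "9", prints 1
        print (s + 0 < t + 0)      # adding 0 forces numbers: 10 < 9 is false, prints 0
        n = 10; m = 9              # numbers
        print ((n "") < (m ""))    # concatenating "" forces strings, prints 1
    }'
}

conversion_demo
```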

















































