awk(1)                                                               awk(1)

NAME
     awk, nawk - pattern-directed scanning and processing language

SYNOPSIS
     awk [-Fc] [-v initialization] ... [--] prog [initialization ...]
         [file ...]

DESCRIPTION
     awk is a programmable text manipulation system. When you call awk, you
     specify the awk program to be executed and the files to be processed.
     The actions defined in the program are then applied to the contents of
     the specified files. awk does not alter its input files. By default,
     the results of the actions are written to standard output.

     awk offers the following advantages over text manipulation programs
     such as egrep and sed:

     -  awk operates on one record at a time. As with egrep and sed, an
        input record is defined as one line by default; but with awk you
        can change this setting and define some other unit of text as the
        record.

     -  Each input record is split into fields which can be accessed indi-
        vidually.

     -  A pattern (selection criterion) may be a condition defined by the
        logical combination of extended regular expressions and relational
        operators.

     -  You can program any actions that you require. awk is a high-level
        C-like programming language.

     A detailed description of awk is provided below in the following sec-
     tions:

     ⊕  Typical awk applications

     ⊕  Structure of an awk program

     ⊕  Operation of the awk command

     ⊕  The input file (records, fields, special variables)

     ⊕  Basic elements of the awk language (comments, constants, variables)

     ⊕  Expressions

     ⊕  Patterns

     ⊕  Actions (control-flow statements, functions).




Page 1                       Reliant UNIX 5.44                Printed 11/98


OPTIONS
     -Fc  Defines the field separator character for the input record (input
          field separator).

          c    Regular expression that defines a character to be inter-
               preted as the input field separator. Separators do not form
               part of the fields.

               Note:

               To be able to use "t" as the input field separator, you must
               specify it as follows on the awk command line or in the
               BEGIN section of the awk program:

               awk -F"[t]" ...   or   BEGIN {FS="[t]" ...}

               -Fc not specified:

               Blanks and tabs act as field separators.
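
     A minimal sketch of the bracket form described in the note above. The
     input line and the shell variable out are made up for illustration:

```shell
# Split on the literal letter "t" by writing it as a bracket expression.
out=$(printf 'atbtc\n' | awk -F'[t]' '{ print $2 }')
echo "$out"    # prints: b
```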

     -v initialization
          Assignments in the form: var=value.

          The var variable which appears in the program is initialized to
          value.

          var    Name of the variable to be initialized.

          value  Initial value to be assigned to var. value can be defined
                 in exactly the same way as an environment variable on
                 shell level.
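
     As a short illustration of -v (the variable name limit is
     hypothetical), the value is already available when the BEGIN action
     runs:

```shell
# limit is assigned before the program starts, so BEGIN can use it.
out=$(awk -v limit=100 'BEGIN { print limit + 1 }')
echo "$out"    # prints: 101
```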

     --   If prog begins with a dash (-), the end of the command-line
          options must be marked with --.

     prog awk program argument. Possible forms for prog are:

          'awk-program', i.e. an awk program written on the command line,
          or:

          -f progfile, i.e. the name of a file containing an awk program.














          'awk-program'
               An awk program written on the command line.

               You should always enclose the awk program in single quotes
               in order to prevent the shell from interpreting metacharac-
               ters. If the program is more than one line long, you must
               escape the newline character with a backslash.

               Example:

               Process the file named input and display all lines that have
               a "0" in the third field:

               $ awk '$3 == 0' input

          -f progfile
               The awk program is located in the file named progfile.

               You can specify a number of awk programs. Each awk program
               specified must be preceded by -f. Where multiple specifica-
               tions are made, awk processes the files in the specified
               order.

     initialization
          Assignments in the form: var=value.

          The var variable (whether it appears in the awk program or not)
          is initialized to value. initialization and file may be specified
          in any order. The assignment is made at the time when the named
          file is opened.

          Note:

          The $ variables (see BASIC ELEMENTS OF THE AWK LANGUAGE below)
          cannot be initialized in this way.

          var    Name of the variable to be initialized. The name must not
                 begin with $.

          value  Initial value to be assigned to var. value can be defined
                 in exactly the same way as an environment variable on
                 shell level.
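
     The following sketch shows initialization operands mixed with file
     operands. It assumes the timing described above (assignment when the
     following file is opened), which is what XPG4/POSIX awk implementations
     provide; see Known bugs for deviations on this system. The file names
     f1 and f2 and the variable tag are made up:

```shell
dir=$(mktemp -d)
printf 'one\n' > "$dir/f1"
printf 'two\n' > "$dir/f2"
# Each assignment is processed when awk reaches it in the operand list,
# that is, just before the file that follows it is opened.
out=$(awk '{ print tag, $0 }' tag=A "$dir/f1" tag=B "$dir/f2")
echo "$out"
rm -rf "$dir"
```

     With the assumed timing, the first file is printed with tag A and the
     second with tag B.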

     file Name of the text file to be processed. You may list more than one
          file if you wish. Files are read in the order in which they are
          listed. If file is a dash (-), awk reads from standard input.

          file not specified:

          awk reads from standard input. awk reads input one record at a
          time, processes it, and after each line outputs the result for
          that record. Hitting <CTRL-D> terminates your input.



   Known bugs

     Contrary to the description under initialization above, an assignment
     made via an initialization operand takes effect before the awk program
     starts, not when the named file is opened. As of SINIX V5.43, however,
     an assignment that is to take effect before the program starts can
     only be made with "-v initialization", as defined in XPG4.

TYPICAL AWK APPLICATIONS
     awk is a tool which makes text manipulation tasks easy to accomplish.
     Typical applications for awk include:

     -  selectively extracting data from files

     -  checking the contents of files

     -  performing calculations on the data in a file

     -  changing the format of input data.

     Using four simple examples, this section demonstrates how awk can be
     used.

     A file called supplies contains a list of office supplies. It includes
     the name of each article, along with its quantity and unit price:

     Pencil      100     0.60
     Table         5   345.00
     Lamp         20    79.80
     Paper        75     1.00
     Diskette   1000     2.40
     Envelope   1500     0.20

     Example 1

     Select all articles with a quantity greater than 100:

     $ awk '$2 > 100 {print}' supplies
     Diskette   1000     2.40
     Envelope   1500     0.20

     With $2 you access the second field of a line, which in this case is
     the quantity of each article. If the quantity is greater than 100, the
     condition is fulfilled, and the print function is executed. Since no
     arguments were specified for print, the whole line is output.











     Example 2

     Calculate the total price for all articles with a quantity greater
     than 100 and print this total along with the article name:

     $ awk '$2 > 100 {print $1 "\t" $2*$3}' supplies
     Diskette        2400
     Envelope        300

     Three arguments are entered for the print function in this example.
     The following is output:

     $1      article name (first field)

     \t      tab character

     $2*$3   quantity (second field) times unit price (third field)

     Example 3

     Include a heading in the output:

     $ awk 'BEGIN    {print "Article \tTotal"}
     >      $2 > 100 {print $1 "\t\t" $2*$3}' supplies
     Article         Total
     Diskette        2400
     Envelope        300

     This example illustrates the use of the BEGIN pattern. awk executes
     the action after BEGIN only once, i.e. when the program is started.
     The heading is therefore printed only once at the beginning.

     Example 4

     Print a grand total of all amounts at the end.

     For this purpose we use a variable called sum, which is initialized to
     zero in the BEGIN pattern. The product of column 2 and column 3 is
     calculated for each line, and all the products are summed up:

     $ awk 'BEGIN    {sum=0; print "Article \tTotal"}
     >     $2 > 100 {print $1 "\t\t" $2*$3; sum += $2*$3}
     >     END      {print "\nGrand total: " sum} ' supplies
     Article         Total
     Diskette        2400
     Envelope        300

     Grand total: 2700

     This example demonstrates the use of the END pattern. awk executes the
     action after END only once, i.e. before termination of the program.
     The grand total of all subtotals is therefore printed just once at the
     end.



STRUCTURE OF AN AWK PROGRAM
     An awk program can consist of a BEGIN section, a main section, and an
     END section, structured as shown below:
     ______________________________________________________________________

     [ BEGIN {action} ]                                       BEGIN-section
     [[pattern] {action}                                       main-section
     | pattern [{action}]
     | functiondefinition
     .
     .
     .
                                  ]
     [ END {action} ]                                           END-section
     ______________________________________________________________________

     pattern
          The pattern indicates which data is to be selected from the input
          files (see Patterns below).

     action
          The action indicates what to do with data that matches the pat-
          tern (see Actions below).

     functiondefinition
          A functiondefinition enables you to define your own functions
          (see FUNCTIONS below).

     At least one of the three sections (pattern, action or function-
     definition) must be present.

     In a pattern {action} pair, either the pattern or the action can be
     omitted. If the action is omitted, each line that matches the pattern
     is output; omitting the pattern causes the action to be performed on
     all lines.

     The definition of a user-defined function may appear at any position
     in the main section.

     Each of the following elements must be located at the start of a line
     (following any number of blanks or tabs):

     -  the BEGIN section,

     -  the [pattern]{action} and pattern [{action}] pairs,

     -  the function definitions,

     -  the END section.
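
     The structure above can be sketched as a small program file run with
     -f. The file name prog.awk, the function dbl and the input lines are
     made up for illustration:

```shell
dir=$(mktemp -d)
cat > "$dir/prog.awk" <<'EOF'
BEGIN           { print "start" }         # BEGIN section
function dbl(n) { return 2 * n }          # function definition
$1 > 1          { print $1, dbl($1) }     # pattern {action}
/x/                                       # pattern only: prints the line
END             { print "done" }          # END section
EOF
out=$(printf '1 x\n2 y\n' | awk -f "$dir/prog.awk")
echo "$out"
rm -rf "$dir"
```

     For the two input lines this prints start, 1 x, 2 4 and done, one per
     line.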






OPERATION OF THE AWK COMMAND
     awk executes the awk program that is specified by the user, proceeding
     in the following sequence:

     1. Initial processing

        The first step performed by awk is to initialize any variables that
        may have been defined. If there is a BEGIN section including an
        action, awk then executes the action specified there. The action in
        the BEGIN section is executed just once, before the first line is
        processed.

     2. File processing

        Next awk processes the specified input files by reading the input
        records sequentially. For each input record, awk tries to match
        each pattern in the order that is specified in the awk program. If
        a pattern is matched, i.e. the selection criterion is fulfilled,
        the associated action is performed.

        If no pattern is specified for an action, awk performs the action
        for every record.

        If no action is specified for a pattern, the default action is to
        output (print) the record.

        Multiple input files are processed in the specified order.

     3. Final processing

        When all the specified files have been processed, awk performs the
        action in the END section, if one has been included. awk then exits.
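
     The three processing steps can be observed with a small sketch (the
     two input records a and b are made up):

```shell
out=$(printf 'a\nb\n' | awk '
BEGIN { print "begin" }          # step 1: once, before any input
/a/   { print "first: "  $0 }    # step 2: patterns tried in program order
      { print "second: " $0 }    # no pattern: runs for every record
END   { print "end, NR=" NR }    # step 3: once, after all input
')
echo "$out"
```

     Record a matches both rules and is reported twice, in program order;
     record b is reported only by the rule without a pattern.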

THE INPUT FILE
     An input file consists of records that are subdivided into fields.

     Records

     Records are separated by a record separator. The record separator does
     not form part of a record. By default, a record is one line, and the
     record separator is the newline character. However, you do have the
     option of changing this setup by assigning any single character to the
     special variable RS (Record Separator). If you specify a string of
     characters as a value for RS, only the first character will be taken
     into account. The ordinal number of the current record is available in
     the variable NR (Number of Records). If there is more than one input
     file, NR counts from the start of the first file to the end of the
     last one. The special variable $0 addresses the whole of the current
     record. Further information on variables is provided in the section on
     BASIC ELEMENTS OF THE AWK LANGUAGE.
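
     A minimal sketch of a non-default record separator. The input is made
     up and deliberately has no trailing newline, so that the last record
     is exactly c:

```shell
# Use ";" instead of newline as the record separator.
out=$(printf 'a;b;c' | awk 'BEGIN { RS = ";" } { print NR ": " $0 }')
echo "$out"
```

     Each of the three records is printed on its own line, prefixed with
     its record number from NR.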





     Fields

     Each record is split into fields separated by one or more field
     separators. The default field separator is white space (any sequence
     of tabs and blanks), but you do have the option of changing this by
     assigning any other character to the special variable FS (Field
     Separator). You can make this assignment either in the awk program or
     by using option -F on the command line. The value assigned to FS is
     interpreted as an extended regular expression [see expressions(5)].

     Example 1

     To define the characters x and y as alternate field separators:

     syntax on the awk command line: -F"[xy]"

     syntax in the awk program: FS="[xy]"

     Example 2

     To define the field separator as one or more occurrences of the char-
     acter x:

     syntax on the awk command line: -F"x+"

     syntax in the awk program: FS="x+"

     The default setting (any sequence of blanks and tabs) can be expressed
     by the regular expression "[<blank>\t]+", where <blank> stands for a
     blank, and \t represents a tab.

     Note that the newline character is always interpreted as a field
     separator, regardless of the value assigned to FS!

     The number of fields in the current record is stored in the variable
     NF (Number of Fields). Individual fields of the current record are
     addressed by the predefined variables $1, $2, ..., $NF. Further
     information on variables is provided in the section on BASIC ELEMENTS
     OF THE AWK LANGUAGE.
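
     The two ways of setting the field separator can be sketched as follows
     (input lines and the shell variables out1 and out2 are made up):

```shell
# -F: makes ":" the field separator; NF is the number of fields.
out1=$(printf 'root:x:0:0\n' | awk -F: '{ print NF, $1, $NF }')
# FS set in the program to a bracket expression: x or y separates fields.
out2=$(printf 'axbyc\n' | awk 'BEGIN { FS = "[xy]" } { print $1, $2, $3 }')
echo "$out1"    # prints: 4 root 0
echo "$out2"    # prints: a b c
```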
















     Example 1

     Default setup

     Field 1 Field 2   ...       Field 5
     This    is        the first record          <--  Record 1
     |__|    ||                  |____|

     and this is the second record               <--  Record 2
     |_|                    |____|
     Field 1      ...       Field 6

     Example 2

     Modified setup: RS="%"; FS=":";

      Field 1   Field 2     Field 3
     %Name    : Address : Phone number                    <--  Record 1
      |______| |_______| |___________|

        Field 1             Field 2
     %Peter Smith:719 Charles St. Baltimore MD 21227   --\
      |_________| |________________________________|      |->  Record 2
     301-7882874%                                      __/
     |_________|
      Field 3

     Rules for record and field separators

     ⊕  Default settings for record separators

        -  The default record separator is the newline character.

        -  If the null string is assigned to RS (RS=""), the file is
           treated as a single record. If several files are specified, each
           file will consist of a single record (which means that the ulti-
           mate value of NR will be equal to the number of files).

     ⊕  Default settings for field separators

        -  If the record separator is newline, the field separator defaults
           to blanks and tabs.

        -  If the record separator is not a newline, the newline character
           always counts as a field separator, regardless of which charac-
           ter has been explicitly defined as the field separator (see
           Fields above, Example 2).








        -  If you explicitly assign a blank to FS, either with -F" " on the
           awk command line or by using the assignment FS=" ", then blanks
           and tab characters are treated as field separators.

        -  On the other hand, if you explicitly assign the tab character to
           FS (FS="\t"), then only the tab character is treated as the
           field separator and not the blank.

     ⊕  Leading field separators and field separator strings

        -  The following applies to blanks, tabs and newlines as field
           separators:

           1. Leading field separators are ignored.

           2. Multiple occurrences of a field separator are treated as a
              single field separator (see EXAMPLES, Example 9).

        -  For all other field separators, leading field separators are
           counted. In multiple occurrences of a field separator, each
           character is counted separately. Thus two consecutive field
           separators are deemed to have an empty field between them (see
           EXAMPLES, Example 10).

     ⊕  Changing separators

        If you need a number of different record separators in one file,
        you can change RS within the awk program. The new record separator
        comes into effect as soon as the assignment to RS has been executed.
        Similarly, if you require a number of different field separators in
        one file, you can change FS within the awk program. The new field
        separator comes into effect as soon as the assignment to FS has been
        executed.
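
     The rules for leading and repeated field separators can be checked
     with a short sketch (both input lines are made up):

```shell
# Default whitespace splitting: leading blanks are ignored, runs collapse.
out1=$(printf '  a   b\n' | awk '{ print NF }')
# Explicit ":" separator: every ":" counts, so empty fields appear.
out2=$(printf ':a::b\n' | awk -F: '{ print NF }')
echo "$out1 $out2"    # prints: 2 4
```

     In the second case the four fields are the empty string, a, the empty
     string and b.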

   Special variables for the input file

     The following table contains all special awk variables pertaining to
     the input file. The value awk usually assigns to these variables is
     indicated in the second column.
















     ______________________________________________________________________
    | Variable  | Value set by awk                                        |
    |___________|_________________________________________________________|
    | FILENAME  | Name of the current input file, - for standard input    |
    |___________|_________________________________________________________|
    | FS        | Input field separator (default: any sequence of blanks  |
    |           | and tabs)                                               |
    |___________|_________________________________________________________|
    | NF        | Number of fields in the current record                  |
    |___________|_________________________________________________________|
    | NR        | Ordinal number of the current record from start of input|
    |___________|_________________________________________________________|
    | FNR       | Ordinal number of the current record in the current file|
    |___________|_________________________________________________________|
    | RS        | Input record separator (default: newline)               |
    |___________|_________________________________________________________|
    | $0        | Current record                                          |
    | $1        | First field of the current record                       |
    | $2        | Second field of the current record                      |
    | ...       |                                                         |
    | $NF       | Last field of the current record                        |
    |___________|_________________________________________________________|

     You can change these variables within an awk program if you wish. This
     does not alter the input file. Further information on variables is
     provided in the next section, BASIC ELEMENTS OF THE AWK LANGUAGE.
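
     A sketch showing FILENAME, NR and FNR side by side. The file names
     a.txt and b.txt and their contents are made up:

```shell
dir=$(mktemp -d)
printf 'x\ny\n' > "$dir/a.txt"
printf 'z\n'    > "$dir/b.txt"
# NR keeps counting across files; FNR restarts at 1 for each file.
out=$(cd "$dir" && awk '{ print FILENAME, NR, FNR, $0 }' a.txt b.txt)
echo "$out"
rm -rf "$dir"
```

     The single record of b.txt is reported with NR 3 but FNR 1.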





























BASIC ELEMENTS OF THE AWK LANGUAGE
     This section gives a synopsis of the basic elements of the awk
     language. You will need these elements in order to define pattern and
     action pairs.

   Comments

     You can include comments in an awk program, as in a shell script. A
     comment begins with the # character and continues till the end of the
     line.

   Constants

     There are two types of constant:

     ______________________________________________________________________

     number
     string
     ______________________________________________________________________

     number
          A number (numeric constant) is a signed or unsigned integer or
          floating point number. awk does not check its format. If your
          number contains invalid characters, awk attempts to filter out a
          valid part and ignores the rest.

          integer
               An integer is a sequence of digits from 0 to 9.

          floating point number
               A floating point number consists of a mantissa with or
               without an exponent.

               The mantissa comprises an integer with or without a frac-
               tional part. The fractional part is represented by a radix
               character and an integer.

     string
          A string (alphanumeric constant) is a sequence of characters,
          enclosed in double quotes "...". If the double quotes are omit-
          ted, awk will interpret the string as a variable name, a number,
          or an operator.

          character
               Even a single character must be enclosed in double quotes to
               prevent awk from interpreting it as the name of a variable.
               A character is any representable character from the currently
               valid character set [see ascii(5) and meta-ascii(5)] or one of
               the following metacharacters, represented as in C:





               \"   for   "
               \\   for   \
               \n   for   newline
               \t   for   tab
               \b   for   backspace
               \r   for   carriage return
               \f   for   form feed
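
     The treatment of constants can be sketched as follows. The sample
     values are made up; the first command relies on awk using only the
     leading numeric part of a string, as described under number above:

```shell
# Strings used as numbers: the valid leading part is used, the rest ignored.
nums=$(awk 'BEGIN { print "3abc" + 0, "1e2" + 0, "abc" + 0 }')
# String constant with escapes: tab, backslash and double quote.
str=$(awk 'BEGIN { print "a\tb\\c\"d" }')
echo "$nums"    # prints: 3 100 0
echo "$str"
```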

   Variables

     awk allows you to use simple variables and arrays to store values. The
     special variables are predefined; others can be defined by the user.

     Name of a variable
          The name of a user-defined variable can be any string made up of
          underscores (_), uppercase and lowercase letters and digits,
          beginning with a letter or an underscore. Reserved words of awk,
          e.g. the keywords of control-flow statements, must not be used as
          variable names.

     Data type
          Variables do not have a data type. You can thus assign either a
          number or a string to any variable. If the context is clearly
          numeric, variables are treated as numeric; otherwise, they
          default to alphanumeric. Numeric variables are converted into
          alphanumeric variables internally using the format stored in
          CONVFMT.

          Example:

          x = "Miller";    # Variable x contains the string Miller
          x = "3"+4   ;    # Variable x has a value of 7
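
     A slightly longer sketch of context-dependent typing and of CONVFMT.
     The variable names and the format %.2f are chosen for illustration
     only:

```shell
out=$(awk 'BEGIN {
    x = "3" + 4           # numeric context: the string "3" becomes 3
    y = x " apples"       # string context: 7 becomes "7"
    pi = 3.14159
    CONVFMT = "%.2f"      # format used for number-to-string conversion
    s = pi ""             # concatenation forces a string conversion
    print x; print y; print s
}')
echo "$out"
```

     This prints 7, 7 apples and 3.14. Integer values such as 7 are
     converted as integers and are not affected by CONVFMT.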

     Declaration
          awk variables do not need to be explicitly declared. User-defined
          variables are automatically declared the first time they are
          used.

     Initialization
          Special variables are initialized to predefined values by awk.
          Depending on the context, user-defined variables are initialized
          by awk to the null string or to 0 by default. If you wish, you
          can specify other initial values when you call awk.

     Note:

     -  When i>NF, $i will not always be the null string.

     -  $ variables cannot be initialized on the command line.






     Special variables

     awk recognizes the special variables listed in the table below. The
     values awk usually assigns to these variables are indicated in the
     table. New values may be assigned to the variables by the user.
     _______________________________________________________________________
    | Variable  |  Value set by awk                                        |
    |___________|__________________________________________________________|
    | ARGC      |  Number of elements in the array ARGV                    |
    |___________|__________________________________________________________|
    | ARGV      |  Array holding the command line arguments (excluding     |
    |           |  options and the prog argument), numbered from 0 to      |
    |           |  ARGC-1                                                  |
    |___________|__________________________________________________________|
    | CONVFMT   |  Format for the internal conversion of numbers into char-|
    |           |  acter strings (see FUNCTIONS below, printf) (default:   |
    |           |  %.6g, up to 6 significant digits)                       |
    |___________|__________________________________________________________|
    | ENVIRON   |  Array holding the values of environment variables, where|
    |           |  the indexes are the names of the variables              |
    |___________|__________________________________________________________|
    | FILENAME  |  Name of the current input file, - for standard input    |
    |___________|__________________________________________________________|
    | FS        |  Input field separator (default: any sequence of blanks  |
    |           |  and tabs)                                               |
    |___________|__________________________________________________________|
    | NF        |  Number of fields in the current record                  |
    |___________|__________________________________________________________|
    | NR        |  Ordinal number of the current record from start of input|
    |___________|__________________________________________________________|
    | FNR       |  Ordinal number of the current record in the current file|
    |___________|__________________________________________________________|
    | OFS       |  Output field separator (default: one blank)             |
    |___________|__________________________________________________________|
    | ORS       |  Output record separator (default: newline)              |
    |___________|__________________________________________________________|
    | OFMT      |  Output format for floating point numbers (see FUNCTIONS |
    |           |  below, printf) (default: %.6g, up to 6 significant      |
    |           |  digits)                                                 |
    |___________|__________________________________________________________|
    | RS        |  Input record separator (default: newline)               |
    |___________|__________________________________________________________|
    | RLENGTH   |  Length of the string matched by the match function      |
    |___________|__________________________________________________________|
    | RSTART    |  Starting position of the string matched by the match    |
    |           |  function. Numbering begins with 1. This value always    |
    |           |  corresponds to the value returned by the match function.|
    |___________|__________________________________________________________|
    | SUBSEP    |  Subscript string separator for multi-dimensional arrays.|
    |           |  The default setting is \034.                            |
    |___________|__________________________________________________________|
    | $0        |  Current record                                          |
    |___________|__________________________________________________________|
    | $n        |  Field n of the current record                           |
    |___________|__________________________________________________________|
    | $NF       |  Last field of the current record                        |
    |___________|__________________________________________________________|
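
     A few of the special variables in the table can be tried out with the
     following sketch. The input line, the operand x and the environment
     variable GREETING are made up:

```shell
# OFS is placed between the comma-separated arguments of print.
out1=$(printf 'a b c\n' | awk 'BEGIN { OFS = "-" } { print $1, $2, $3 }')
# match sets RSTART and RLENGTH for the matched substring.
out2=$(awk 'BEGIN { if (match("foobar", /o+/)) print RSTART, RLENGTH }')
# ARGV holds the operands after the program text; ENVIRON the environment.
out3=$(GREETING=hello awk 'BEGIN { print ARGC - 1, ARGV[1], ENVIRON["GREETING"] }' x)
echo "$out1"    # prints: a-b-c
echo "$out2"    # prints: 2 2
echo "$out3"    # prints: 1 x hello
```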

     What is the effect of changing special variables?

     Example:

     The assignment

          $1 = "new";

     assigns the string new to $1 in awk's internal copy of the current
     record; it does not alter the first field of the record in the input
     file.

     This also applies to the following awk settings relating to the input
     file:

     1.   The current input file does not change when you assign a new name
          to FILENAME.

     2.   When you assign a value to a variable $i where i>NF, NF is
          assigned the value i.

     3.   If you assign a new value to NR, you only alter the number
          assigned to the current line; you do not move to a different
          line.

          Example:

               The contents of $0 remain the same even if NR is modified:

               {print NR, $0; NR=NR+34; print NR, $0}

               A typical output would then be:

               10 This is the tenth line
               44 This is the tenth line

     Caution:
          When you assign a new value to a variable, its old value is
          deleted. Thus, if you change NF, for example, the information on
          the number of fields in the current record is lost.
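
     The effect of assigning to $ variables can be sketched as follows. The
     file name in and its contents are made up:

```shell
dir=$(mktemp -d)
printf 'a b c\n' > "$dir/in"
out=$(awk '{
    $1 = "new"; print      # $0 is rebuilt from the fields
    $5 = "e";  print NF    # assigning beyond NF raises NF to 5
    print                  # field 4 was created empty
}' "$dir/in")
echo "$out"
file=$(cat "$dir/in")      # the input file itself is untouched
rm -rf "$dir"
```

     The last line printed contains two blanks between c and e, because the
     empty fourth field is still delimited by the output field separator.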





     Peculiarity of $ variables:

     You can specify the number of a $ variable as a constant or as an
     expression which evaluates to the number.

     Example:

     You can use

          $(NF-1)

     to access the second-last field.
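
     For instance, on a made-up four-field record:

```shell
# $(NF-1) is the field just before the last one; $NF is the last.
out=$(printf 'a b c d\n' | awk '{ print $(NF-1), $NF }')
echo "$out"    # prints: c d
```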

   Array

     An array is a set of constants or variables.

     An array element is addressed as follows:
     ______________________________________________________________________

     arrayname[index]
     ______________________________________________________________________

     arrayname
          Name of a variable.

     index
          A number, a string, or an expression that evaluates to an index
          value. The index may therefore be numeric or alphanumeric.

     awk arrays are special in two respects:

     -  Dynamic arrays

        Arrays, like simple variables, do not need to be declared. Above
        all, there is no need to define dimensions. New array elements are
        created automatically as and when required.

     -  Associative arrays

        Individual array elements can be accessed via an alphanumeric
        index.

        A special control-flow statement is provided in order to process
        all elements of an associative array:

        for(index in array) statement

        index assumes each index value present in the array in turn, in
        an unspecified order, and the specified statement is executed once
        for each array element (see CONTROL-FLOW STATEMENTS below, for).

     Example:

     A file called expenses contains various expenses incurred. For each
     item of expenditure the file shows the day, month, amount, and a
     brief description, separated by colons. For example:

     01:January:   40.78:Supplies
     05:January: 6789.00:Laser printer
     23:March:    240.32:Lamps
     11:January:  478.00:Chairs
     01:February:  45.00:Journals

     Using an associative array you can easily calculate total expenditure
     for each month from the data in this file. The program in the example
     uses an array called mexpenses and the names of the months as an
     alphanumeric index. For each line, the expenses in the third field
     ($3) are summed up to produce total expenditure for each month appear-
     ing in the second field ($2).

     $ awk 'BEGIN {FS=":"}
     >      {mexpenses[$2] += $3;}
     >      END {for (i in mexpenses) print "Total spent in",
     >           i, mexpenses[i]  } ' expenses
     Total spent in January 7307.78
     Total spent in February 45
     Total spent in March 240.32
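
     As a minimal sketch of the dynamic behaviour described above, the
     following program tests for an element with the in operator before
     and after first use. It reuses the array name mexpenses from the
     example; the amount is taken from the sample data.

```shell
out=$(awk 'BEGIN {
    # "in" tests for an element without creating it
    if (!("March" in mexpenses)) print "no March entry yet"
    mexpenses["March"] += 240.32     # element is created on first use
    if ("March" in mexpenses) print "March", mexpenses["March"]
}')
printf '%s\n' "$out"
```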

   Expressions

     An expression can be any of the following:

     -  constant

     -  variable

     -  functioncall

     -  unop expression

     -  expression binop expression

     -  (expression)

     -  expression ? expression : expression

     constant
        Numeric or alphanumeric constant (see BASIC ELEMENTS OF THE AWK
        LANGUAGE above).

     variable
        Variable (see BASIC ELEMENTS OF THE AWK LANGUAGE above).

     functioncall
        Invocation of a predefined function (see FUNCTIONS below).

     expression
        Expression.

     unop
        Unary operator (see the table of operators below).

     binop
        Binary operator (see the table of operators below).

     Expressions are evaluated and return a value. They may appear both in
     patterns and in actions.

     awk operators

     awk recognizes most C operators (not the bitwise ones, for example)
     plus the operators for pattern matching and string concatenation. The
     following table lists all awk operators in ascending order of
     precedence. Operators in the same row have the same precedence.

     Warning:

     The precedences have changed since previous versions (e.g. "!"). Check
     existing awk programs for ambiguous instances.

     Expression Operators

  _________________________________________________________________________
 | Operation            | Operators    | Example | Meaning of Example     |
 |______________________|______________|_________|________________________|
 | assignment           | =  +=  -=  *=| x *= 2  | x = x * 2              |
 |                      | /=  %=  ^=   |         |                        |
 |______________________|______________|_________|________________________|
 | conditional          | ?:           | x?y:z   | if x is true then y    |
 |                      |              |         | else z                 |
 |______________________|______________|_________|________________________|
 | logical OR           | ||           | x || y  | 1 if x or y is true, 0 |
 |                      |              |         | otherwise              |
 |______________________|______________|_________|________________________|
 | logical AND          | &&           | x && y  | 1 if x and y are true, |
 |                      |              |         | 0 otherwise            |
 |______________________|______________|_________|________________________|
 | array membership     | in           | i in a  | 1 if a[i] exists, 0    |
 |                      |              |         | otherwise              |
 |______________________|______________|_________|________________________|
 | matching             | ~  !~        | $1 ~ /x/| 1 if the first field   |
 |                      |              |         | contains an x, 0 other-|
 |                      |              |         | wise                   |
 |______________________|______________|_________|________________________|
 | relational           | <  <=  ==  !=| x == y  | 1 if x is equal to y,  |
 |                      | >=  >        |         | 0 otherwise            |
 |______________________|______________|_________|________________________|
 | concatenation        |              | "a" "bc"| "abc"; there is no     |
 |                      |              |         | explicit concatenation |
 |                      |              |         | operator               |
 |______________________|______________|_________|________________________|
 | add, subtract        | +  -         | x + y   | sum of x and y         |
 |______________________|______________|_________|________________________|
 | multiply, divide, mod| *  /  %      | x % y   | remainder of x divided |
 |                      |              |         | by y                   |
 |______________________|______________|_________|________________________|
 | unary plus and minus | +  -         | -x      | negated value of x     |
 |______________________|______________|_________|________________________|
 | logical NOT          | !            | !$1     | 1 if $1 is zero or     |
 |                      |              |         | null, 0 otherwise      |
 |______________________|______________|_________|________________________|
 | exponentiation       | ^  **        | x ^ y   | x to the power of y    |
 |______________________|______________|_________|________________________|
 | increment, decrement | ++  --       | ++x, x++| add 1 to x             |
 |______________________|______________|_________|________________________|
 | field                | $            | $i+1    | value of i-th field,   |
 |                      |              |         | plus 1                 |
 |______________________|______________|_________|________________________|
 | grouping             | ( )          | ($i)++  | add 1 to the value of  |
 |                      |              |         | i-th field             |
 |______________________|______________|_________|________________________|
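
     A few rows of the table can be checked with a sketch like the one
     below. The input record and the variable i are assumptions made for
     illustration only.

```shell
# Illustrative input only: two numeric fields.
out=$(printf '10 20\n' | awk '{
    i = 1
    print $i+1          # field operator binds first: ($1) + 1
    print $(i+1)        # grouping first: field number 2
    print 1 " " 2+3     # addition is done before concatenation
    print 2+3*4         # multiplication before addition
}')
printf '%s\n' "$out"
```

     The four lines written should be 11, 20, 1 5 and 14.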

     Evaluation of expressions

     Since no data type is prescribed for the operands, you can freely mix
     numeric and alphanumeric constants. awk determines from the context
     whether a numeric or alphanumeric operation is required.

     Please note that, as in C, there are no special truth values. awk
     treats a value of 0 or the null string as false and any other value
     as true. This means that any such value as an argument of a logical
     operation is held to be true. The result of a logical operation is
     represented as 1 if it is true and as 0 if it is false.

     Example:

          $4 ~ /Asia/

     is 1 if the fourth field of the current line contains Asia as a sub-
     string, or 0 if it does not.
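
     The following sketch illustrates how the context decides between a
     numeric and an alphanumeric operation, and how truth values are
     represented. The input record is invented for the purpose.

```shell
# Illustrative input only: name, two numbers, continent.
out=$(printf 'Japan 120 377 Asia\n' | awk '{
    print ($4 ~ /Asia/)      # match succeeds: 1
    print ($4 ~ /Europe/)    # match fails: 0
    print $2 + $3            # numeric context: the fields are added
    print $2 $3              # string context: the fields are concatenated
    print !$2, !""           # non-zero is true, the null string is false
}')
printf '%s\n' "$out"
```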

   Patterns

     Patterns (selection criteria) are specified by the user as a means of
     indicating which data is to be selected from the input files. A pat-
     tern can have any of the following forms:

     -  /regexp/

     -  relexp

     -  matchexp

     -  patternrange

     -  compoundpattern

     /regexp/
        Regular expression.

        awk supports extended regular expressions [see expressions(5)]. A
        regular expression is enclosed in slashes /.../.

        Warning:

        Previous versions of awk had a special syntax for repeats (m,n).
        Existing awk scripts should therefore be checked.

        Example:

        A regular expression matching one or more consecutive occurrences
        of a, b or c.

           /[abc]+/

     relexp
        relexp is an expression (see Expressions above) featuring rela-
        tional operators. The operators and their meanings are:

        a >  b      a greater than b
        a >= b      a greater than or equal to b
        a <  b      a less than b
        a <= b      a less than or equal to b
        a == b      a equal to b
        a != b      a not equal to b

        Operands a and b are any expressions. If both operands are numeric,
        the comparison is numeric; if not, it is alphanumeric.

     matchexp
        matchexp is an expression (see Expressions above) featuring pattern
        matching operators. It involves the comparison of a regular expres-
        sion (pattern) with a string. The pattern matching operators and
        their meanings are:

        str ~ p     string str must match pattern p

        str !~ p    string str must not match pattern p

        Using matchexp as a pattern allows you to select individual fields.

        Example:

        Select all records with a first field starting with A or a:

             $1 ~ /^[Aa]/

        The regular expression ^[Aa] represents strings that begin with A
        or a. The first field of the record ($1) must match (~) the regular
        expression, i.e. begin with A or a.

     patternrange
        A pattern range takes the form:

        /regexp/, /regexp/

        Specifying a range causes the associated action to be executed for
        all records that lie within the range, including both limits. The
        limits of the range (start and end) are defined by two regular
        expressions. The range begins with the first record containing a
        string that matches the first regular expression and ends with the
        next record, starting from that one, containing a string that
        matches the second regular expression. After the end of a range,
        awk looks for the start of a new one.

        Example:

        Select the range from the first line beginning with C to the first
        line beginning with K and output the first field of every line in
        the selected range.

             /^C/, /^K/ {print $1}
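
        Applied to a small made-up list of names (an assumption for
        illustration), the range pattern above could be tried out as
        follows; both limits belong to the selected range.

```shell
# Illustrative input only: five one-field records.
out=$(printf 'Adam\nCarl\nDora\nKurt\nLena\n' |
      awk '/^C/, /^K/ { print $1 }')
printf '%s\n' "$out"
```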

     compoundpattern
        Logical operators (see Expressions above) can be used to negate
        patterns and to combine several of them to form a single pattern.
        The logical operators and their meanings are:

        !pat           Negation of pattern pat

        pat1 || pat2   pat1 or pat2. The criterion is satisfied if pat1 or
                       pat2 matches.

        pat1 && pat2   pat1 and pat2. The criterion is satisfied if both
                       pat1 and pat2 match.

        (pat)          Parentheses.

        A compound condition is evaluated from left to right.

        Example:

        Match all records that have an even number of fields and a letter
        between M (inclusive) and Q (exclusive) in the first field.

             NF%2==0 && $1 >= "M" && $1 < "Q"

        You can generally combine patterns in several ways in order to make
        the same selection. Thus, if the currently valid collating sequence
        defines the range [M-Q] as the uppercase letters M, N, O, P and Q,
        the above selection could also be made with pattern matching opera-
        tors:

             NF%2==0 && $1 ~ /^[MNOP]/

        Since the first awk condition depends on the collating sequence of
        the currently valid character set, it may not return the same
        result in every case. The second awk line, by contrast, will always
        select only those records in which the first field begins with the
        letter M, N, O or P.
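
        Both forms of the selection can be compared with a sketch like the
        following. The sample records are invented, and LC_ALL=C is set for
        the first form only to pin down the collating sequence for this
        illustration.

```shell
# Illustrative input only.
data='Mike 1
Quinn 2
Nora 3 x
Paul 4'
# Relational form: result depends on the collating sequence in effect.
by_compare=$(printf '%s\n' "$data" |
    LC_ALL=C awk 'NF%2==0 && $1 >= "M" && $1 < "Q" { print }')
# Pattern matching form: the letters are listed explicitly.
by_regexp=$(printf '%s\n' "$data" |
    awk 'NF%2==0 && $1 ~ /^[MNOP]/ { print }')
printf '%s\n' "$by_compare" "$by_regexp"
```

        With this data both programs should select the Mike and Paul
        records: Quinn falls outside the letter range, and the Nora record
        has an odd number of fields.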

   Actions

     Actions indicate what to do when a pattern is matched. An action will
     typically involve processing the selected record. An action has to
     begin on the same line as the associated pattern. If this is not
     possible, the newline character must be escaped with a backslash.
     Blanks and tabs between the action and the pattern are ignored.

     An action comprises one or more statements and must be enclosed in
     curly braces { } as shown below:
     ______________________________________________________________________

     {statement [;statement] ...}
     ______________________________________________________________________

     Statements

     A statement can be any of the following:

     -  expression

     -  controlstatement

     expression
        An expression is evaluated but is not put to any further use unless
        expression is in the form of an assignment, an increment or a
        decrement (see Expressions above).

     controlstatement
        A controlstatement allows you to control the flow of an awk pro-
        gram (see CONTROL-FLOW STATEMENTS below).

     A single statement may be spread over several lines, in which case
     each line except the last must end with a backslash. The backslash
     escapes (cancels the effect of) the newline character.

     Multiple statements

     You can group together a number of statements within one pair of curly
     braces { }. Statements are delimited by means of:

     -  a semicolon ;

     -  a right brace }

     -  a newline character.
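
     A minimal sketch of one action that uses all three delimiters; the
     variable names sum and prod and the input are illustrative
     assumptions.

```shell
# Illustrative input only: two numeric fields.
out=$(printf '2 3\n' | awk '{ sum = $1 + $2; print sum
    prod = $1 * $2
    print prod }')
printf '%s\n' "$out"
```

     The first two statements are separated by a semicolon, the following
     ones by newline characters, and the last one is ended by the right
     brace.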

CONTROL-FLOW STATEMENTS
     Control-flow statements allow you to control the flow of an awk pro-
     gram. awk recognizes the following control-flow statements:

     __________________________________________________________
    | Statement      |  Meaning                               |
    |________________|________________________________________|
    | break          |  terminate a loop                      |
    |________________|________________________________________|
    | continue       |  skip remainder of loop                |
    |________________|________________________________________|
    | exit           |  terminate the awk program             |
    |________________|________________________________________|
    | for            |  loop counter and looping an array     |
    |________________|________________________________________|
    | if             |  conditional statement                 |
    |________________|________________________________________|
    | next           |  skip to the next input record         |
    |________________|________________________________________|
    | while          |  execute iteratively                   |
    |________________|________________________________________|
    | do             |  execute iteratively                   |
    |________________|________________________________________|
    | delete array[i]|  delete element i of the named array   |
    |________________|________________________________________|
    | return x       |  return from a function with a value   |
    |________________|________________________________________|
    | return         |  return from a function without a value|
    |________________|________________________________________|

     The control-flow statements are described below in alphabetical order.

   break - Terminate a loop

     break can be used in the body of a for, while, or do loop. break
     causes an immediate exit from the enclosing loop.
     ______________________________________________________________________

     break
     ______________________________________________________________________

     Example:

     While records continue to start with a dot, keep reading in the next
     record. Terminate the loop if the second field of the retrieved record
     is greater than 1000 or if no further record can be read.

        { while($1 ~ /^\./)
            {
               if((getline) <= 0) break;   # end of input or read error
               if($2 > 1000) break;
            }
        }

   continue - Skip remainder of loop

     continue can be used in the body of a for, while or do loop. The con-
     tinue statement causes the current iteration to be terminated and the
     next one to begin.

     ______________________________________________________________________

     continue
     ______________________________________________________________________

     Example:

     Print even fields only:

        {
           i=0;
           while(++i <= NF)
              {
                if(i%2) continue;   # odd field number: skip to next iteration
                else print $i
              }
        }

   delete - Delete an array element

     ______________________________________________________________________

     delete array[i]
     ______________________________________________________________________

     delete removes the element with the index i from array. Used in a
     loop, it can delete all elements from an array.

     Example:

     This loop removes all the elements from the array max:

       for (i in max)
          delete max[i]
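
     The sketch below, with made-up array contents, first deletes a single
     element and then empties the array with the loop shown above,
     counting the remaining elements each time.

```shell
out=$(awk 'BEGIN {
    max["a"] = 1; max["b"] = 2; max["c"] = 3   # illustrative contents
    delete max["b"]                 # remove a single element
    n = 0; for (i in max) n++
    print n, ("b" in max)           # two elements left, "b" is gone
    for (i in max) delete max[i]    # remove the remaining elements
    n = 0; for (i in max) n++
    print n
}')
printf '%s\n' "$out"
```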

   do - Execute iteratively

     The statement in a do loop (or a do-while loop) is executed itera-
     tively while a specified condition continues to be satisfied. In con-
     trast to the while loop, the statement in a do loop is always executed
     at least once.

     ______________________________________________________________________

     do {statement} while (expression)
     ______________________________________________________________________

     statement
          Statement that is executed in each iteration of the loop. If
          several statements are to be executed, they have to be grouped
          together in curly braces { }.

     expression
          Expression (see Expressions above) that specifies the condition.

     Example:

          Print out the individual fields of a record:

          { i=0; do {print $(++i)} while (i < NF) }

   exit - Terminate the awk program

     exit terminates the awk program.

     If an END section is present, awk executes the action specified in it;
     if not, the program is terminated immediately.

     ______________________________________________________________________

     exit
     ______________________________________________________________________

     Example:

     If the commercial at symbol @ appears in the input, print the result
     and terminate processing:

        ...
        /@/ {exit}
        ...
        END {print result}
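
     A complete, runnable variant of this outline might look as follows.
     The summing rule and the input are assumptions added for
     illustration; they are not part of the outline above.

```shell
# Illustrative input only: numbers, with an @ in the third record.
out=$(printf '1\n2\n@\n4\n' | awk '
    /@/ { exit }            # stop reading; the END action still runs
        { result += $1 }    # hypothetical processing: sum the records
    END { print result }')
printf '%s\n' "$out"
```

     Only the records before the @ are added up, so the result written
     should be 3.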

   for - Loop counter

     The statement in a for loop is executed iteratively while a condition
     continues to be satisfied.

     ______________________________________________________________________

     for(expr1; expr2; expr3) statement
     ______________________________________________________________________

     expr1
          Expression (see Expressions above).

          expr1 is evaluated once at the start of the for statement. expr1
          is often used to initialize incrementing variables.

          Example:

               i=1

     expr2
          Expression (see Expressions above).

          expr2 is evaluated before each iteration. The specified statement
          is executed only if expr2 is non-zero (true); otherwise, the loop
          is terminated.

          Example:

               i<10

     expr3
          Expression (see Expressions above).

          expr3 is evaluated after each iteration. When incrementing vari-
          ables are used, expr3 increments the variable.

          Example:

               i++

     statement
          Statement that is executed in each iteration of the loop. If
          several statements are to be executed, they have to be grouped
          together in curly braces { }.

     Example:

          Print out the fields of the current record in reverse order:

               {for(i=NF; i>0; i--) print $i}

   for - Looping an array

     This variant of the for statement is a special awk facility for the
     handling of arrays.

     ______________________________________________________________________

     for(index in array) statement
     ______________________________________________________________________

     index
          Variable (see BASIC ELEMENTS OF THE AWK LANGUAGE above) that
          assumes the index of each element of array in turn, in an
          unspecified order. The index can be numeric or alphanumeric.

     array
          Array to be processed.

     statement
          Statement to be executed for each array element. If several
          statements are to be executed, they have to be grouped together
          in curly braces { }.

     Example:

     The array named month contains the number of days in each month. Each
     array element is subscripted with the name of the month, e.g.

     month["January"]=31.

     The following awk program prints the name of each month together with
     the number of days in it.

        $ awk ' BEGIN { month["January"]=31;
        >               month["February"]=28;
        >               month["March"]=31;
        >               month["April"]=30;
        >               month["May"]=31;
        >               month["June"]=30;
        >               month["July"]=31;
        >               month["August"]=31;
        >               for(i in month) print i,"has",month[i],"days" } '
        May has 31 days
        August has 31 days
        July has 31 days
        April has 30 days
        June has 30 days
        January has 31 days
        March has 31 days
        February has 28 days

   if - Conditional statement

     The statement in an if construct is executed if the specified condi-
     tion is satisfied.

     ______________________________________________________________________

     if(expr) statement1 [else statement2]
     ______________________________________________________________________

     expr Expression (see Expressions above) that defines the condition to
          be satisfied. If expr is non-zero (true), statement1 is executed.

     statement1
          Statement to be executed if expr is true. If several statements
          are to be executed, they have to be grouped together in curly
          braces { }.

     statement2
          Statement to be executed if expr is false. If several statements
          are to be executed, they have to be grouped together in curly
          braces { }.

     Example:

     If field 1 is greater than field 2, fields 2 and 3 are printed; if
     not, fields 4 and 5 are printed:

          { if($1 > $2) print $2, $3; else print $4, $5 }

   next - Skip to the next input record

     The next statement causes awk to suspend processing of the current
     record; statements that follow next are not applied to the current
     record. awk then reads the next input record and starts again with
     the first pattern in the program. NR, NF, FNR, $0, and $1 to $NF are
     reset.

     Difference between next and the getline function:

     getline sets the current record to the next one. Statements that fol-
     low getline are executed using the next record's values for the $
     variables and for NR, NF, and FNR.

     ______________________________________________________________________

     next
     ______________________________________________________________________

     Example:

     Records that begin with a dot are ignored:

              { if ($1 ~ /^\./) next }
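
     The difference between next and getline can be made visible with a
     sketch like the following. The counter n and the sample input are
     illustrative assumptions. n is incremented by the first rule, so it
     counts only the records that awk itself reads at the top of the
     program.

```shell
# Illustrative input only: two records starting with a dot, two without.
data='.x
a
.y
b'
with_next=$(printf '%s\n' "$data" | awk '
            { n++ }              # every record read by awk passes here
    /^\./   { next }             # drop the record, restart with rule 1
            { print n ": " $0 }')
with_getline=$(printf '%s\n' "$data" | awk '
            { n++ }
    /^\./   { getline }          # replace $0, carry on with the rule below
            { print n ": " $0 }')
printf '%s\n' "$with_next" "$with_getline"
```

     With next, the records a and b pass through the first rule and are
     counted as 2 and 4. With getline they are fetched in the middle of
     the program and never reach the first rule, so the counts written
     are 1 and 2.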

   return - Return from a function

     The body of a function definition may contain a return statement that
     returns control and perhaps a value to the caller.

     ______________________________________________________________________

     return [expression]
     ______________________________________________________________________

     Example:

     This function computes the maximum of its arguments:

          function max(m, n) {
             return m > n ? m : n
          }

     The variables m and n belong to the function max; they are unrelated
     to any other variables of the same names elsewhere in the program.
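
     As a sketch of how max might be called, the following program also
     sets a global variable m (an assumption added for illustration) to
     show that it is not affected by the parameter of the same name.

```shell
out=$(awk '
    function max(m, n) {
        return m > n ? m : n
    }
    BEGIN {
        m = 5                      # global m, unrelated to the parameter m
        print max(3, 7), max(10, 2)
        print m                    # still 5 after the calls
    }')
printf '%s\n' "$out"
```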

   while - Execute iteratively

     The statement in a while loop is executed iteratively while a speci-
     fied condition continues to be satisfied.

     ______________________________________________________________________

     while(expr) statement
     ______________________________________________________________________

     expr Expression (see Expressions above) that specifies the condition.

     statement
          Statement that is executed in each iteration of the loop. If
          several statements are to be executed, they have to be grouped
          together in curly braces { }.

     Example:

     Print all input fields, writing each field in a separate output line:

        { i = 1;
          while (i <= NF) {
              print $i
              i++
          }
        }

FUNCTIONS
     awk provides a wide range of built-in functions and also offers you
     the option of defining functions of your own:
     ______________________________________________________________________

     function name(arg, ...) {statements}
     ______________________________________________________________________

     The {statements} may be preceded by a newline character. There may
     also be blank lines within the curly braces { }. A function definition
     stands at the same level as the pattern {action} pairs in the main
     section of an awk program.

     Within an action section, function calls can be entered anywhere in an
     expression; a call may also precede the function definition in the
     program text. There must be no space between the function name and
     the left parenthesis when a user-defined function is called. Nested
     and recursive function calls are legal.

     Though most functions do not require you to enclose arguments in
     parentheses, it is a good practice to use them as a means of increas-
     ing program transparency. When you pass an array as an argument, a
     pointer to the array is passed (call by reference), which means that
     you can change the elements of the array from the function. In the
     case of scalar variables, the value of the variable is copied and
     passed (call by value), which means that you cannot change the value
     of the variable from the function. The scope of function arguments is
     restricted to the local function, whereas the scope of all other vari-
     ables is always global. If you need a local variable in a function,
     define it at the end of the argument list in the function definition.
     Any variable in the argument list for which no current argument exists
     is a local variable whose initial value is 0 or the null string,
     depending on the context in which it is used.

     As in C, some functions return a result (e.g. exp), while others are
     procedural in character (e.g. output functions). The return statement
     can be used with or without a return value or may be omitted entirely.
     In the latter case, the return value would be undefined if it were to
     be accessed.

     Example:

     In the example below, the function named search looks for the string
     who in the array allnames and returns the index or -1. The third argu-
     ment, incr, is used as a local variable.

           ...
        { print $1, search($1, allnames) }
           ...
        function search(who, allnames, incr)
        {
           for (incr=0; allnames[incr]; incr++)
              if (index(allnames[incr], who) == 1
                  && length(allnames[incr]) == length(who))
                     return incr
           return -1
        }
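
     The argument-passing rules described in this section can be observed
     with a sketch such as the following. The function name change and
     the variables n, arr and tmp are hypothetical and serve only as an
     illustration.

```shell
out=$(awk '
    # n is passed by value, arr by reference; tmp is a local variable
    # because the caller supplies no third argument.
    function change(n, arr, tmp) {
        tmp = n * 2
        n = tmp              # changes the local copy only
        arr["x"] = tmp       # changes the array of the caller
    }
    BEGIN {
        n = 1; arr["x"] = 1; tmp = "global"
        change(n, arr)
        print n, arr["x"], tmp
    }')
printf '%s\n' "$out"
```

     After the call only the array element has changed; the scalar n and
     the global variable tmp keep their values.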

     Built-in functions
     _________________________________________________________________________
    | Function                  |  Purpose                                   |
    |________________________________________________________________________|
    | Input function                                                         |
    |___________________________|____________________________________________|
    | getline                   |  Read input record                         |
    |________________________________________________________________________|
    | Output functions                                                       |
    |___________________________|____________________________________________|
    | print([arg,...])          |  Standard output function                  |
    |___________________________|____________________________________________|
    | printf(format[,arg,...])  |  Formatted output                          |
    |________________________________________________________________________|
    | Arithmetic functions                                                   |
    |___________________________|____________________________________________|
    | atan2(y,x)                |  Arc tangent of y/x                        |
    |___________________________|____________________________________________|
    | cos(x)                    |  Cosine                                    |
    |___________________________|____________________________________________|
    | exp(x)                    |  Exponential function                      |
    |___________________________|____________________________________________|
    | int(x)                    |  Truncate to integer                       |
    |___________________________|____________________________________________|
    | log(x)                    |  Natural logarithm                         |
    |___________________________|____________________________________________|
    | rand()                    |  Return a random number                    |
    |___________________________|____________________________________________|
    | sin(x)                    |  Sine                                      |
    |___________________________|____________________________________________|
    | sqrt(x)                   |  Square root                               |
    |___________________________|____________________________________________|
    | srand([x])                |  Set the seed (initial value) for rand()   |
    |________________________________________________________________________|
    | String functions                                                       |
    |___________________________|____________________________________________|
    | gsub(RE,repl[,instr])     |  Global substitution function              |
    |___________________________|____________________________________________|
    | index(str1,str2)          |  Return first occurrence of substring      |
    |___________________________|____________________________________________|
    | length([str])             |  Return length of string                   |
    |___________________________|____________________________________________|
    | match(str,RE)             |  Check whether string str matches regular  |
    |                           |  expression RE                             |
    |___________________________|____________________________________________|
    | split(str,array[,sep])    |  Subdivide string                          |
    |___________________________|____________________________________________|
    | sprintf(format,e1,e2,...) |  Return formatted output as string         |
    |___________________________|____________________________________________|
    | sub(RE, repl[,instr])     |  Substitution function                     |
    |___________________________|____________________________________________|
    | substr(str,m[,n])         |  Define substring                          |
    |___________________________|____________________________________________|
    | tolower(zk)               |  Conversion to lowercase letters           |
    |___________________________|____________________________________________|
    | toupper(zk)               |  Conversion to uppercase letters           |
    |________________________________________________________________________|
    | General functions                                                      |
    |___________________________|____________________________________________|
    | close(expr)               |  Close file or pipe                        |
    |___________________________|____________________________________________|
    | system(expr)              |  Call shell command                        |
    |___________________________|____________________________________________|

     The following section describes each of these functions in alphabeti-
     cal order together with the associated arguments. The argument you
     specify can either be a constant or an expression (see Expressions
     above). awk first evaluates the expression arguments and then applies
     the function to the computed results.

   atan2 - Arc tangent

     atan2 calculates the arc tangent of the quotient of two numbers.
     atan2(y,x) returns the arc tangent of y/x in radians, in the range -pi
     to pi.

     ______________________________________________________________________

     atan2(y,x)
     ______________________________________________________________________

     y,x  Numbers that produce the quotient for which the arc tangent is to
          be calculated.
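
     A minimal sketch of atan2 in use, assuming a POSIX shell; the shell
     variable result and the awk variable pi are illustrative names only.
     It derives pi from atan2(0,-1) and converts atan2(1,1) to degrees.

```shell
# atan2(0, -1) is pi; atan2(1, 1) is pi/4, i.e. 45 degrees.
result=$(awk 'BEGIN {
    pi = atan2(0, -1)
    printf "%.5f %.1f\n", pi, atan2(1, 1) * 180 / pi
}')
echo "$result"
```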

   close - Close file or pipe

     close closes the specified file or pipe.

     ______________________________________________________________________

     close(expr)
     ______________________________________________________________________

     expr Name of the file or pipe to be closed (see the section on
          "Redirecting output" under the descriptions of the print and
          printf functions).
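
     The following sketch shows one situation in which close is useful:
     a file written by print is closed so that the same program can read
     it back with getline. It assumes a POSIX shell with mktemp available;
     the scratch file and the variable names are hypothetical.

```shell
# Hypothetical scratch file; any writable path would do.
tmp=$(mktemp)
result=$(awk -v f="$tmp" 'BEGIN {
    print "first line" > f
    close(f)              # close the output file before reading it back
    getline line < f      # reopen the same file for reading
    print line
}')
rm -f "$tmp"
echo "$result"
```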

   cos - Cosine

     cos calculates the cosine of a number.

     ______________________________________________________________________

     cos(x)
     ______________________________________________________________________

     x    Number for which the cosine is to be calculated.

   exp - Exponential function

     exp calculates e to the power of x.

     ______________________________________________________________________

     exp(x)
     ______________________________________________________________________

     x    Number for which e to the power of x is to be computed.

   getline - Read a record

     awk retrieves a record as directed (see also the control-flow state-
     ment next).

     getline has several different formats, with the following return
     values:

     1   successful execution

     0   end-of-file

     -1  error

     ______________________________________________________________________

     getline
     ______________________________________________________________________

     awk reads the next input record from the input file into $0. NR, NF,
     FNR, $0, and $1 to $NF are reset.

     Example:

     If a record contains %%%, the next record is read. In other words,
     input records containing %%% are ignored.

               /%%%/ {getline}
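
     As a hedged illustration of this format, the sketch below combines
     the rule with a print action and checks the return value of getline,
     so that a marker on the last line is not printed when no further
     record exists. The sample input is invented for the illustration.

```shell
# Input: a, %%%, b, %%% (printf needs %% for each literal percent sign).
result=$(printf 'a\n%%%%%%\nb\n%%%%%%\n' | awk '
/%%%/ { if ((getline) <= 0) next }   # skip the marker; stop at end-of-file
      { print }                      # print whatever $0 now holds
')
echo "$result"
```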

     ______________________________________________________________________

     getline < file
     ______________________________________________________________________

     awk reads a record from the named file into $0. NF, $0, and $1 to $NF
     are reset.

     file Name of the file from which a record is to be read.

     ______________________________________________________________________

     getline var
     ______________________________________________________________________

     awk fetches the next input record from the input file and puts it into
     the variable var. NR and FNR are reset.

     var  Variable into which the next record is to be read.

     ______________________________________________________________________

     getline var < file
     ______________________________________________________________________

     awk fetches a record from the named file and puts it into the variable
     var. NR, NF, FNR, $0, and $1 to $NF remain unchanged.

     var  Variable into which the record is to be read.

     file Name of the file from which the record is to be read.
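
     A minimal sketch of the usual loop around getline var < file,
     assuming a POSIX shell with mktemp; the file contents and the names
     list, name and n are hypothetical. The loop tests the return value
     against 0 so that both end-of-file (0) and an error (-1) end it.

```shell
tmp=$(mktemp)
printf 'alice\nbob\n' > "$tmp"
result=$(awk -v list="$tmp" 'BEGIN {
    while ((getline name < list) > 0)   # 1 = record read, 0 = EOF, -1 = error
        n++
    close(list)
    print n, NR                         # NR is untouched by this format
}')
rm -f "$tmp"
echo "$result"
```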

     ______________________________________________________________________

     command | getline [var]
     ______________________________________________________________________

     The output of the named command is redirected to getline. Each getline
     call in this format causes awk to read the next line from the output
     of command and write it into $0 or the variable var.

     If var is specified, NR, NF, FNR, $0, and $1 to $NF remain unchanged;
     if not, NF, $0, and $1 to $NF are reset.

     This construct is equivalent to calling the C function popen(3S) with
     mode r.

     var  Variable into which the record is to be written.

          var not specified:

          The record is written into $0.

     command
          Name of the command whose output is to be read.
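
     The sketch below reads every output line of a command in a loop. It
     assumes a POSIX shell; the command string and the variable names are
     chosen only for illustration. Keeping the command in a variable makes
     it easy to pass exactly the same string to close afterwards.

```shell
result=$(awk 'BEGIN {
    cmd = "echo one; echo two"
    while ((cmd | getline line) > 0)   # one output line per call
        out = out "[" line "]"
    close(cmd)                         # close the pipe when done
    print out
}')
echo "$result"
```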

   gsub - Global substitution function

     gsub globally substitutes the string repl for all strings in $0 or
     instr that match the extended regular expression RE.

     gsub returns the number of substitutions.

     ______________________________________________________________________

     gsub(RE,repl[,instr])
     ______________________________________________________________________

     RE   Extended regular expression that specifies the pattern to be
          matched.

     repl String to be substituted for the strings that match RE.

     instr
          String in which the substitution is to be made.

          instr not specified:

          Substitution is done in $0.
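
     A short sketch of both call forms, with invented sample input: the
     first call works on $0, the second on field 2 only, and each return
     value is the number of substitutions made.

```shell
result=$(echo "a-b-c x-y" | awk '{
    n = gsub(/-/, "+")        # no third argument: substitution in $0
    m = gsub(/x/, "X", $2)    # third argument: substitution in field 2
    print n, m, $0
}')
echo "$result"
```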

   index - Search for substrings

     index searches for a substring within a string. If the substring is
     present, index returns the starting character position (numbered from
     1 onward) of its first occurrence in the string; if not, it returns a
     value of 0.

     ______________________________________________________________________

     index(str1,str2)
     ______________________________________________________________________

     str1 String in which index looks for the substring.

     str2 Substring that index looks for.

     Example:

          index("PaPa-MaMa","Pa")

     returns 1.

   int - Truncate to integer

     int truncates the argument to an integer by discarding its fractional
     part.

     ______________________________________________________________________

     int(x)
     ______________________________________________________________________

     x    Number that is to be truncated to its integer part.
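
     A brief sketch of how int treats a positive and a negative argument.
     The result shown in the comment is what commonly available awk
     implementations produce, since they discard the fractional part.

```shell
# int drops the fractional part: 3.9 becomes 3 and -3.9 becomes -3.
result=$(awk 'BEGIN { print int(3.9), int(-3.9) }')
echo "$result"
```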

   length - Return length

     length returns the length of a string.

     ______________________________________________________________________

     length[(str)]
     ______________________________________________________________________

     str  length returns the length of string str.

          str not specified:

          length returns the length of the current input record $0.
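
     A minimal sketch of both forms, using an invented input line: length
     without an argument measures $0, and length($1) measures field 1.

```shell
result=$(echo "hello world" | awk '{ print length, length($1) }')
echo "$result"
```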

   log - Logarithm

     log calculates the natural (base e) logarithm.

     ______________________________________________________________________

     log(x)
     ______________________________________________________________________

     x    Number whose natural log is to be computed.
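
     Since awk offers only the natural logarithm, a logarithm to another
     base can be obtained by dividing by the log of that base. The sketch
     below computes the base 10 logarithm of 1000 this way and, for
     comparison, prints exp(1); the values are arbitrary examples.

```shell
result=$(awk 'BEGIN { printf "%.4f %.5f\n", log(1000) / log(10), exp(1) }')
echo "$result"
```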

   match - Match regular expressions

     match checks whether a string in str matches the extended regular
     expression in RE.

     If a matching string is found, match returns the character position in
     str (numbered from 1 onward) at which the string begins; if not, it
     returns 0.

     The variable RSTART is set to the return value of match; RLENGTH is
     set to the length of the matching string (or -1 if no matching string
     is found).

     ______________________________________________________________________

     match(str,RE)
     ______________________________________________________________________

     str  String in which the pattern is to be matched.

     RE   Extended regular expression.
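
     The following sketch, with invented sample input, shows how RSTART
     and RLENGTH can be passed to substr to extract the matching string,
     and what the two variables hold after an unsuccessful match.

```shell
result=$(echo "foo123bar" | awk '{
    if (match($0, /[0-9]+/))
        print RSTART, RLENGTH, substr($0, RSTART, RLENGTH)
    match($0, /z/)            # no match: RSTART is 0, RLENGTH is -1
    print RSTART, RLENGTH
}')
echo "$result"
```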

   print - Standard output function

     print is the standard output function. print outputs either the
     current record or the specified arguments and terminates its output
     with the output record separator ORS. For further details refer to
     Output format below.

     ______________________________________________________________________

     print(arg1[[,]arg2]...)[redirection]
     ______________________________________________________________________

     No argument specified:
          print writes the current input record on standard output.

     arg1 arg2
          Arguments that are to be printed.

          print evaluates the expression arguments and concatenates the
          results in the order in which the arguments are specified.

     arg1,arg2
          Arguments that are to be printed. print outputs the evaluated
          expression arguments in the specified order, separated by the
          output field separator OFS if they are separated by commas in the
          print statement.

     redirection
          Output can be redirected to a file or piped to a program. You can
          use up to 10 output files.

          redirection can take one of the following forms: >file, >>file,
          |prog.

          >file
               The output is written to the named file. The former contents
               of file are deleted the first time print is called. All sub-
               sequent print or printf outputs to file in the same awk pro-
               gram are appended to the end of file. Unless explicitly
               closed, file remains open until the end of the awk program.

          >>file
               The output is appended to the previous contents of file.
               Unless explicitly closed, file remains open until the end of
               the awk program.

          |prog
               The output is piped to the program named prog.

               You are only permitted to open one pipe to prog within an
               awk program, but you can pipe any number of print or printf
               outputs to it.

               This construct is equivalent to calling the C function
               popen(3S) with mode w.

               Unless explicitly closed, the pipe remains open until the
               end of the awk program.

               The file or program name can be specified directly (enclosed
               in "...") or via a variable that evaluates to the filename.

               Caution:

               If you redirect output to the input file, the input file
               will be destroyed without any warning.
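
     A minimal sketch of piped output, assuming a POSIX shell with the
     sort utility; the input lines are invented. All print calls share one
     pipe to sort, and the pipe is closed explicitly in the END action.

```shell
result=$(printf 'pear\napple\nfig\n' | awk '
    { print $1 | "sort" }   # all three records go down one pipe
END { close("sort") }       # closing the pipe lets sort finish and print
')
echo "$result"
```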

     Output format

     print outputs integers in decimal and prints strings at full length.
     Apart from that, the output format is contingent on the following
     predefined variables:

     OFS - output field separator
          OFS is one space by default. If you wish, you can assign any one
          character to OFS to change the output field separator.

     ORS - output record separator
          ORS is the newline character by default. If you wish, you can
          assign any one character to ORS to change the output record
          separator.

     OFMT - floating point output format
          OFMT defines the output format for floating point values and is
          set to %.6g by default. This means that a floating point number
          is printed with a maximum of 6 significant digits. If you wish,
          you can assign a different printf format for floating point
          numbers to OFMT (see printf below).

     Example 1

     Print the first and second fields, separated by a blank:

               {print $1,$2}

     Example 2

     Concatenate the first and second fields without an output field
     separator:

               {print $1$2}   or   {print $1 $2}
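
     The sketch below sets all three variables in a BEGIN action to show
     their combined effect on print. The input line and the chosen values
     are arbitrary examples.

```shell
result=$(echo "a b" | awk '
BEGIN { OFS = "-"; ORS = ";\n"; OFMT = "%.2f" }
      { print $1, $2, 3.14159 }   # commas insert OFS, OFMT formats 3.14159
')
echo "$result"
```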

   printf - Formatted output

     printf is the output function for formatted output. The output format
     can be specified as in the standard printf(3S) function in C.

     ______________________________________________________________________

     printf(format,arg,...)[redirection]
     ______________________________________________________________________

     format
          String defining the output format. The output format comprises
          plain characters and format elements (conversion specifications).
          Printable characters are output unaltered. The metacharacters
          listed in the "Basic elements" section above are converted
          immediately. For example, \n sets the position to the start of
          the next line.

          All format elements begin with the percent sign. A number
          specified after the percent sign defines the minimum field
          width. The most common format elements are presented in the
          following table:
          _________________________________________________________________
         | Format element |  Meaning                                      |
         |________________|_______________________________________________|
         | %c             |  single character                             |
         |________________|_______________________________________________|
         | %d, %i         |  decimal integer                              |
         |________________|_______________________________________________|
         | %e, %E         |  floating point number in exponential         |
         |                |  notation, e.g. 5.234e+02                     |
         |________________|_______________________________________________|
         | %f             |  floating point number, e.g. 52.34            |
         |________________|_______________________________________________|
         | %g, %G         |  %e or %f, whichever is shorter               |
         |________________|_______________________________________________|
         | %o             |  octal integer (base 8)                       |
         |________________|_______________________________________________|
         | %s             |  character string                             |
         |________________|_______________________________________________|
         | %u             |  unsigned decimal integer                     |
         |________________|_______________________________________________|
         | %x, %X         |  hexadecimal integer (base 16)                |
         |________________|_______________________________________________|

     arg  Arguments that are to be printed.

          printf evaluates the expression arguments, allocates them in the
          given order to the specifications in format, and outputs them in
          the appropriate format.

          -  If the format element is incompatible with the argument, e.g.
             a numeric format specification for an alphanumeric argument, a
             0 is printed.

          -  If there are more arguments than format elements, the excess
             arguments are ignored, i.e. not printed.

          -  If there are more format elements than arguments, an error
             message is issued.

     redirection
          Redirection is as for print.

          redirection not specified:

          printf prints on standard output.

     Example:

     Field 1 is printed as a decimal number with at least 2 positions, fol-
     lowed by ** as a separator, followed by field 2 as a string of at
     least 5 characters, followed by newline:

              { printf("%2d**%5s\n", $1,$2) }
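
     To make the effect of the field widths visible, the sketch below
     runs the same printf call on two invented input lines. Short values
     are padded on the left; longer values are printed in full, since the
     widths are minimums.

```shell
result=$(printf '7 ab\n123 abcdefg\n' | awk '{ printf("%2d**%5s\n", $1, $2) }')
echo "$result"
```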

   rand - Return a random number

     rand returns a random number r, where 0 <= r < 1.

     ______________________________________________________________________

     rand()
     ______________________________________________________________________

     Also refer to srand.

   sin - Sine

     sin returns the sine of a number.

     ______________________________________________________________________

     sin(x)
     ______________________________________________________________________

     x    Number whose sine is to be computed.

   split - Subdivide strings

     split divides a string into substrings and stores each substring as an
     element in an array. The elements are subscripted in ascending order,
     starting with 1.

     split returns the number of array elements.

     ______________________________________________________________________

     split(str,array[,sep])
     ______________________________________________________________________

     str  String that is to be split.

     array
          Name of the resulting array.

     sep  Extended regular expression specifying the characters that act as
          a separator between the substrings in str.

          sep not specified:

          FS is used as the separator.

     Example:

       {
            s=split("january:february:march", months, ":");
            for(i=1; i<=s; i++) print months[i];
       }

     produces the output

       january
       february
       march

   sprintf - Return formatted output as a string

     sprintf formats in exactly the same way as printf, but there is no
     direct output. sprintf instead returns the formatted output as a
     string, which could then be assigned to a variable or used for a simi-
     lar purpose.

     ______________________________________________________________________

     sprintf(format,arg,...)
     ______________________________________________________________________

     format
          String defining the output format (see printf).

     arg  Arguments that are to be output (see printf).

     Example:

     The following awk program fragment produces the same output as the
     example given under printf. The newline is omitted from the format
     because print appends the output record separator ORS itself.

             { x = sprintf("%2d**%5s", $1,$2); print x }

   sqrt - Calculate the square root

     sqrt calculates the square root of a number.

     ______________________________________________________________________

     sqrt(x)
     ______________________________________________________________________

     x    Number whose square root is to be computed.

   srand - Set the seed for the rand function

     srand sets the seed (starting point) for the rand function to the
     number x, or to the current time if no argument is specified.

     ______________________________________________________________________

     srand([x])
     ______________________________________________________________________

     x    Number that is to serve as the seed for rand.
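
     A hedged sketch of a reproducible dice roll: with a fixed seed, rand
     returns the same sequence on every run of the same awk. The shell
     function roll and the seed value 42 are illustrative choices only;
     the number actually rolled depends on the implementation.

```shell
# Scale rand() from [0,1) to an integer in the range 1..6.
roll() { awk 'BEGIN { srand(42); print int(rand() * 6) + 1 }'; }
first=$(roll)
second=$(roll)
echo "$first $second"
```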

   sub - Substitution function

     sub substitutes the string repl for the first instance of a string in
     $0 or instr that matches the extended regular expression RE.

     sub returns the number of substitutions.

     ______________________________________________________________________

     sub(RE,repl[,instr])
     ______________________________________________________________________

     RE   Extended regular expression that specifies the pattern to be
          matched.

     repl String to be substituted for the first string that matches RE.

     instr
          String in which the substitution is to be made.

          instr not specified:

          The substitution is done in $0.
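
     A minimal sketch with invented input, showing that sub replaces only
     the first match in $0 and returns the number of substitutions made.

```shell
result=$(echo "aaa aaa" | awk '{ n = sub(/a/, "b"); print n, $0 }')
echo "$result"
```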

   substr - Define a substring

     substr extracts a substring from a string.

     ______________________________________________________________________

     substr(str,m[,n])
     ______________________________________________________________________

     str  String from which the substring is to be extracted.

     m    Position in str at which the substring begins. Character posi-
          tions are numbered consecutively from left to right, starting
          with one.

     n    Maximum length of the substring.

          n not specified:

          The substring extends to the end of str.

     Example:

        {
           x = substr("060789",3,2); print "Month = "x
        }

     produces the output:

        Month = 07

   system - Call shell command

     system executes the specified shell command and returns its exit
     status.
     ______________________________________________________________________

     system(command)
     ______________________________________________________________________

     command
          Name of the shell command to be executed.
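
     The sketch below captures the exit status returned by system. The
     command string "exit 3" is an arbitrary example chosen so that the
     status is easy to recognize.

```shell
result=$(awk 'BEGIN {
    rc = system("exit 3")     # the shell command ends with status 3
    print "exit status:", rc
}')
echo "$result"
```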

   tolower - Convert to lowercase letters

     tolower converts all uppercase letters in a character string to lower-
     case letters.
     ______________________________________________________________________

     tolower(zk)
     ______________________________________________________________________

     zk   Character string to be converted to lowercase letters

   toupper - Convert to uppercase letters

     toupper converts all lowercase letters in a character string to upper-
     case letters.
     ______________________________________________________________________

     toupper(zk)
     ______________________________________________________________________

     zk   Character string to be converted to uppercase letters
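
     A short sketch of both conversion functions applied to the fields of
     an invented input line:

```shell
result=$(echo "Hello World" | awk '{ print toupper($1), tolower($2) }')
echo "$result"
```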

DIAGNOSTICS
     If an awk program contains errors, awk issues corresponding error mes-
     sages and exits immediately. The error messages indicate the cause of
     the error, if detectable by awk, and the awk program line in which awk
     thinks the error is to be found. Typical error messages are:

     awk: syntax error at source line xxx

     awk: illegal statement source line number xxx

LOCALE
     The LC_MESSAGES environment variable governs the language in which
     message texts are displayed.

     In regular expressions in square brackets, the LC_COLLATE environment
     variable governs the scope of character ranges, equivalence classes
     and collating elements, and the LC_CTYPE environment variable governs
     the scope of character classes. LC_COLLATE also governs the behavior
     of relational operators in string comparisons. LC_CTYPE also governs
     the behavior of toupper/tolower conversions.

     LC_NUMERIC governs the representation of the radix character, the
     exponentiation symbol and the digit grouping character for output and
     internal conversions, though not for the specification of values
     within an awk program or the assignment of variables in arguments.

     If LC_MESSAGES, LC_COLLATE, LC_CTYPE or LC_NUMERIC is undefined or is
     defined as the null string, it defaults to the value of LANG. If LANG
     is likewise undefined or null, the system acts as if it were not
     internationalized.

     If any of the locale variables has an invalid value, the system acts
     as if none of the variables were set.

     The LC_ALL environment variable governs the entire locale. LC_ALL
     takes precedence over all the other environment variables which affect
     internationalization.

EXAMPLES
     Example 1

     Output all input lines in which field 3 is greater than field 5:

     $ awk '$3 > $5' file

     Since no action has been specified, awk prints the selected lines by
     default.

     Example 2

     Print every 10th line of a file:

     $ awk '(NR % 10) == 0' file

     Example 3

     Print the second to last and the last field in each line, separated by
     a colon:

     $ awk 'BEGIN {OFS=":"}
     >      {print $(NF-1), $NF} ' file

     If a line consists of a single field, the entire line is output twice,
     separated by a colon (first $0, then $1).

     Example 4

     Add up the values of the first field of every line and print the total
     and average at the end.

     $ awk '{s += $1}
     >      END {print "Total: ", s, "Average: ", s/NR}'
     >      file

     Example 5

     Find a preprocessor if directive, i.e. a range of lines in which the
     first line begins with #if and the last line with #endif.

     $ awk '/^#if/, /^#endif/' file

     Example 6

     Print all lines in which the first field differs from that of the pre-
     vious line:

     $ awk '$1 != prev { print; prev = $1 } ' file

     Example 7

     file contains a list of data about young people, with the second field
     containing one of the entries school, university, apprenticeship or
     elsewhere. For statistical purposes, you want to count how many are at
     school and university:

     $ awk '$2 ~ /school/ {incr["school"]++}
     >     $2 ~ /university/ {incr["university"]++}
     >     END {print "school:" incr["school"];
     >          print "university:" incr["university"]} ' file

     Example 8

     The file contents lists the table of contents of a text. The table of
     contents is organized in decimal classification and has the format:

     1. Foreword
     2. Introduction
     3. The Game of Chess
     3.1. History
     3.2. Rules
     3.2.1 Setting Up the Figures
     .
     .
     .
     4. The Game of Checkers/Draughts
     4.1. History
     .
     .
     .
     8. Index

     The following awk program can be used to give the list a more orderly
     format:

     $ awk '{$1=$1"      ";
     >      $1=substr($1,1,6);
     >      print $0} ' contents >> con.form

     The output lines are prepared in the following stages:

     First, six blanks are added to the end of the first field
     ($1=$1"<blank><blank><blank><blank><blank><blank>"). Then the first
     field is truncated to six characters. Thus the first field of each
     line is 6 characters long. Because assigning to $1 causes awk to
     rebuild the record with the output field separator (one blank by
     default) between the fields, field 2 always starts at column 8. The
     output in the file con.form will be as follows:

     1.     Foreword
     2.     Introduction
     3.     The Game of Chess
     3.1.   History
     3.2.   Rules
     3.2.1  Setting Up the Figures
     .
     .
     .
     4.     The Game of Checkers/Draughts
     4.1.   History
     .
     .
     .
     8.     Index

     Example 9

     The following awk program in the file prog prints the number of fields
     and the actual fields of each record. The record separator has been
     redefined as the dollar sign. The field separators are thus blanks,
     tabs, and the newline character:

     BEGIN { RS="$"; printf "Record  Num" }
           { printf("\n%6d%5d      ", NR, NF)
             for(i=1;i<=NF; i++) printf "%s:", $i }
     END {print"\n"}

     The file text contains the following text:

     first record$  second   record     $
     $
     fourth     and  last
     record$

     The call:

     $ awk -f prog text

     returns:

     Record  Num
          1    2      first:record:
          2    2      second:record:
          3    0
          4    4      fourth:and:last:record:
          5    0

     Example 10

     You now change the file text to:

     &&
     first&&record$second record$$fourth
     and&
     last
     record&

     and call awk again, this time using the -F option to change the field
     separator to &.

     $ awk -F"&" -f prog text

     The output returned is:

     Record  Num
          1    6    :::first::record:
          2    1    second record:
          3    0
          4    8    fourth:and::last::record:::

     This example illustrates how fields are separated when a non-standard
     separator is used. The first line (&&) of the text file is a part of
     the first record and now yields 3 fields, for example, because each
     individual separator in a string of separators (&&) is counted, and
     the newline implicitly acts as a separator as well (2 & + 1 newline =
     3).

SEE ALSO
     egrep(1), fgrep(1), grep(1), lex(1), sed(1), expressions(5).

     Kernighan, B. W.; Pike, R.: The UNIX Programming Environment.

     Tare, R. S.: UNIX Utilities.

     Aho, A.; Kernighan, B. W.; Weinberger, P.: The AWK Programming
     Language.

Page 51                      Reliant UNIX 5.44                Printed 11/98

Typewritten Software • bear@typewritten.org • Edmonds, WA 98026