


LEX(1)                   DOMAIN/IX SYS5                    LEX(1)



NAME
     lex - generate programs for simple lexical tasks

USAGE
     lex [ -rctvn ] [ file ] ...

DESCRIPTION
     Lex generates programs for simple lexical analysis of text.

     The input files contain strings and expressions to be
     searched for, and C text to be executed when strings are
     found.  Lex treats multiple files as a single file.  If no
     files are specified, it uses standard input.

     Lex generates a C source program named lex.yy.c which, when
     loaded with the library, copies the input to the output,
     except where it encounters a specified string in the file
     being analyzed.  The program then executes corresponding
     program text.  The matching string remains in yytext, an
     external character array.  Matching is done in the order
     that the strings appear in the file.
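
     As an illustration of this behavior, the following is a minimal
     hand-written C sketch, not the code that lex generates.  It
     assumes the usual lex convention that the longest match wins and
     that the earlier rule wins a tie.  The names struct rule and
     scan() are hypothetical, and the rules here are plain literal
     strings rather than expressions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical rule: a literal string and the text its action emits. */
struct rule {
    const char *pattern;
    const char *action_output;
};

/* Scan `in`, writing to `out` (capacity `outsz`, including the '\0').
 * At each position the longest matching rule wins and a tie goes to the
 * earlier rule; if no rule matches, one character is copied through. */
static void scan(const struct rule *rules, size_t nrules,
                 const char *in, char *out, size_t outsz)
{
    size_t used = 0;

    if (outsz == 0)
        return;
    while (*in != '\0' && used + 1 < outsz) {
        size_t best = nrules, best_len = 0, i;

        for (i = 0; i < nrules; i++) {
            size_t len = strlen(rules[i].pattern);
            /* Strict '>' keeps the earlier rule when lengths are equal. */
            if (len > best_len && strncmp(in, rules[i].pattern, len) == 0) {
                best = i;
                best_len = len;
            }
        }
        if (best < nrules) {
            size_t room = outsz - used;   /* still includes the '\0' */
            size_t n = strlen(rules[best].action_output);
            if (n >= room)
                n = room - 1;             /* truncate to fit */
            memcpy(out + used, rules[best].action_output, n);
            used += n;
            in += best_len;
        } else {
            out[used++] = *in++;          /* default: copy input to output */
        }
    }
    out[used] = '\0';
}
```

     With the two rules "++" and "+", scanning a++b+c replaces the
     operators with the corresponding action text and copies the
     letters a, b, and c through unchanged.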

RULES
     Strings may contain square brackets to indicate character
     classes, as in [abx-z] to indicate a, b, x, y, and z.

     The operators *, +, and ?  respectively signify any non-
     negative number of, any positive number of, and either zero
     or one occurrence of, the previous character or character
     class.

     The period (.) is the class of all ASCII characters except
     newline.

     Using parentheses for grouping and a vertical bar for alter-
     nation is allowed.

     The notation r{d,e} in a rule indicates between d and e
     instances of regular expression r; for example, a{2,4}
     matches aa, aaa, or aaaa.  It has higher precedence than the
     vertical bar (|), but lower precedence than *, ?, +, and
     concatenation.

     The caret (^) at the beginning of an expression permits a
     successful match only immediately after a newline.  A dollar
     sign ($) at the end of an expression requires a trailing
     newline.







Printed 12/4/86                                             LEX-1










     The slash (/) in an expression indicates trailing context;
     only the part of the expression up to the slash is returned
     in yytext, but the remainder of the expression must follow
     in the input stream.

     An operator character may be used as an ordinary symbol if
     it is within double quotes (" ") or preceded by a
     backslash (\).  For example, [a-zA-Z]+ matches a string of
     letters, and "+" or \+ matches a literal plus sign.
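
     The operators described above can be tried out with the POSIX
     regcomp(3) interface.  It is not part of lex, but it shares
     [ ], *, +, ?, |, ( ), {d,e}, and backslash quoting with it.
     The following is a minimal sketch in which matches_whole() is a
     hypothetical helper.  POSIX applies {d,e} to the immediately
     preceding item, so the interval operand is parenthesized here.
     Double-quote quoting, trailing context, and the lex meaning of
     ^ and $ have no counterpart in this interface and are not shown.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: return 1 if all of `text` matches the POSIX
 * extended regular expression `pattern`, 0 if it does not, and -1 if
 * the pattern is too long or fails to compile. */
static int matches_whole(const char *pattern, const char *text)
{
    char anchored[256];
    regex_t re;
    int n, rc;

    /* Wrap the pattern so that it must cover the entire string. */
    n = snprintf(anchored, sizeof anchored, "^(%s)$", pattern);
    if (n < 0 || (size_t)n >= sizeof anchored)
        return -1;
    if (regcomp(&re, anchored, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

     For instance, matches_whole("[a-zA-Z]+", "Lex") is true, while
     matches_whole("(a){2,3}", "aaaa") is false because four
     instances exceed the upper bound.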

     Three subroutines defined as macros are expected.  They are:

     input()   reads a character

     unput(c)  pushes a character that was read back onto the input

     output(c) places an output character

     These subroutines are defined in terms of the standard
     streams, but you can override them.  The macros input and
     output read from and write to the files yyin and yyout,
     which default to stdin and stdout, respectively.

     The generated function is named yylex(), and the library
     contains a main() that calls it.

     The action REJECT on the right side of a rule causes the
     current match to be rejected and the next suitable match
     executed.  The function yymore() accumulates additional
     characters into the same yytext.  The function yyless(p)
     pushes back the portion of the matched string beginning at
     p, which should be between yytext and yytext+yyleng.
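
     The pushback behavior of input(), unput(c), and yyless(p) can
     be sketched in plain C as follows.  This is an illustration
     under assumptions, not the lex library source: struct scanner
     and the sk_ functions are hypothetical names, input is read
     from a string rather than from yyin, and end of input is
     reported as 0.

```c
#include <assert.h>
#include <string.h>

#define SK_MAX 64

/* Hypothetical scanner state; lex keeps comparable data in yy-prefixed
 * externals such as yytext and yyleng. */
struct scanner {
    const char *src;        /* unread input */
    char pushback[SK_MAX];  /* stack of characters given back */
    int npush;              /* number of characters on the stack */
    char text[SK_MAX];      /* current match, in the role of yytext */
    int leng;               /* length of the match, in the role of yyleng */
};

/* In the role of input(): next character, or 0 at end of input. */
static int sk_input(struct scanner *s)
{
    if (s->npush > 0)
        return (unsigned char)s->pushback[--s->npush];
    if (*s->src == '\0')
        return 0;
    return (unsigned char)*s->src++;
}

/* In the role of unput(c): the next sk_input() call returns c. */
static void sk_unput(struct scanner *s, int c)
{
    if (s->npush < SK_MAX)
        s->pushback[s->npush++] = (char)c;
}

/* Read up to n characters into text, standing in for a rule match. */
static void sk_take(struct scanner *s, int n)
{
    int c;

    s->leng = 0;
    while (s->leng < n && s->leng < SK_MAX - 1 && (c = sk_input(s)) != 0)
        s->text[s->leng++] = (char)c;
    s->text[s->leng] = '\0';
}

/* In the role of yyless(p): push back the part of the match that
 * begins at p, leaving the characters before p in text. */
static void sk_less(struct scanner *s, const char *p)
{
    if (p < s->text || p > s->text + s->leng)
        return;                         /* p must lie within the match */
    while (s->text + s->leng > p)
        sk_unput(s, s->text[--s->leng]); /* last character goes first */
    s->text[s->leng] = '\0';
}
```

     After a six-character match of foobar, sk_less(&s, s.text + 3)
     leaves foo in text, and the next three sk_input() calls return
     b, a, and r in that order.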

     Any line beginning with a blank is assumed to contain only C
     text and is copied.  If it precedes %%, it is copied into
     the external definition area of the lex.yy.c file.  All
     rules should follow a double percent sign (%%), as in
     yacc(1).  A line that precedes %% and begins with a nonblank
     character defines the string on its left to be the remainder
     of the line; the definition can be called out later by
     surrounding the string with braces ({ }).  Note that braces
     do not imply parentheses; only string substitution is done.

     The external names generated by lex all begin with the pre-
     fix yy or YY.

     Certain table sizes for the resulting finite state machine
     can be set in the definitions section:

          %p n   number of positions is n (default 2000)

          %n n   number of states is n (default 500)

          %t n   number of parse tree nodes is n (default 1000)

          %a n   number of transitions is n (default 3000)

     Using one or more of the above automatically implies the -v
     option, unless the -n option is used.

OPTIONS
     -r        Specify RATFOR actions.

     -c        Indicate C actions (default).

     -t        Write the generated program on standard output
               instead of in file lex.yy.c, the default
               destination.

     -v        Provide a one-line summary of statistics of the
               generated analyzer.

     -n        Suppress printing of the one-line summary men-
               tioned in the -v option (default).

EXAMPLE
             D       [0-9]
             %%
             if      printf("IF statement\n");
             [a-z]+  printf("tag, value %s\n",yytext);
             0{D}+   printf("octal number %s\n",yytext);
             {D}+    printf("decimal number %s\n",yytext);
             "++"    printf("unary op\n");
             "+"     printf("binary op\n");
             "/*"    {       loop:
                             while (input() != '*');
                             switch (input())
                                     {
                                     case '/': break;
                                     case '*': unput('*');  /* fall through */
                                     default: goto loop;
                                     }
                             }
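
     The comment-skipping action above loops until it has read the
     closing delimiter and, as written, does not check for end of
     input.  The following is a minimal sketch of the same scan in
     plain C with such a check.  It assumes that end of input is
     reported as 0; next_char() and skip_comment() are hypothetical
     names, and the input is a string rather than yyin.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for input(): the next character of *pp,
 * or 0 once the end of the string is reached. */
static int next_char(const char **pp)
{
    return **pp != '\0' ? (unsigned char)*(*pp)++ : 0;
}

/* Consume the body of a comment whose opening delimiter has already
 * been read.  Returns 1 when the closing star-slash has been consumed,
 * or 0 if the input ends first. */
static int skip_comment(const char **pp)
{
    int c;

    for (;;) {
        /* Skip ahead to the next star, giving up at end of input. */
        while ((c = next_char(pp)) != '*')
            if (c == 0)
                return 0;
        /* Skip a run of stars; this replaces the unput('*') retry. */
        while ((c = next_char(pp)) == '*')
            ;
        if (c == '/')
            return 1;
        if (c == 0)
            return 0;
    }
}
```

     On return with value 1, the pointer passed in has been advanced
     to the first character after the comment.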

CAUTIONS
     The -r option is not yet fully operational.

RELATED INFORMATION
     yacc(1), malloc(3C).






