  regcmp(3)                           CLIX                           regcmp(3)



  NAME

    regcmp, regex - Compile and execute a regular expression

  LIBRARY

    Programmer's Workbench (libPW.a)

  SYNOPSIS

    char *regcmp(
      char *string1, *string2, ... , (char *)0 );

    char *regex(
      char *re, *subject, *ret0, ...  );

    extern char *__loc1;

  PARAMETERS

    re     A pointer to the compiled regular expression (pattern).

    ret0, ...
           Pointers to the buffers that receive the matched values.

    string1, string2, ... ,
           Each parameter is a pointer to a regular expression.  These
           arguments are concatenated and the resulting regular expression
           is compiled.  The last argument must be a null pointer,
           (char *)0.

    subject
           A pointer to the string against which the regular expression is
           executed.
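
    The NULL-terminated argument convention described for string1, string2,
    ... can be sketched in portable C.  The helper below, concat_patterns(),
    is hypothetical and is not part of libPW; it only joins its arguments the
    way regcmp() is documented to treat them and does not compile anything.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of libPW: joins a (char *)0-terminated
 * list of strings into one malloc()ed buffer.  The caller frees it. */
char *concat_patterns(const char *first, ...)
{
    va_list ap;
    size_t total = 0;
    const char *s;
    char *out, *p;

    /* First pass: measure every piece up to the terminating null pointer. */
    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, char *))
        total += strlen(s);
    va_end(ap);

    out = malloc(total + 1);
    if (out == NULL)
        return NULL;

    /* Second pass: copy the pieces end to end. */
    p = out;
    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, char *)) {
        size_t n = strlen(s);
        memcpy(p, s, n);
        p += n;
    }
    va_end(ap);
    *p = '\0';

    assert((size_t)(p - out) == total);
    return out;
}
```

    Calling concat_patterns("[A-Za-z]", "[A-Za-z0-9]", "{0,7}", (char *)0)
    yields the single expression [A-Za-z][A-Za-z0-9]{0,7}.  Omitting the
    final (char *)0 makes the loop read past the argument list.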

  DESCRIPTION

    The regcmp() function compiles a regular expression (consisting of the
    concatenated arguments) and returns a pointer to the compiled form.  The
    malloc() function is used to create space for the compiled form.  It is
    the user's responsibility to free space so allocated once it is no longer
    needed.  A NULL return from regcmp() indicates an incorrect argument.  The
    regcmp(1) command is written to generally preclude the need for this
    function at execution time.

    The regex() function executes a compiled pattern against the subject
    string.  Additional arguments are passed to receive values back.  The
    regex() function returns NULL on failure or a pointer to the next
    unmatched character on success.  A global character pointer, __loc1,
    points to where the match began.  The regcmp() and regex() functions were
    mostly borrowed from the editor, ed(1); however, the syntax and semantics
    have been changed slightly.
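
    On systems without libPW, the return convention described above can be
    approximated with the POSIX regcomp() and regexec() functions.  The
    following is a minimal sketch under that assumption, not the libPW
    implementation: match_end() and match_start are hypothetical names, with
    match_start standing in for __loc1.  For brevity it compiles and releases
    the pattern on every call instead of keeping the compiled form.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical stand-in for the global __loc1: where the match began. */
const char *match_start;

/* Hypothetical wrapper over POSIX regcomp()/regexec(), NOT libPW regex().
 * Returns a pointer to the first character after the match, or NULL when
 * the pattern is invalid or does not match. */
const char *match_end(const char *pattern, const char *subject)
{
    regex_t re;
    regmatch_t m[1];
    const char *end = NULL;

    assert(pattern != NULL && subject != NULL);

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return NULL;                     /* compile failed */

    if (regexec(&re, subject, 1, m, 0) == 0) {
        match_start = subject + m[0].rm_so;
        end = subject + m[0].rm_eo;
    }
    regfree(&re);                        /* release the compiled form */
    return end;
}
```

    With the subject "012Testing345" and the pattern
    [A-Za-z][A-Za-z0-9]{0,7}, match_start points at the T and the returned
    pointer points at the 4.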




  2/94 - Intergraph Corporation                                              1






    The following is a list of the valid symbols along with their meanings:

    []*.^       These symbols retain their meaning in ed(1).

    $           Matches the end of the string; \n matches a newline.

    -           Within brackets the minus means through.  For example, [a-z]
                is equivalent to [abcd ... xyz].  The - can appear as itself
                only if used as the first or last character.  For example, the
                character class expression []-] matches the characters ] and
                -.

    +           A regular expression followed by + means one or more times.
                For example, [0-9]+ is equivalent to [0-9][0-9]*.

    {m}
    {m,}
    {m,u}       Integer values enclosed in {} indicate the number of times the
                preceding regular expression is to be applied.  The value m is
                the minimum number and u is a number, less than 256, which is
                the maximum.  If only m is present (for example: {m}), it
                indicates the exact number of times the regular expression is
                to be applied.  The value {m,} is analogous to {m,infinity}.
                The plus + and the star * operations are equivalent to {1,}
                and {0,}, respectively.

    ( ... )$n   The value of the enclosed regular expression is to be
                returned.  The value is stored in the (n+1)th argument
                following the subject argument.  At most, ten enclosed regular
                expressions are allowed.  The regex() function makes its
                assignments unconditionally.

    ( ... )     Parentheses are used for grouping.  An operator (for example,
                *, +, or {}) can work on a single character or on a regular
                expression enclosed in parentheses.  For example:
                (a*(cb+)*)$0.

    All the above defined symbols are special; therefore, to be used as
    themselves, they must be escaped with a backslash (\).
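
    When literal text is spliced into a pattern, the escaping rule can be
    applied mechanically.  The helper below, escape_literal(), is a
    hypothetical sketch.  The set of characters it escapes is an assumption
    taken from the symbol list in this page, with the backslash itself added;
    the minus sign is left alone because it is described as special only
    within brackets.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc()ed copy of text in which each
 * special symbol is preceded by a backslash.  The set below is an
 * assumption based on the symbol list; '-' is not escaped.  The caller
 * frees the result. */
char *escape_literal(const char *text)
{
    static const char special[] = "[]*.^$+{}()\\";
    char *out, *p;

    assert(text != NULL);

    out = malloc(2 * strlen(text) + 1);  /* worst case: all escaped */
    if (out == NULL)
        return NULL;

    for (p = out; *text != '\0'; text++) {
        if (strchr(special, *text) != NULL)
            *p++ = '\\';                 /* prefix the special symbol */
        *p++ = *text;
    }
    *p = '\0';
    return out;
}
```

    For example, escape_literal("a.b(c)$") produces a\.b\(c\)\$, which can
    then be passed as one of the string arguments to regcmp().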

  EXAMPLES

    1.  The following matches a leading newline in the subject string pointed
        at by cursor:

        char *cursor, *newcursor, *ptr;

        newcursor = regex((ptr = regcmp("^\n", (char *)0)),
                           cursor);
        free(ptr);




    2.  The following matches through the string Testing3 and returns the
        address of the character after the last matched character (the 4):

        char ret0[9];
        char *newcursor, *name;

        name = regcmp("([A-Za-z][A-Za-z0-9]{0,7})$0", (char *)0);
        newcursor = regex(name, "012Testing345", ret0);

        The string Testing3 is copied to the character array ret0.

    3.  The following applies a precompiled regular expression in <file.i>
        against string:

        #include <file.i>
        char *string, *newcursor;

        newcursor = regex(name, string);
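
    The value-returning behavior shown in example 2 can be imitated with
    POSIX capture groups on systems that lack libPW.  This is a sketch under
    that assumption and not the libPW mechanism: match_capture() is a
    hypothetical name, and the ( ... )$0 suffix is replaced by an ordinary
    parenthesized group whose text is copied into ret0.

```c
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical analogue of ( ... )$0 built on POSIX regcomp()/regexec().
 * Copies the text matched by the first parenthesized group into ret0,
 * which holds cap bytes including the terminating NUL.  Returns a pointer
 * to the next unmatched character, or NULL on failure. */
const char *match_capture(const char *pattern, const char *subject,
                          char *ret0, size_t cap)
{
    regex_t re;
    regmatch_t m[2];
    const char *next = NULL;

    assert(ret0 != NULL && cap > 0);

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return NULL;

    if (regexec(&re, subject, 2, m, 0) == 0 && m[1].rm_so != -1) {
        size_t n = (size_t)(m[1].rm_eo - m[1].rm_so);

        if (n < cap) {                   /* refuse to overflow ret0 */
            memcpy(ret0, subject + m[1].rm_so, n);
            ret0[n] = '\0';
            next = subject + m[0].rm_eo;
        }
    }
    regfree(&re);
    return next;
}
```

    With a nine-byte ret0, the pattern ([A-Za-z][A-Za-z0-9]{0,7}) and the
    subject "012Testing345", the buffer receives Testing3 and the returned
    pointer addresses the 4.  Nine bytes is the right size because the
    pattern matches at most eight characters and one byte is needed for the
    terminating NUL.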


  CAUTIONS

    The user program may run out of memory if regcmp() is called iteratively
    without freeing the vectors no longer required.
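
    The pattern to follow is to release each compiled form before compiling
    the next one.  The sketch below illustrates this with POSIX regcomp() and
    regfree() standing in for regcmp() and free(); count_matching() is a
    hypothetical helper name.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: counts how many of the n patterns match subject.
 * Each compiled form is released inside the loop, so memory use stays
 * flat no matter how many patterns are tried. */
size_t count_matching(const char *const patterns[], size_t n,
                      const char *subject)
{
    size_t i, hits = 0;

    assert(patterns != NULL && subject != NULL);

    for (i = 0; i < n; i++) {
        regex_t re;

        if (regcomp(&re, patterns[i], REG_EXTENDED) != 0)
            continue;                    /* compile failed; skip it */
        if (regexec(&re, subject, 0, NULL, 0) == 0)
            hits++;
        regfree(&re);                    /* omit this and each pass leaks */
    }
    return hits;
}
```

    With regcmp() the equivalent step is a call to free() on the returned
    pointer once regex() no longer needs it, as in example 1.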

  RETURN VALUES

    regcmp()   A NULL indicates an incorrect argument was received by
               regcmp(); otherwise, a pointer to the compiled form of the
               input regular expression is returned.

    regex()    A NULL indicates failure to match; otherwise, a pointer to the
               next unmatched character is returned.

  RELATED INFORMATION

    Commands:  regcmp(1), ed(1)

    Functions:  malloc(3)














