Museum

Home

Lab Overview

Retrotechnology Articles

⇒ Online Manual

Media Vault

Software Library

Restoration Projects

Artifacts Sought

Related Articles

ed(1)

regcmp(1)

malloc(3C)

REGCMP(3X)  —  HP-UX

NAME

regcmp, regex − compile and execute regular expression

SYNOPSIS

char ∗regcmp (string1 [, string2, ...], (char ∗)0)
char ∗string1, ∗string2, ...;

char ∗regex (re, subject[, ret0, ...])
char ∗re, ∗subject, ∗ret0, ...;

extern char ∗__loc1;

DESCRIPTION

Regcmp compiles a regular expression and returns a pointer to the compiled form.  Malloc(3C) is used to create space for the vector. It is the user’s responsibility to free unneeded space so allocated. A NULL return from regcmp indicates an incorrect argument.  Regcmp(1) has been written to generally preclude the need for this routine at execution time.

Regex executes a compiled pattern against the subject string.  Additional arguments are passed to receive values back.  Regex returns NULL on failure or a pointer to the next unmatched character on success.  A global character pointer __loc1 points to where the match began.  Regcmp and regex were mostly borrowed from the editor, ed(1); however, the syntax and semantics have been changed slightly. The following are the valid symbols and their associated meanings.

[]*.^ These symbols retain their current meaning. 

$ Matches the end of the string; \n matches a new-line. 

− Within brackets the minus means through. For example, [a−z] is equivalent to [abcd...xyz].  The − can appear as itself only if used as the first or last character.  For example, the character class expression []−] matches the characters ] and −. 

+ A regular expression followed by + means one or more times. For example, [0−9]+ is equivalent to [0−9][0−9]∗. 

{m} {m,} {m,u}
Integer values enclosed in {} indicate the number of times the preceding regular expression is to be applied.  The value m is the minimum number and u is a number, less than 256, which is the maximum.  If only m is present (e.g., {m}), it indicates the exact number of times the regular expression is to be applied.  The value {m,} is analogous to {m,infinity}.  The plus (+) and star (∗) operations are equivalent to {1,} and {0,} respectively. 

( ... )$n The value of the enclosed regular expression is to be returned.  The value will be stored in the (n+1)th argument following the subject argument. At most ten enclosed regular expressions are allowed. Regex makes its assignments unconditionally. 

( ... ) Parentheses are used for grouping.  An operator, e.g., ∗, +, {}, can work on a single character or a regular expression enclosed in parentheses.  For example, (a∗(cb+)∗)$0. 

By necessity, all the above defined symbols are special.  They must, therefore, be escaped to be used as themselves. 

EXAMPLES

Example 1:

char ∗cursor, ∗newcursor, ∗ptr;
...
newcursor = regex((ptr = regcmp("^\n", 0)), cursor);
free(ptr);

This example will match a leading new-line in the subject string pointed at by cursor. 

Example 2:

char ret0[9];
char ∗newcursor, ∗name;
...
name = regcmp("([A−Za−z][A−za−z0−9_]{0,7})$0", 0);
newcursor = regex(name, "123Testing321", ret0);

This example will match through the string “Testing3” and will return the address of the character after the last matched character (cursor+11).  The string “Testing3” will be copied to the character array ret0.

Example 3:

#include "file.i"
char ∗string, ∗newcursor;
...
newcursor = regex(name, string);

This example applies a precompiled regular expression in file.i (see regcmp(1)) against string.

This routine is kept in /lib/libPW.a. 

SEE ALSO

ed(1), regcmp(1), malloc(3C). 

BUGS

The user program may run out of memory if regcmp is called iteratively without freeing the vectors no longer required.  The following user-supplied replacement for malloc(3C) reuses the same vector saving time and space:

/∗ user’s program ∗/
...
char ∗
malloc(n)
unsigned n;
{
static char rebuf[512];
return (n <= sizeof rebuf) ? rebuf : NULL;
}

Hewlett-Packard Company  —  Version B.1,  May 11, 2021

Typewritten Software • bear@typewritten.org • Edmonds, WA 98026