STRFILE(6) BSD STRFILE(6)
NAME
strfile, unstr - create a random access file for storing strings
SYNOPSIS
strfile [options] sourcefile [datafile]
unstr [ -o ] [ -cx ] datafile [outfile]
DESCRIPTION
strfile converts a file containing a set of strings into a data file
which contains those strings along with a seek pointer table to the
beginning of each. This allows random access to the strings. strfile's
main use is to add entries to the fortune(6) database.
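The random-access idea can be sketched in C as follows. This is an illustration of the technique only, not the strfile on-disk layout: the helper name fetch_string and the in-memory offsets array are hypothetical, and the strings are assumed to be null-terminated as described later in this page.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: copy the null-terminated string that starts at
 * byte offset table[idx] in fp into buf (at most buflen - 1 bytes).
 * Returns the number of bytes stored, or -1 if the seek fails.
 */
static long
fetch_string(FILE *fp, const long *table, size_t idx, char *buf, size_t buflen)
{
    size_t n = 0;
    int c;

    assert(fp != NULL && table != NULL && buf != NULL && buflen > 0);

    /* Jump straight to the string; earlier strings are never scanned. */
    if (fseek(fp, table[idx], SEEK_SET) != 0)
        return -1;

    while (n + 1 < buflen && (c = getc(fp)) != EOF && c != '\0')
        buf[n++] = (char)c;
    buf[n] = '\0';
    return (long)n;
}
```

A caller that wants a random string would pick idx at random and call fetch_string once; the cost does not depend on how many strings precede it.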
strfile creates datafile from a sourcefile that consists of strings
separated by lines starting with "%%" or "%-". Anything following these
characters on the line will be ignored, so comments can be placed on
these lines. A "%%" simply separates strings; a "%-" separates not only
strings but sections. A file can have up to four sections (in other
words, up to three delimiters). This can be used in a program-defined
way.
If you do not specify a datafile on the command line, strfile creates a
file named sourcefile.dat. The datafile contains a header describing its
contents, the seek pointers to the beginning of each string, and the
strings themselves (terminated by null bytes).
The format of the header is
#define MAXDELIMS    3
#define STR_RANDOM   0x1
#define STR_ORDERED  0x2
typedef struct {
    unsigned long str_numstr;          /* # of strings in the file */
    unsigned long str_longlen;         /* length of longest string */
    unsigned long str_shortlen;        /* length of shortest string */
    long          str_delims[MAXDELIMS]; /* delimiter markings */
    off_t         str_dpos[MAXDELIMS];   /* delimiter positions */
    short         str_flags;           /* bit field for flags */
} STRFILE;
The values in str_delims are the indices of the first string which
follows each "%-" in the file. The field str_flags has the bit
STR_RANDOM set if the -r flag was specified, or STR_ORDERED if the -o
flag was specified.
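A program that consumes a data file could load the header and inspect the flag bits roughly as follows. This is a sketch under a strong assumption: that the header sits at the start of the file as a raw image of the structure above, written by a program built with the same compiler and ABI. Actual implementations may store the fields differently, so check strfile.h on your system. The helper names read_header and pointer_order are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>   /* off_t */

#define MAXDELIMS    3
#define STR_RANDOM   0x1
#define STR_ORDERED  0x2

typedef struct {
    unsigned long str_numstr;            /* # of strings in the file */
    unsigned long str_longlen;           /* length of longest string */
    unsigned long str_shortlen;          /* length of shortest string */
    long          str_delims[MAXDELIMS]; /* delimiter markings */
    off_t         str_dpos[MAXDELIMS];   /* delimiter positions */
    short         str_flags;             /* bit field for flags */
} STRFILE;

/*
 * Hypothetical helper: read the header from the start of fp.
 * Assumes a raw, same-ABI image of STRFILE (see the note above).
 * Returns 0 on success, -1 on failure.
 */
static int
read_header(FILE *fp, STRFILE *hdr)
{
    assert(fp != NULL && hdr != NULL);

    if (fseek(fp, 0L, SEEK_SET) != 0)
        return -1;
    return fread(hdr, sizeof *hdr, 1, fp) == 1 ? 0 : -1;
}

/* Describe the order of the seek pointer table from the flag bits. */
static const char *
pointer_order(const STRFILE *hdr)
{
    if (hdr->str_flags & STR_RANDOM)
        return "random";
    if (hdr->str_flags & STR_ORDERED)
        return "alphabetical";
    return "source order";
}
```

Testing the bits with & rather than comparing str_flags for equality keeps the check valid if other bits are ever set in the same field.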
unstr undoes the work of strfile. It serves primarily as an emergency
backup in case you accidentally delete your source file but still have
your data file. unstr reads a data file and creates a corresponding
output file of raw strings and delimiters.
You can invoke unstr with the name of the data file, the name of the
output file, or both. If you specify both, unstr treats them literally
as the input and output files. If you provide a single argument ending
in ".dat", unstr assumes this to be the data file, and writes its output
to that filename stripped of the ".dat" suffix. If the single argument
does not end in ".dat", the program treats it as the name of the output
file, and consequently reads its input from outputfile.dat.
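The single-argument rule can be written out in C as below. This is a sketch of the behaviour described above, not the unstr source; the name resolve_names is hypothetical, and treating a bare ".dat" argument as an output name (rather than producing an empty output name) is an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: derive the input (data) and output file names
 * from a single unstr argument. Both buffers hold len bytes.
 * Returns 0 on success, -1 if a name would not fit.
 */
static int
resolve_names(const char *arg, char *infile, char *outfile, size_t len)
{
    size_t n;

    assert(arg != NULL && infile != NULL && outfile != NULL);
    n = strlen(arg);

    if (n > 4 && strcmp(arg + n - 4, ".dat") == 0) {
        /* Argument names the data file; output drops the suffix. */
        if (n + 1 > len)
            return -1;
        memcpy(infile, arg, n + 1);
        memcpy(outfile, arg, n - 4);
        outfile[n - 4] = '\0';
    } else {
        /* Argument names the output file; input is arg + ".dat". */
        if (n + 5 > len)
            return -1;
        memcpy(outfile, arg, n + 1);
        snprintf(infile, len, "%s.dat", arg);
    }
    return 0;
}
```

With this reading, "fortunes.dat" and "fortunes" both resolve to the pair fortunes.dat (input) and fortunes (output).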
If you want a character other than "%" as your delimiter, use the -c
option to change it.
unstr normally prints out the strings in the order they occur in the data
file. If you give it the -o option, it will write them out in the seek
pointer order, which is different if the file was randomized or
alphabetized when created. Using this option, you can create sorted
versions of your input file by using strfile -o, and then using unstr -o
to dump them out in the table order.
OPTIONS
Following are the options for strfile. Only the -o and -c options may be
used with unstr.
- Print a usage summary.
-cx Use character x as the delimiter instead of %.
-s Run silently; do not summarize processing at the end of the
run.
-v Use verbose mode; summarize processing at the end of the run
(default).
-o Order the strings alphabetically. strfile stores the strings
in the same order in the data file as in the source, but the
seek pointer table will be sorted in alphabetical order of the
strings pointed to. Any initial non-alphanumeric characters
are ignored. This sets the STR_ORDERED bit in the str_flags
field of the header.
-i Ignore case when ordering.
-r Randomize the order of the seek pointers in the table. The
strings will be stored in the same order in datafile as in
sourcefile, but the seek pointer table will be randomized.
This sets the STR_RANDOM bit in the str_flags field of the
header.
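The effect of -o, -i and -r on the pointer table can be approximated in C as follows. This is a sketch, not the strfile source: it operates on an in-memory array of string pointers instead of file offsets, the names order_table and shuffle_table are hypothetical, and the shuffle uses rand(), which is adequate for illustration only. In both cases the strings themselves stay where they are; only the table entries move.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Set by order_table(); qsort() comparators take no extra argument. */
static int ignore_case;

/* Skip initial non-alphanumeric characters, as -o is described to do. */
static const char *
skip_leading(const char *s)
{
    while (*s != '\0' && !isalnum((unsigned char)*s))
        s++;
    return s;
}

/* Compare two table entries by the strings they point to. */
static int
cmp_entries(const void *a, const void *b)
{
    const char *s = skip_leading(*(const char *const *)a);
    const char *t = skip_leading(*(const char *const *)b);

    for (;; s++, t++) {
        int c = (unsigned char)*s, d = (unsigned char)*t;

        if (ignore_case) {
            c = tolower(c);
            d = tolower(d);
        }
        if (c != d)
            return c < d ? -1 : 1;
        if (c == '\0')
            return 0;
    }
}

/* Like -o (and -i when fold_case is nonzero): sort the table only. */
static void
order_table(const char **table, size_t n, int fold_case)
{
    assert(table != NULL || n == 0);
    ignore_case = fold_case;
    qsort(table, n, sizeof *table, cmp_entries);
}

/* Like -r: Fisher-Yates shuffle of the table entries. */
static void
shuffle_table(const char **table, size_t n)
{
    for (size_t i = n; i > 1; i--) {
        size_t j = (size_t)rand() % i;   /* slight modulo bias */
        const char *tmp = table[i - 1];

        table[i - 1] = table[j];
        table[j] = tmp;
    }
}
```

Given the entries "\"Zebra\"", "apple" and "--Mango", order_table with fold_case set places them as apple, --Mango, "Zebra", because the leading quote and dashes are skipped and case is folded before comparing.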
EXAMPLES
To convert a file called scene which consists of lines like
%%
Hofstadter's Law:
It always takes longer than you expect, even when you take
Hofstadter's Law into account.
%%
"It is bad luck to be superstitious."
-- Andrew W. Mathis
%%
If A = B and B = C, then A = C, except where void or prohibited by law.
-- Roy Santoro
use the following command:
% strfile scene
"scene" converted to "scene.dat"
There were 1168 strings
Longest string: 1156 bytes
Shortest string: 0 bytes
FILES
strfile.h header file
SEE ALSO
fortune(6)