csplit(1)
NAME
csplit − split a file with respect to a given context
SYNOPSIS
csplit [ −s ] [ −k ] [ −f prefix ] filename argument1 [ ... argumentn ]
AVAILABILITY
SUNWesu
DESCRIPTION
csplit reads filename and separates it into n+1 sections, defined by the arguments argument1...argumentn. By default the sections are placed in xx00...xxn (n may not be greater than 99). These sections get the following pieces of filename:
00: From the start of filename up to (but not including) the line referenced by argument1.
01: From the line referenced by argument1 up to (but not including) the line referenced by argument2.
. . .
n: From the line referenced by argumentn to the end of filename.
If the filename argument is a −, then standard input is used.
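As a rough illustration of reading from standard input, the sketch below pipes three lines into csplit and splits at line 2. It assumes a POSIX shell with mktemp(1) available; the sample text and the scratch directory are made up for the example.

```shell
# Work in a scratch directory so the xxNN files do not clutter anything else.
dir=$(mktemp -d) && cd "$dir"

# Three lines arrive on standard input; "-" stands in for the file name.
printf 'alpha\nbeta\ngamma\n' | csplit -s - 2

cat xx00    # everything before line 2
cat xx01    # line 2 through the end of the input
```

Here xx00 should hold the single line alpha, and xx01 the lines beta and gamma.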
The arguments (argument1...argumentn) to csplit can be a combination of the following:
/rexp/ A file is to be created for the section from the current line up to (but not including) the line containing the regular expression rexp. The current line becomes the line containing rexp. This argument may be followed by an optional offset, written as + or − and a number of lines (for example, /Page/−5). See ed(1) for a description of how to specify a regular expression.
%rexp% This argument is the same as /rexp/, except that no file is created for the section.
lnno A file is to be created from the current line up to (but not including) lnno. The current line becomes lnno.
{num} Repeat argument. This argument may follow any of the above arguments. If it follows a rexp type argument, that argument is applied num more times. If it follows lnno, the file will be split every lnno lines (num times) from that point.
Enclose all rexp type arguments that contain blanks or other characters meaningful to the shell in the appropriate quotes. Regular expressions may not contain embedded newlines. csplit does not affect the original file; it is the user’s responsibility to remove it if it is no longer wanted.
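The argument types above can be combined on one command line. The following is a minimal sketch, assuming a POSIX shell and a hypothetical file notes.txt whose sections begin with lines starting with "# ". It skips the preamble with %rexp%, then splits before each later heading with /rexp/ and a repeat count. The rexp arguments contain blanks, so they are quoted.

```shell
dir=$(mktemp -d) && cd "$dir"
printf '%s\n' 'preamble' '# one' 'a' '# two' 'b' '# three' 'c' > notes.txt

# %^# one%  discards everything before the first heading (no file is created)
# /^# /     ends a piece just before the next heading
# {1}       applies that /^# / argument one more time
csplit -s -f part notes.txt '%^# one%' '/^# /' '{1}'

# Show the first line of each piece.
for f in part00 part01 part02; do
    printf '%s: ' "$f"
    head -n 1 "$f"
done
```

Each of the three pieces should start with its own heading line, and the preamble line should appear in none of them.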
OPTIONS
−s csplit normally prints the character counts for each file created. If the −s option is present, csplit suppresses the printing of all character counts.
−k csplit normally removes created files if an error occurs. If the −k option is present, csplit leaves previously created files intact.
−f prefix If the −f option is used, the created files are named prefix00 ... prefixn. The default is xx00...xxn.
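The effect of −s and −f can be seen on a small file. This is only a sketch, assuming a POSIX shell; sample.txt and the prefix piece are hypothetical names chosen for the example.

```shell
dir=$(mktemp -d) && cd "$dir"
printf 'one\ntwo\nthree\nfour\n' > sample.txt

# Without -s, the size of each piece is printed on standard output.
counts=$(csplit -f piece sample.txt 3)
echo "$counts"

# With -s nothing is printed; piece00 and piece01 are simply rewritten.
quiet=$(csplit -s -f piece sample.txt 3)

ls piece*
```

The first run should report 8 and 11, the sizes of piece00 (lines 1 and 2) and piece01 (lines 3 and 4).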
EXAMPLES
This example creates four files, cobol00...cobol03.
example% csplit -f cobol filename '/procedure division/' /par5./ /par16./
After editing the “split” files, they can be recombined as follows:
example% cat cobol0[0-3] > filename
Note: This example overwrites the original file.
This example splits the file every 100 lines, up to 10,000 lines. The −k option causes the created files to be retained if the file has fewer than 10,000 lines; however, an error message is still printed.
example% csplit -k filename 100 {99}
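The same pattern can be tried at a smaller scale. The sketch below assumes a POSIX shell with awk(1) and uses a hypothetical 25-line file, data.txt, split every 10 lines with a repeat count that is deliberately too large.

```shell
dir=$(mktemp -d) && cd "$dir"
awk 'BEGIN { for (i = 1; i <= 25; i++) print "row " i }' > data.txt

# Ask for a split every 10 lines, far more times than the file allows.
# csplit reports an error once it runs past the end of data.txt, but -k
# keeps the pieces that were already written.
status=0
csplit -k -s data.txt 10 '{99}' 2> errors.txt || status=$?

echo "exit status: $status"
wc -l xx00 xx01    # xx00 holds lines 1-9, xx01 holds lines 10-19
```

The exit status should be nonzero and errors.txt should contain the diagnostic, while the pieces created before the error remain in place.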
If prog.c follows the normal C coding convention (the last line of a routine consists only of a } in the first character position), this example creates a file for each separate C routine (up to 21) in prog.c.
example% csplit -k prog.c '%main(%' '/^}/+1' {20}
ENVIRONMENT
If any of the LC_* variables (LC_CTYPE, LC_MESSAGES, LC_TIME, LC_COLLATE, LC_NUMERIC, and LC_MONETARY; see environ(5)) is not set in the environment, the operational behavior of csplit for the corresponding locale category is determined by the value of the LANG environment variable. If LC_ALL is set, its contents override both LANG and the other LC_* variables. If none of the above variables is set in the environment, the "C" (U.S. style) locale determines how csplit behaves.
LC_CTYPE
Determines how csplit handles characters. When LC_CTYPE is set to a valid value, csplit can display and handle text and filenames containing valid characters for that locale. csplit can display and handle Extended Unix Code (EUC) characters where any individual character can be 1, 2, or 3 bytes wide. csplit can also handle EUC characters of 1, 2, or more column widths. In the "C" locale, only characters from ISO 8859-1 are valid.
LC_MESSAGES
Determines how diagnostic and informative messages are presented. This includes the language and style of the messages, and the correct form of affirmative and negative responses. In the "C" locale, the messages are presented in the default form found in the program itself (in most cases, U.S./English).
SEE ALSO
ed(1), environ(5)
DIAGNOSTICS
Self-explanatory except for:
arg − out of range
which means that the given argument did not reference a line between the current position and the end of the file.
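An out-of-range argument is easy to provoke. This sketch assumes a POSIX shell and a hypothetical four-line file, short.txt; the exact wording of the message differs between csplit implementations, so it is captured and shown, not matched.

```shell
dir=$(mktemp -d) && cd "$dir"
printf 'one\ntwo\nthree\nfour\n' > short.txt

# Line 50 does not exist in a four-line file, so csplit fails.
status=0
csplit -s short.txt 50 2> errors.txt || status=$?

echo "exit status: $status"
cat errors.txt    # exact wording differs between implementations
ls xx* 2>/dev/null || echo "no pieces left behind"
```

Because −k is not given, the files created before the error are removed again.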
SunOS 5.1/SPARC — Last change: 14 Sep 1992