cut(1)

NAME

cut − cut out selected fields of each line of a file

SYNOPSIS

cut −b list [ −n ] [ filename ... ]
cut −c list [ filename ... ]
cut −f list [ −d delim ] [ −s ] [ filename ... ]

AVAILABILITY

SUNWcsu

DESCRIPTION

Use cut to cut out columns from a table or fields from each line of a file; in database parlance, it implements the projection of a relation.  The fields specified by list can be fixed length, that is, character positions as on a punched card (−c option), or they can vary in length from line to line and be marked with a field delimiter character such as tab (−f option).  cut can be used as a filter; if no files are given, the standard input is used.  In addition, a file name of “−” explicitly refers to standard input.

One of the −b, −c, or −f options must be specified.

Use grep(1) to make horizontal “cuts” (by context) through a file, or paste(1) to put files together column-wise (that is, horizontally).  To reorder columns in a table, use cut and paste. 
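Because list is given in increasing order, cut by itself leaves the columns in their original sequence. The following is a minimal sketch of the cut and paste approach to reordering, typed with ASCII hyphens. The sample table and the temporary file names are hypothetical, and mktemp is assumed to be available.

```shell
# Hypothetical sample data: a table with three tab-separated columns.
tmp=$(mktemp -d)
printf 'a1\tb1\tc1\na2\tb2\tc2\n' > "$tmp/table"

# Cut the table into two pieces ...
cut -f1 "$tmp/table" > "$tmp/col1"
cut -f2,3 "$tmp/table" > "$tmp/col23"

# ... then paste the pieces back together in the new order.
# paste separates the pieces with a tab by default.
reordered=$(paste "$tmp/col23" "$tmp/col1")
printf '%s\n' "$reordered"

rm -rf "$tmp"
```

The first column of the sample table ends up as the last column of the output.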

OPTIONS

list
A comma-separated list of integer field numbers (in increasing order), with an optional − to indicate ranges; for instance, 1,4,7; 1−3,8; −5,10 (short for 1−5,10); or 3− (short for the third through last field).

−b list
The list following −b specifies byte positions (for instance, −b1−72 would pass the first 72 bytes of each line).  When −b and −n are used together, list is adjusted so that no multi-byte character is split.  If −b is used, the input line should contain 1023 bytes or fewer.

−c list
The list following −c specifies character positions (for instance, −c1−72 would pass the first 72 characters of each line).

−d delim
The character following −d is the field delimiter (−f option only).  The default is tab.  Space or other characters with special meaning to the shell must be quoted.  delim can be a multi-byte character.

−f list
The list following −f is a list of fields assumed to be separated in the file by a delimiter character (see −d); for instance, −f1,7 copies the first and seventh fields only.  Lines with no field delimiters are passed through intact (useful for table subheadings), unless −s is specified.  If −f is used, the input line should contain 1023 characters or fewer.

−n
Do not split characters.  When −b list and −n are used together, list is adjusted so that no multi-byte character is split.

−s
Suppresses lines with no delimiter characters when the −f option is used.  Unless −s is specified, lines with no delimiters are passed through untouched.
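The options above can be illustrated with a short sketch, typed with ASCII hyphens. The input data and the variable names are hypothetical; the heading line deliberately contains no delimiter so that the effect of −s is visible.

```shell
# Hypothetical input: one heading line with no delimiter,
# followed by two colon-delimited records.
data=$(printf 'Heading\na:b:c:d:e\n1:2:3:4:5')

# Fields 1 and 3 through the last; the heading passes through intact.
with_heading=$(printf '%s\n' "$data" | cut -d: -f1,3-)

# The same list with -s: the undelimited heading line is suppressed.
without_heading=$(printf '%s\n' "$data" | cut -d: -s -f1,3-)

# -c selects fixed character positions; -3 is short for 1-3.
first_three=$(printf '%s\n' "$data" | cut -c-3)

printf '%s\n---\n%s\n---\n%s\n' "$with_heading" "$without_heading" "$first_three"
```

The −b and −n options are left out of the sketch because their results depend on the locale and on how a given implementation treats multi-byte characters.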

EXAMPLES

To display a mapping of user IDs (login names) to names:

example% cut -d: -f1,5 /etc/passwd

To set name to the current login name:

example% name=`who am i | cut -f1 -d" "`
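The two examples above depend on the local password file and login session. The sketch below applies the same two cut invocations to fixed, hypothetical input so that the result can be checked by eye; the records and the who-style line are invented for illustration.

```shell
# Hypothetical passwd-style records (login:passwd:uid:gid:name:home:shell).
sample='root:x:0:1:Super-User:/:/sbin/sh
alice:x:100:10:Sample User:/home/alice:/bin/csh'

# Fields 1 and 5: the login name and the name field.
mapping=$(printf '%s\n' "$sample" | cut -d: -f1,5)
printf '%s\n' "$mapping"

# Hypothetical who-style line. The delimiter is a space, so it is quoted.
first_word=$(printf 'alice console Sep 14 09:30\n' | cut -d' ' -f1)
printf '%s\n' "$first_word"
```

Note that the selected fields are written with the same delimiter that separated them in the input.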

ENVIRONMENT

If any of the LC_* variables (LC_CTYPE, LC_MESSAGES, LC_TIME, LC_COLLATE, LC_NUMERIC, and LC_MONETARY; see environ(5)) are not set in the environment, the operational behavior of cut for each corresponding locale category is determined by the value of the LANG environment variable.  If LC_ALL is set, its contents are used to override both the LANG and the other LC_* variables.  If none of the above variables is set in the environment, the "C" (U.S. style) locale determines how cut behaves.

LC_CTYPE
Determines how cut handles characters. When LC_CTYPE is set to a valid value, cut can display and handle text and filenames containing valid characters for that locale.  cut can display and handle Extended Unix Code (EUC) characters where any individual character can be 1, 2, or 3 bytes wide.  cut can also handle EUC characters of 1, 2, or more column widths. In the "C" locale, only characters from ISO 8859-1 are valid. 

LC_MESSAGES
Determines how diagnostic and informative messages are presented. This includes the language and style of the messages, and the correct form of affirmative and negative responses.  In the "C" locale, the messages are presented in the default form found in the program itself (in most cases, U.S. English).

SEE ALSO

grep(1), paste(1), environ(5)

DIAGNOSTICS

"ERROR:  line too long"
A line has more than 1023 characters or fields, or it lacks a terminating new-line character.

"ERROR:  bad list for c/f option"
Missing −c or −f option or incorrectly specified list. No error occurs if a line has fewer fields than the list calls for. 

"ERROR:  no fields"
The list is empty. 

"ERROR:  no delimeter"
Missing char on −d option. 

"ERROR:  cannot handle multiple adjacent backspaces"
Adjacent backspaces cannot be processed correctly.

"WARNING:  cannot open <filename>"
Either filename cannot be read or does not exist.  If multiple filenames are present, processing continues. 
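The behavior with multiple filenames can be checked with a sketch such as the following. The file names are hypothetical and mktemp is assumed to be available. The wording of the message and the exit status differ between implementations, so the sketch discards the former and only captures the latter.

```shell
# Hypothetical files: one that exists and one that does not.
tmp=$(mktemp -d)
printf 'x\ty\n' > "$tmp/present"

# The unreadable file is reported on standard error (discarded here);
# the readable file is still processed.
status=0
out=$(cut -f2 "$tmp/missing" "$tmp/present" 2>/dev/null) || status=$?
printf 'output=%s status=%s\n' "$out" "$status"

rm -rf "$tmp"
```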

SunOS 5.1/SPARC  —  Last change: 14 Sep 1992
