tar(4) — File Formats
OSF
NAME
tar − Tape archive file format
DESCRIPTION
The tar command dumps several files into one, in a medium suitable for transportation.
A tar tape or tar file is a series of blocks, with each block of size TBLOCK. A file on the tape is represented by a header block which describes the file, followed by zero or more blocks which give the contents of the file. At the end of the tape are two blocks filled with binary zeros, as an end-of-file indicator.
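The block layout above can be sketched in C. This is a minimal illustration, not code from tar itself: the function names tar_content_blocks and tar_is_zero_block are hypothetical, and the sketch assumes that the last content block of a file is padded out to a full TBLOCK bytes.

```c
#include <stddef.h>

#define TBLOCK 512

/* Number of TBLOCK-sized content blocks that follow the header block for a
 * file of `size` bytes. Assumption: the final block is padded to full size. */
unsigned long tar_content_blocks(unsigned long size)
{
    return (size + TBLOCK - 1) / TBLOCK;
}

/* Returns 1 if the block holds only binary zeros, otherwise 0. Two such
 * blocks in a row mark the end of the archive. */
int tar_is_zero_block(const unsigned char *block)
{
    size_t i;

    for (i = 0; i < TBLOCK; i++)
        if (block[i] != 0)
            return 0;
    return 1;
}
```

Under these assumptions a 513-byte file occupies three blocks in total: one header block and two content blocks.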
The blocks are grouped for physical I/O operations. Each group of n blocks (where n is set by the b keyletter on the tar command line, with a default of 20 blocks) is written with a single system call. On nine-track tapes, the result of this write is a single tape record. The last group is always written at the full size, so blocks after the two zero blocks contain random data. On reading, the specified or default group size is used for the first read, but if that read returns less than a full tape block, the reduced block size is used for further reads.
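The grouping arithmetic can be illustrated as follows. The function names are hypothetical; the sketch only restates the rule that the last group is written at full size, so any blocks after the two zero blocks are filler.

```c
#define TBLOCK 512

/* Size in bytes of one group (one tape record) of n blocks. */
unsigned long tar_record_bytes(unsigned long n)
{
    return n * TBLOCK;
}

/* Number of filler blocks written after `used_blocks` blocks (headers,
 * contents, and the two zero blocks) so that the last group of n blocks
 * is full. Returns 0 when n is 0 to avoid dividing by zero. */
unsigned long tar_trailing_blocks(unsigned long used_blocks, unsigned long n)
{
    if (n == 0)
        return 0;
    return (n - used_blocks % n) % n;
}
```

With the default of 20 blocks, each group is 10240 bytes. An archive that uses 4 blocks (one header, one content block, two zero blocks) is followed by 16 blocks of unspecified data.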
The header block looks like:
#define TBLOCK 512
#define NAMSIZ 100

union hblock {
        char dummy[TBLOCK];
        struct header {
                char name[NAMSIZ];
                char mode[8];
                char uid[8];
                char gid[8];
                char size[12];
                char mtime[12];
                char chksum[8];
                char linkflag;
                char linkname[NAMSIZ];
        } dbuf;
};
The name field is a null-terminated string. The other fields are zero-filled octal numbers in ASCII format. If the width of each field is given as w, each field contains w-2 digits, a space, and a null, with the exception of the size and mtime fields, which do not contain the trailing null, and the chksum field, which has a null followed by a space.
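A reader might decode one of these numeric fields as sketched below. The function name tar_parse_octal is hypothetical. The sketch stops at the first character that is not an octal digit, so it accepts a trailing space, a trailing null, or both. Skipping leading spaces is an added assumption to tolerate archives that pad with blanks instead of zeros.

```c
#include <stddef.h>

/* Decode an ASCII octal header field of `width` bytes. The field need not
 * be null-terminated; at most `width` bytes are examined. */
unsigned long long tar_parse_octal(const char *field, size_t width)
{
    unsigned long long value = 0;
    size_t i = 0;

    /* Assumption: tolerate blank padding before the digits. */
    while (i < width && field[i] == ' ')
        i++;

    /* Accumulate digits until a space, a null, or the end of the field. */
    for (; i < width && field[i] >= '0' && field[i] <= '7'; i++)
        value = value * 8 + (unsigned long long)(field[i] - '0');

    return value;
}
```

For example, an 8-byte mode field holding "000644", a space, and a null decodes to octal 644 (decimal 420).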
The name field is the name of the file, as specified on the tar command line. Files dumped because they were in a directory that was named in the command line have the directory name as prefix and /filename as suffix.
The mode field is the file mode, with the top bit masked off. The uid and gid fields are the user and group numbers that own the file. The size field is the size of the file in bytes. Links and symbolic links are dumped with this field specified as zero.
The mtime field is the modification time of the file at the time it was dumped.
The chksum field is an octal ASCII value which represents the sum of all the bytes in the header block. When calculating the checksum, the chksum field is treated as if it were all blanks.
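The checksum rule can be sketched as follows. The function name and the CHKSUM_OFFSET and CHKSUM_WIDTH macros are hypothetical; the offset 148 is derived from the field widths in the header declaration (100 + 8 + 8 + 8 + 12 + 12). The sketch assumes that bytes are summed as unsigned values, which this page does not state.

```c
#include <stddef.h>

#define TBLOCK        512
#define CHKSUM_OFFSET 148   /* bytes that precede chksum in the header */
#define CHKSUM_WIDTH  8

/* Sum every byte of a header block, counting each byte of the chksum field
 * as an ASCII blank regardless of what is stored there.
 * Assumption: bytes are treated as unsigned. */
unsigned long tar_checksum(const unsigned char *block)
{
    unsigned long sum = 0;
    size_t i;

    for (i = 0; i < TBLOCK; i++) {
        if (i >= CHKSUM_OFFSET && i < CHKSUM_OFFSET + CHKSUM_WIDTH)
            sum += ' ';
        else
            sum += block[i];
    }
    return sum;
}
```

A reader would compare this value with the decoded chksum field and reject the header if the two differ.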
The linkflag field is null if the file is a regular or special file, ASCII 1 if it is a hard link, and ASCII 2 if it is a symbolic link. The name that the file is linked to, if any, is in the linkname field, with a trailing null. Unused fields of the header are binary zeros (and are included in the checksum).
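The three linkflag values might be mapped to entry kinds as in this sketch. The enum and the function name are hypothetical, and the TAR_UNKNOWN case is an addition for values this page does not define.

```c
/* Entry kinds distinguishable from the linkflag byte. */
enum tar_entry_kind {
    TAR_REGULAR_OR_SPECIAL,
    TAR_HARD_LINK,
    TAR_SYMLINK,
    TAR_UNKNOWN             /* value not described in this format */
};

/* Classify a header by its linkflag byte. */
enum tar_entry_kind tar_classify(char linkflag)
{
    switch (linkflag) {
    case '\0':
        return TAR_REGULAR_OR_SPECIAL;
    case '1':
        return TAR_HARD_LINK;
    case '2':
        return TAR_SYMLINK;
    default:
        return TAR_UNKNOWN;
    }
}
```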
The first time a given i-node number is dumped, it is dumped as a regular file. Subsequently, it is dumped as a link instead. Upon retrieval, if a link entry is retrieved but the file it was linked to is not, an error message is printed and the tape must be manually rescanned to retrieve the file that it is linked to.
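The first-seen rule for i-nodes can be illustrated with a small lookup table. This is a sketch and not how any particular tar implementation stores its state: the names seen_table and seen_lookup_or_add and the MAX_SEEN limit are hypothetical. The table is keyed on the i-node number alone, as described above; a real implementation would probably also compare device numbers.

```c
#include <stddef.h>
#include <string.h>

#define NAMSIZ   100
#define MAX_SEEN 64     /* hypothetical table capacity */

struct seen_inode {
    unsigned long ino;
    char name[NAMSIZ];  /* name under which the i-node was first dumped */
};

struct seen_table {
    struct seen_inode entries[MAX_SEEN];
    size_t count;
};

/* Returns NULL the first time `ino` is seen (dump as a regular file) and
 * records `name` for it. On later calls with the same `ino`, returns the
 * recorded name (dump as a link to that name). If the table is full, new
 * i-nodes are not recorded and are always treated as first-seen. */
const char *seen_lookup_or_add(struct seen_table *t, unsigned long ino,
                               const char *name)
{
    size_t i;

    for (i = 0; i < t->count; i++)
        if (t->entries[i].ino == ino)
            return t->entries[i].name;

    if (t->count < MAX_SEEN) {
        strncpy(t->entries[t->count].name, name, NAMSIZ - 1);
        t->entries[t->count].name[NAMSIZ - 1] = '\0';
        t->entries[t->count].ino = ino;
        t->count++;
    }
    return NULL;
}
```

The returned name is what would be stored in the linkname field of the later entry, with linkflag set to ASCII 1.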
The encoding of the header is designed to be portable across machines.
RELATED INFORMATION
Commands: tar(1)