A.OUT(5) — FILE FORMATS
NAME
a.out − assembler and link editor output format
SYNOPSIS
#include <a.out.h>
#include <stab.h>
#include <nlist.h>
DESCRIPTION
A.out is the output format of the assembler as(1) and the link editor ld(1). The link editor makes a.out executable files if there were no errors and no unresolved external references. Layout information as given in the include file for the Sun system is:
/*
 * Header prepended to each a.out file.
 */
struct exec {
    unsigned short  a_machtype; /* machine type */
    unsigned short  a_magic;    /* magic number */
    unsigned        a_text;     /* size of text segment */
    unsigned        a_data;     /* size of initialized data */
    unsigned        a_bss;      /* size of uninitialized data */
    unsigned        a_syms;     /* size of symbol table */
    unsigned        a_entry;    /* entry point */
    unsigned        a_trsize;   /* size of text relocation */
    unsigned        a_drsize;   /* size of data relocation */
};
#define M_68010 1       /* runs on either MC68010 or MC68020 */
#define M_68020 2       /* runs only on MC68020 */
#define OMAGIC  0407    /* magic number for old impure format */
#define NMAGIC  0410    /* magic number for read-only text */
#define ZMAGIC  0413    /* magic number for demand load format */
#define PAGSIZ  0x2000  /* page size - same for Sun-2 and Sun-3 */
#define SEGSIZ  0x20000 /* segment size - same for Sun-2 and Sun-3 */
/*
 * The following macros take exec structures as arguments. N_BADMAG(x) returns
 * 0 if the file has a reasonable magic number.
 */
#define N_BADMAG(x) \
    (((x).a_magic)!=OMAGIC && ((x).a_magic)!=NMAGIC && ((x).a_magic)!=ZMAGIC)
/*
 * Offsets to text|symbols|strings.
 */
#define N_TXTOFF(x) \
    ((x).a_magic==ZMAGIC ? 0 : sizeof (struct exec))
#define N_SYMOFF(x) \
    (N_TXTOFF(x) + (x).a_text+(x).a_data + (x).a_trsize+(x).a_drsize)
#define N_STROFF(x) \
    (N_SYMOFF(x) + (x).a_syms)
/*
 * Macros which take exec structures as arguments and tell where the
 * various pieces will be loaded.
 */
#define N_TXTADDR(x) PAGSIZ
#define N_DATADDR(x) \
    (((x).a_magic==OMAGIC)? (N_TXTADDR(x)+(x).a_text) \
    : (SEGSIZ+((N_TXTADDR(x)+(x).a_text-1) & ~(SEGSIZ-1))))
#define N_BSSADDR(x) (N_DATADDR(x)+(x).a_data)
The a.out file has five sections: a header, the program text and data, relocation information, a symbol table, and a string table (in that order). The header gives the size of each section in bytes. The last three sections may be absent if the program was loaded with the ‘-s’ option of ld or if the symbols and relocation information have been removed by strip(1).
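As an illustration of the header layout, the following sketch decodes a header from a byte buffer and rejects a bad magic number. It is not part of the include file: the fixed-width struct is a local copy (assuming a 16-bit short and 32-bit unsigned, as on the Sun-2 and Sun-3), the input is assumed to be in MC680x0 (big-endian) byte order, and decode_exec, is_stripped, and EXEC_BYTES are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Local copy of the header with fixed-width types (an assumption, so the
 * sketch also builds on hosts whose short/unsigned sizes differ). */
struct exec {
    uint16_t a_machtype;    /* machine type */
    uint16_t a_magic;       /* magic number */
    uint32_t a_text;        /* size of text segment */
    uint32_t a_data;        /* size of initialized data */
    uint32_t a_bss;         /* size of uninitialized data */
    uint32_t a_syms;        /* size of symbol table */
    uint32_t a_entry;       /* entry point */
    uint32_t a_trsize;      /* size of text relocation */
    uint32_t a_drsize;      /* size of data relocation */
};

#define OMAGIC 0407
#define NMAGIC 0410
#define ZMAGIC 0413
#define N_BADMAG(x) \
    (((x).a_magic)!=OMAGIC && ((x).a_magic)!=NMAGIC && ((x).a_magic)!=ZMAGIC)

#define EXEC_BYTES 32       /* hypothetical name: 2 + 2 + 7 * 4 bytes */

/* Read big-endian integers from a byte buffer. */
static uint16_t be16(const unsigned char *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Hypothetical helper: decode the header at the start of buf.
 * Returns 0 on success, -1 if buf is too short or the magic is bad. */
int decode_exec(const unsigned char *buf, size_t len, struct exec *out)
{
    if (len < EXEC_BYTES)
        return -1;
    out->a_machtype = be16(buf);
    out->a_magic    = be16(buf + 2);
    out->a_text     = be32(buf + 4);
    out->a_data     = be32(buf + 8);
    out->a_bss      = be32(buf + 12);
    out->a_syms     = be32(buf + 16);
    out->a_entry    = be32(buf + 20);
    out->a_trsize   = be32(buf + 24);
    out->a_drsize   = be32(buf + 28);
    return N_BADMAG(*out) ? -1 : 0;
}

/* Hypothetical helper: nonzero if the header records neither symbols
 * nor relocation information, as after ld -s or strip(1). */
int is_stripped(const struct exec *x)
{
    return x->a_syms == 0 && x->a_trsize == 0 && x->a_drsize == 0;
}
```

A caller would read the first 32 bytes of the file into a buffer and pass them to decode_exec before trusting any of the size fields.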
The machine type in the header indicates the type of hardware on which the object code may be executed. Sun-2 code may be executed on Sun-3 systems, but not vice versa. Program files predating release 3.0 are recognized by a machine type of 0.
If the magic number in the header is OMAGIC (0407), it means that this is a non-sharable text which is not to be write-protected, so the data segment is immediately contiguous with the text segment. This is rarely used. If the magic number is NMAGIC (0410) or ZMAGIC (0413), the data segment begins at the first segment boundary following the text segment, and the text segment is not writable by the program; other processes executing the same file will share the text segment. For ZMAGIC format, the text and data sizes must both be multiples of the page size, and the pages of the file will be brought into the running image as needed, and not pre-loaded as with the other formats. This is suitable for large programs and is the default format produced by ld(1). The macros N_TXTADDR, N_DATADDR, and N_BSSADDR give the memory addresses at which the text, data, and bss segments, respectively, will be loaded.
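The address rules above can be checked with a small sketch that reuses the macros from the include file on a local, fixed-width copy of struct exec (an assumption, so that it builds on hosts other than the Sun). The function print_load_map is a hypothetical name used only for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Local fixed-width copy of the header (assumption: 16-bit short,
 * 32-bit unsigned). */
struct exec {
    uint16_t a_machtype, a_magic;
    uint32_t a_text, a_data, a_bss, a_syms, a_entry, a_trsize, a_drsize;
};

#define OMAGIC 0407
#define NMAGIC 0410
#define ZMAGIC 0413
#define PAGSIZ 0x2000
#define SEGSIZ 0x20000

#define N_TXTADDR(x) PAGSIZ
#define N_DATADDR(x) \
    (((x).a_magic==OMAGIC)? (N_TXTADDR(x)+(x).a_text) \
    : (SEGSIZ+((N_TXTADDR(x)+(x).a_text-1) & ~(SEGSIZ-1))))
#define N_BSSADDR(x) (N_DATADDR(x)+(x).a_data)

/* Hypothetical helper: print where each segment will be loaded. */
void print_load_map(const struct exec *x)
{
    printf("text at %#lx\n", (unsigned long)N_TXTADDR(*x));
    printf("data at %#lx\n", (unsigned long)N_DATADDR(*x));
    printf("bss  at %#lx\n", (unsigned long)N_BSSADDR(*x));
}
```

For a ZMAGIC file with a_text of 0x6000 and a_data of 0x4000, the text ends at 0x7fff, so the data is placed at the next segment boundary, 0x20000, and the bss at 0x24000. For an OMAGIC file with a_text of 0x1234, the data follows the text directly at 0x3234.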
In the ZMAGIC format, the size of the header is included in the size of the text section; in other formats, it is not.
When an a.out file is executed, three logical segments are set up: the text segment, the data segment (with uninitialized data, which starts off as all 0, following initialized data), and a stack. For the ZMAGIC format, the header is loaded with the text segment; for other formats it is not.
Program execution begins at the address given by the value of the a_entry field. In all file types other than ZMAGIC, this is the same as N_TXTADDR(x). In ZMAGIC files it is N_TXTADDR(x) + sizeof(struct exec).
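The relationship between the entry point and the text address can be written as a short sketch. The struct is a local fixed-width copy (an assumption that makes sizeof (struct exec) equal 32 on any host), and usual_entry is a hypothetical name; the value actually used at run time is whatever a_entry holds.

```c
#include <stdint.h>

/* Local fixed-width copy of the header (assumption: 16-bit short,
 * 32-bit unsigned, giving a 32-byte header). */
struct exec {
    uint16_t a_machtype, a_magic;
    uint32_t a_text, a_data, a_bss, a_syms, a_entry, a_trsize, a_drsize;
};

#define ZMAGIC 0413
#define PAGSIZ 0x2000
#define N_TXTADDR(x) PAGSIZ

/* Hypothetical helper: the entry address described in the text. In a
 * ZMAGIC file the header is loaded with the text, so the first
 * instruction sits just past it. */
uint32_t usual_entry(const struct exec *x)
{
    uint32_t base = N_TXTADDR(*x);

    if (x->a_magic == ZMAGIC)
        return base + (uint32_t)sizeof (struct exec);
    return base;
}
```

Comparing usual_entry(&hdr) with hdr.a_entry is one way to sanity-check a header, under the assumptions stated above.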
The stack starts at the highest possible location in the memory image, and grows downwards. The stack is automatically extended as required. The data segment is extended as requested by brk(2) or sbrk(2).
After the header in the file follow the text, data, text relocation, data relocation, symbol table and string table, in that order. The text begins at the beginning of the file for ZMAGIC format or just after the header for the other formats. The N_TXTOFF macro returns this absolute file position when given the name of an exec structure as argument. The data segment is contiguous with the text and is immediately followed by the text relocation and then the data relocation information. The symbol table follows all this; its position is computed by the N_SYMOFF macro. Finally, the string table immediately follows the symbol table, at the position given by N_STROFF. The first 4 bytes of the string table are not used for string storage, but rather contain the size of the string table; this size includes the 4 bytes; thus, the minimum string table size is 4.
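The file layout can be worked through with the offset macros. The sketch below applies them to a local fixed-width copy of struct exec (an assumption; it makes the header 32 bytes on any host) and adds strtab_size, a hypothetical helper that reads the 4-byte size word at the start of the string table, assuming big-endian byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Local fixed-width copy of the header (assumption: 16-bit short,
 * 32-bit unsigned). */
struct exec {
    uint16_t a_machtype, a_magic;
    uint32_t a_text, a_data, a_bss, a_syms, a_entry, a_trsize, a_drsize;
};

#define OMAGIC 0407
#define ZMAGIC 0413

#define N_TXTOFF(x) \
    ((x).a_magic==ZMAGIC ? 0 : sizeof (struct exec))
#define N_SYMOFF(x) \
    (N_TXTOFF(x) + (x).a_text+(x).a_data + (x).a_trsize+(x).a_drsize)
#define N_STROFF(x) \
    (N_SYMOFF(x) + (x).a_syms)

/* Hypothetical helper: p points at the first byte of the string table.
 * Returns the table size, including the 4-byte size word itself, or 0
 * if the stored value is below the minimum of 4. */
uint32_t strtab_size(const unsigned char *p)
{
    uint32_t n = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                 ((uint32_t)p[2] << 8) | (uint32_t)p[3];

    return n < 4 ? 0 : n;
}
```

For an OMAGIC file with a_text = 0x100, a_data = 0x40, a_trsize = 16, a_drsize = 8 and a_syms = 24, the text starts at offset 32, the symbol table at 32 + 256 + 64 + 16 + 8 = 376, and the string table at 400.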
RELOCATION
The value of a byte in the text or data which is not a portion of a reference to an undefined external symbol is exactly that value which will appear in memory when the file is executed. If a byte in the text or data involves a reference to an undefined external symbol, as indicated by the relocation information, then the value stored in the file is an offset from the associated external symbol. When the file is processed by the link editor and the external symbol becomes defined, the value of the symbol is added to the bytes in the file.
If relocation information is present, it amounts to eight bytes per relocatable datum as in the following structure:
/*
 * Format of a relocation datum.
 */
struct relocation_info {
    int         r_address;      /* address which is relocated */
    unsigned    r_symbolnum:24, /* local symbol ordinal */
                r_pcrel:1,      /* was relocated pc relative already */
                r_length:2,     /* 0=byte, 1=word, 2=long */
                r_extern:1,     /* does not include value of sym referenced */
                :4;             /* nothing, yet */
};
There is no relocation information if a_trsize+a_drsize==0. If r_extern is 0, then r_symbolnum is actually an n_type for the relocation (for example, N_TEXT, meaning relative to the origin of the text segment).
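To make the eight-byte datum concrete, the sketch below unpacks one entry from raw bytes. It rests on two assumptions that the text does not state: the bytes are in big-endian order, and the compiler allocated the bit-fields from the most significant bit down, so that r_symbolnum is the top 24 bits of the second word. The names struct reloc, decode_reloc, and reloc_width are hypothetical.

```c
#include <stdint.h>

/* Unpacked form of one relocation datum (hypothetical type). */
struct reloc {
    uint32_t r_address;     /* address which is relocated */
    uint32_t r_symbolnum;   /* symbol ordinal, or an n_type if !r_extern */
    unsigned r_pcrel;       /* 1 if pc relative */
    unsigned r_length;      /* 0=byte, 1=word, 2=long */
    unsigned r_extern;      /* 1 if r_symbolnum names a symbol */
};

static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Decode the 8 bytes at p. Assumed bit layout of the second word:
 * bits 31..8 r_symbolnum, bit 7 r_pcrel, bits 6..5 r_length,
 * bit 4 r_extern, bits 3..0 unused. */
void decode_reloc(const unsigned char *p, struct reloc *r)
{
    uint32_t w = be32(p + 4);

    r->r_address   = be32(p);
    r->r_symbolnum = w >> 8;
    r->r_pcrel     = (w >> 7) & 1;
    r->r_length    = (w >> 5) & 3;
    r->r_extern    = (w >> 4) & 1;
}

/* Width in bytes of the relocated field: 1, 2 or 4. */
unsigned reloc_width(const struct reloc *r)
{
    return 1u << r->r_length;
}
```

The number of entries in each relocation section is a_trsize / 8 or a_drsize / 8, since every datum occupies eight bytes.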
SYMBOL TABLE
The layout of a symbol table entry and the principal flag values that distinguish symbol types are given in the include file as follows:
/*
 * Format of a symbol table entry.
 */
struct nlist {
    union {
        char    *n_name;    /* for use when in-memory */
        long    n_strx;     /* index into file string table */
    } n_un;
    unsigned char   n_type; /* type flag, that is, N_TEXT etc; see below */
    char            n_other;
    short           n_desc; /* see <stab.h> */
    unsigned        n_value;/* value of this symbol (or adb offset) */
};
#define n_hash  n_desc      /* used internally by ld */
/*
 * Simple values for n_type.
 */
#define N_UNDF  0x0     /* undefined */
#define N_ABS   0x2     /* absolute */
#define N_TEXT  0x4     /* text */
#define N_DATA  0x6     /* data */
#define N_BSS   0x8     /* bss */
#define N_COMM  0x12    /* common (internal to ld) */
#define N_FN    0x1f    /* file name symbol */
#define N_EXT   01      /* external bit, or'ed in */
#define N_TYPE  0x1e    /* mask for all the type bits */
/*
 * Other permanent symbol table entries have some of the N_STAB bits set.
 * These are given in <stab.h>
 */
#define N_STAB  0xe0    /* if any of these bits set, don't discard */
In the a.out file a symbol’s n_un.n_strx field gives an index into the string table. An n_strx value of 0 indicates that no name is associated with a particular symbol table entry. The field n_un.n_name can be used to refer to the symbol name only if the program sets this up using n_strx and appropriate data from the string table. Because of the union in the nlist declaration, it is impossible in C to statically initialize such a structure. If this must be done (as when using nlist(3)) the file <nlist.h> should be included, rather than <a.out.h>; this contains the declaration without the union.
If a symbol’s type is undefined external, and the value field is non-zero, the symbol is interpreted by the loader ld as the name of a common region whose size is indicated by the value of the symbol.
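The name lookup and the common-region rule can be sketched as follows. The code works on raw bytes rather than on struct nlist itself, assuming a 12-byte on-disk entry in big-endian order and taking n_strx as an offset from the start of the string table (size word included). The names struct sym, decode_nlist, sym_name, and is_common are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define N_UNDF 0x0      /* undefined */
#define N_EXT  01       /* external bit, or'ed in */
#define N_TYPE 0x1e     /* mask for all the type bits */
#define N_STAB 0xe0     /* if any of these bits set, don't discard */

#define NLIST_BYTES 12  /* hypothetical name: 4 + 1 + 1 + 2 + 4 bytes */

/* Unpacked symbol table entry (hypothetical type). */
struct sym {
    uint32_t n_strx;    /* index into file string table */
    uint8_t  n_type;    /* type flag */
    int8_t   n_other;
    int16_t  n_desc;    /* see <stab.h> */
    uint32_t n_value;   /* value of this symbol */
};

static uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Decode the NLIST_BYTES bytes at p. */
void decode_nlist(const unsigned char *p, struct sym *s)
{
    s->n_strx  = be32(p);
    s->n_type  = p[4];
    s->n_other = (int8_t)p[5];
    s->n_desc  = (int16_t)(((unsigned)p[6] << 8) | p[7]);
    s->n_value = be32(p + 8);
}

/* Return the symbol's name, or NULL if it has none (n_strx == 0) or
 * the index does not lead to a terminated string inside the table. */
const char *sym_name(const struct sym *s, const char *strtab, size_t strsize)
{
    if (s->n_strx < 4 || s->n_strx >= strsize)
        return NULL;
    if (memchr(strtab + s->n_strx, '\0', strsize - s->n_strx) == NULL)
        return NULL;
    return strtab + s->n_strx;
}

/* Nonzero if ld would treat the symbol as a common region: undefined,
 * external, with a non-zero value giving the region's size. */
int is_common(const struct sym *s)
{
    return s->n_type == (N_UNDF | N_EXT) && s->n_value != 0;
}
```

The number of entries in the symbol table is a_syms / NLIST_BYTES under the size assumption above; an entry with any N_STAB bit set is a debugger symbol described in <stab.h> and is not covered by this sketch.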
SEE ALSO
adb(1), as(1), cc(1V), dbx(1), ld(1), nm(1), pc(1), strip(1)
Sun Release 3.2 — Last change: 29 April 1986