Linker Reference Manual

13,610 bytes added, 16:02, 26 July 2017

HI-TECH C incorporates a relocating assembler and
linker to permit separate compilation of C source files.
This means that a program may be divided into several source
files, each of which may be kept to a manageable size for
ease of editing and compilation. Each source file is then
compiled separately, and finally all the object files are
linked together into a single executable program.

The assembler is described in the machine-specific
manual. This appendix describes the theory behind and the
usage of the linker.

== Relocation and Psects ==

The fundamental task of the linker is to combine
several relocatable object files into one. The object files
are said to be relocatable because they contain sufficient
information for any references to program or data addresses
(e.g. the address of a function) within the file to be
adjusted according to where the file is ultimately located
in memory after the linkage process. Relocation may take two
basic forms: relocation by name, i.e. relocation by the
ultimate value of a global symbol, or relocation by psect,
i.e. relocation by the base address of a particular section
of code, for example the section of code containing the
actual executable instructions.
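
As a rough illustration of relocation by psect, the sketch below patches address fields in a psect image by adding the base address at which the psect was finally located. The function name `reloc_by_psect`, the list-of-offsets representation and the 16-bit little-endian field layout are assumptions made for this example; they do not describe the actual HI-TECH object file format.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: add `base` to each 16-bit little-endian address
 * stored at the given offsets inside `image`. Offsets whose two bytes
 * would not fit inside the image are skipped.
 * Returns the number of fields actually patched. */
size_t reloc_by_psect(uint8_t *image, size_t image_len,
                      const size_t *offsets, size_t n_offsets,
                      uint16_t base)
{
    size_t patched = 0;

    for (size_t i = 0; i < n_offsets; i++) {
        size_t off = offsets[i];

        if (image_len < 2 || off > image_len - 2)
            continue;                        /* field would overrun */

        uint16_t value = (uint16_t)(image[off] | (image[off + 1] << 8));
        value = (uint16_t)(value + base);    /* wraps modulo 64K */
        image[off] = (uint8_t)(value & 0xFF);
        image[off + 1] = (uint8_t)(value >> 8);
        patched++;
    }
    return patched;
}
```

Relocation by name would follow the same pattern, except that the value added would be the final value of a global symbol rather than the base of a psect.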

=== Program Sections ===

Any object file may contain bytes to be stored in
memory in one or more program sections, which will be
referred to as psects. These psects represent logical groupings
of certain types of code bytes in the program. The
section of the program containing executable instructions is
normally referred to as the text psect. Other sections are
the initialized data psect, called simply the data psect,
and the uninitialized data psect, called the bss psect.

The linker will in fact handle any number of psects,
and more may be used in special applications. However, the
C compiler uses only the three mentioned, and the names
text, data and bss are simply chosen for identification;
the linker assigns no special significance to the name of a
psect.

The difference between the data and bss psects may be
illustrated by considering two external variables: one is
initialized to the value 1, and the other is not
initialized. The first will be placed into the data psect,
and the second in the bss psect. The bss psect is always
cleared to zeros on startup of the program, thus the second
variable will be initialized at run time to zero. The first
will however occupy space in the program file, and will have
its initialized value of 1 at startup. It is quite possible
to modify the value of a variable in the data psect during
execution; however, it is better practice not to do so, since
this leads to more consistent use of variables, and allows
for restartable and ROMable programs.
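
The two variables described above might look like this in C. The psect placement noted in the comments is the one this manual describes; the helper `startup_sum` is a hypothetical name added only so that the values can be inspected.

```c
/* External (file-scope) variables, as in the description above. */
int initialized = 1;   /* has an initializer: placed in the data psect */
int uninitialized;     /* no initializer: placed in the bss psect,
                          which is cleared to zero at startup */

/* Returns the sum of both variables; this is 1 as long as neither
 * variable has been modified since startup. */
int startup_sum(void)
{
    return initialized + uninitialized;
}
```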

The text psect is the section into which all executable
instructions are placed. On CP/M-80 the text psect will
normally start at the base of the TPA, which is where execution
commences. The data psect will normally follow the text
psect, and the bss will be last. The bss does not occupy
space in the program (.COM) file. This ordering of psects
may be overridden by an option to the linker. This is
especially useful when producing code for special hardware.
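
The default ordering can be summarized with a little arithmetic. The sketch below assumes the psect sizes are already known; the `layout` structure and the `default_layout` function are hypothetical names used only for illustration.

```c
/* Start address of each psect, plus the size of the program file. */
struct layout {
    unsigned long text;       /* start of the text psect */
    unsigned long data;       /* start of the data psect */
    unsigned long bss;        /* start of the bss psect */
    unsigned long file_size;  /* bytes occupied in the program file */
};

/* Default ordering: text at the base of the TPA, data directly after
 * text, bss last. The bss psect contributes nothing to the file size. */
struct layout default_layout(unsigned long tpa_base,
                             unsigned long text_size,
                             unsigned long data_size)
{
    struct layout l;

    l.text = tpa_base;
    l.data = l.text + text_size;
    l.bss = l.data + data_size;
    l.file_size = text_size + data_size;
    return l;
}
```

For example, with the TPA at 100H, 800H bytes of text and 200H bytes of data, the data psect starts at 900H, the bss at 0B00H, and the file holds 0A00H bytes.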

For MS-DOS and CP/M-86 the psects are ordered in the
same way, but since the 8086 processor has segment registers
providing relocation, both the text and data psects start at
0, even though they will be loaded one after the other in
memory. This allows 64K of code and 64K of data and stack.
Sufficient information is placed in the executable file
(.EXE or .CMD) for the operating system to load the program
into memory.

=== Local Psects and the Large Model ===

Since for practical purposes the psects are limited to
64K on the 8086, to allow more than 64K code the compiler
makes use of local psects. A psect is considered local if
the .psect directive has a LOCAL flag. Any number of local
psects may be linked from different modules without being
combined even if they have the same name. Note however that
no local psect may have the same name as a global psect.

All references to a local psect within the same module
(or within the same library) will be treated as references
to the same psect. Between modules however two local psects
of the same name are treated as distinct. In order to allow
collective referencing of local psects via the -P option
(described later) a local psect may have a class name
associated with it. This is achieved with the CLASS flag on
the .psect directive.
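
The combining rule for global and local psects can be expressed as a small predicate. This is only a sketch of the rule as stated above: the `psect_ref` structure, its field names and the use of an integer module identifier are assumptions, and modules drawn from the same library would need to share one identifier.

```c
#include <string.h>

/* Hypothetical description of one psect as seen in one module. */
struct psect_ref {
    const char *name;  /* psect name */
    int is_local;      /* nonzero if declared with the LOCAL flag */
    int module_id;     /* identifies the module (or library) it came from */
};

/* Returns 1 if the two references denote the same psect, else 0.
 * Global psects are combined by name alone; local psects are the same
 * only when both the name and the module match. A local and a global
 * psect may not share a name, so that case is reported as "different". */
int refers_to_same_psect(const struct psect_ref *a,
                         const struct psect_ref *b)
{
    if (strcmp(a->name, b->name) != 0)
        return 0;
    if (a->is_local || b->is_local)
        return a->is_local && b->is_local && a->module_id == b->module_id;
    return 1;
}
```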

== Global Symbols ==

The linker handles only symbols which have been
declared as global to the assembler. From the C source
level, this means all names which have storage class
external and which are not declared as static. These symbols
may be referred to by modules other than the one in which
they are defined. It is the linker's job to match up the
definition of a global symbol with the references to it.
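
At the C source level the distinction looks like the following. The identifiers are hypothetical; only the storage classes matter.

```c
int shared_counter;             /* external, not static: a global symbol
                                   that other modules may refer to */
static int private_counter;     /* static: not a global symbol, so it
                                   cannot be referenced from other modules */

static void bump_private(void)  /* static function: local to this module */
{
    private_counter++;
}

int bump_both(void)             /* global function: callable from other
                                   modules once the linker has matched
                                   the references to this definition */
{
    bump_private();
    shared_counter++;
    return shared_counter + private_counter;
}
```

Another module could declare `extern int shared_counter;` and call `bump_both()`, but it has no way to name `private_counter` or `bump_private`.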

== Operation ==

A command to the linker takes the following form:

LINK options files ...


Options is zero or more linker options, each of which
modifies the behaviour of the linker in some way. Files is
one or more object files, and zero or more library names.
The options recognized by the linker are listed below; they
are recognized in either upper or lower case.

-R Leave the output relocatable.

-L Retain absolute relocation info. -LM will retain only
segment relocation information.

-I Ignore undefined symbols.

-N Sort symbols by address.

-Caddr
Produce a binary output file offset by addr.

-S Strip symbol information from the output file.

-X Suppress local symbols in the output file.

-Z Suppress trivial (compiler-generated) symbols in the
output file.

-Oname
Call the output file name.

-Pspec
Spec is a psect location specification.

-Mname
Write a link map to the file name.

-Usymbol
Make symbol initially undefined.

-Dfile
Write a symbol file.

-Wwidth
Specify map width.

Taking each of these in turn:

The -R option will instruct the linker to leave the
output file (as named by a -O option, or l.obj by default)
relocatable. This is normally because there are further
files to be linked in, and the output of this link will be
used as input to the linker subsequently. Without this
option, the linker will make the output file absolute, that
is with all relocatable addresses made into absolute
references. This option may not be used with the -L or -C
options.

The -L option will cause the linker to output null
relocation information even though the file will be
absolute. This information allows self-relocating programs
to know what addresses must be relocated at run time. This
option is not usable with the -C option. In order to create
an executable file (i.e. a .COM file) the program objtohex
must be used. If a -LM option is used, only segment
relocation information will be retained. This is used in
conjunction with the large memory model. Objtohex will use
the relocation information (when invoked with a -L flag) to
insert segment relocation addresses into the executable
file.

The -I option is used when it is desired to link code
which contains symbols which are not defined in any module.
This is normally only used during top-down program
development, when routines are referenced in code written before
the routines themselves have been coded.

When obtaining a link map via the -M option, the symbol
table is by default sorted in order of symbol name. To sort
in order of address, the -N option may be used.

The output of the linker is by default an object file.
To create an executable program, this must be converted into
an executable image. For CP/M this is a .COM file, which is
simply an image of the executable program as it should
appear in memory, starting at location 100H. The linker will
produce such a file with the -C100H option. File formats for
other applications requiring an image binary file may also
be produced with the -C option. The address following the
-C may be given in decimal (default), octal (by using an o or
O suffix) or hexadecimal (by using an h or H suffix).
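
The three address notations can be handled by a short routine such as the one below. This is a minimal sketch of the rules just stated, not the linker's own code; the name `parse_addr` and its return convention are assumptions.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parser for an address argument: decimal by default,
 * octal with an o/O suffix, hexadecimal with an h/H suffix.
 * Returns 0 and stores the value in *out on success, or -1 if the
 * text is malformed. Overflow is not checked. */
int parse_addr(const char *s, unsigned long *out)
{
    size_t len = strlen(s);
    unsigned base = 10;
    unsigned long value = 0;

    if (len == 0)
        return -1;

    char last = s[len - 1];
    if (last == 'h' || last == 'H') {
        base = 16;
        len--;
    } else if (last == 'o' || last == 'O') {
        base = 8;
        len--;
    }
    if (len == 0)
        return -1;               /* a suffix with no digits */

    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)s[i];
        unsigned digit;

        if (isdigit(c))
            digit = (unsigned)(c - '0');
        else if (isxdigit(c))
            digit = (unsigned)(tolower(c) - 'a') + 10;
        else
            return -1;
        if (digit >= base)
            return -1;           /* digit not valid in this base */
        value = value * base + digit;
    }
    *out = value;
    return 0;
}
```

With this routine the argument of `-C100H` parses to 256, the same value as the decimal text `256`.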

Note that because of the complexity of the executable
file formats for MS-DOS and CP/M-86, LINK will not produce
these (.EXE and .CMD respectively) formats directly. The
compiler automatically runs OBJTOHEX with appropriate options
to generate the correct file format.

The -S, -X and -Z options, which are meaningless when
the -C option is used, will strip respectively all symbols,
all local symbols or all trivial local symbols from the
output file. Trivial symbols are symbols produced by the
compiler, and have the form of one of a set of alphabetic
characters followed by a digit string.
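
A test for the shape of a trivial symbol might be sketched as follows. The manual does not say which alphabetic characters belong to the set, so this example accepts any ASCII letter; that choice, and the name `looks_trivial`, are assumptions.

```c
#include <ctype.h>

/* Returns 1 if `name` is a single letter followed by one or more
 * decimal digits, else 0.
 * Assumption: any ASCII letter is accepted as the leading character;
 * the real linker recognizes only a specific set of letters. */
int looks_trivial(const char *name)
{
    if (!isalpha((unsigned char)name[0]))
        return 0;
    if (name[1] == '\0')
        return 0;                /* at least one digit is required */
    for (const char *p = name + 1; *p != '\0'; p++) {
        if (!isdigit((unsigned char)*p))
            return 0;
    }
    return 1;
}
```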

The default output file name is l.obj, or l.bin when
the -C option is used. This may be overridden by the -Oname
option. The output file will be called name in this
instance. Note that no suffix is appended to the name; the
file will be called exactly the argument to the option.

For certain specialized applications, e.g. producing
code for an embedded microprocessor, it is necessary to
specify to the linker at what address the various psects
should be located. This is accomplished with the -P option.
It is followed by a specification consisting of a
comma-separated list of psect names, each with an optional
address specification. In the absence of an address
specification for a psect listed, it will be concatenated
with the previous psect. For example:

-Ptext=0c000h,data,bss=8000h


This will cause the text psect to be located at 0C000H,
the data psect to start at the end of the text psect, and
the bss psect to start at 8000H. This may be for a processor
with ROM at 0C000H and RAM at 8000H.
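
The way such a specification resolves to addresses can be sketched as below. The code assumes the size of each psect is already known, and it places a leading psect that has no address at 0; both points, along with the names `psect`, `parse_num` and `place_psects`, are assumptions for illustration rather than a description of the linker's internals. The load-address (slash) form is not handled here.

```c
#include <stdlib.h>
#include <string.h>

#define PSECT_NAME_LEN 31

struct psect {
    char name[PSECT_NAME_LEN + 1];
    unsigned long size;   /* input: bytes in this psect (assumed known) */
    unsigned long start;  /* output: address assigned by place_psects */
};

/* Parse a number with an optional h/H (hex) or o/O (octal) suffix. */
static unsigned long parse_num(const char *s, size_t len)
{
    char buf[32];
    int base = 10;

    if (len == 0 || len >= sizeof buf)
        return 0;
    if (s[len - 1] == 'h' || s[len - 1] == 'H') {
        base = 16;
        len--;
    } else if (s[len - 1] == 'o' || s[len - 1] == 'O') {
        base = 8;
        len--;
    }
    memcpy(buf, s, len);
    buf[len] = '\0';
    return strtoul(buf, NULL, base);
}

/* Walk a spec such as "text=0c000h,data,bss=8000h" and fill in the
 * start address of each named psect found in `table`. A psect with no
 * address starts at the end of the previous psect in the list.
 * Returns the number of psects placed, or -1 if a name is unknown. */
int place_psects(const char *spec, struct psect *table, size_t n)
{
    unsigned long next = 0;   /* end of the previously placed psect */
    int placed = 0;

    while (*spec != '\0') {
        size_t item_len = strcspn(spec, ",");
        size_t name_len = strcspn(spec, "=,");
        struct psect *p = NULL;

        for (size_t i = 0; i < n; i++) {
            if (strlen(table[i].name) == name_len &&
                strncmp(table[i].name, spec, name_len) == 0)
                p = &table[i];
        }
        if (p == NULL)
            return -1;

        if (name_len < item_len)          /* "name=addr" form */
            p->start = parse_num(spec + name_len + 1,
                                 item_len - name_len - 1);
        else                              /* bare name: concatenate */
            p->start = next;
        next = p->start + p->size;
        placed++;

        spec += item_len;
        if (*spec == ',')
            spec++;
    }
    return placed;
}
```

For the example specification, a text psect of 1234H bytes would put the data psect at 0D234H, while bss goes to 8000H regardless of the sizes of the other two.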

Where the link address, that is the address at which
the code will be addressed at execution time, and the load
address, that is the address offset within the output file,
are different (e.g. for the 8086) it is possible to specify
the load address separately from the link address. For
example:

-Ptext=100h/0,data=0C000h/


This specification will cause the text segment to be
linked for execution at 100h, but loaded in the output file
at 0, while the data segment will be linked for 0C000h, but
loaded contiguously with the text psect in the file. Note
that if the slash ('/') is omitted, the load address is the
same as the link address, while if the slash is supplied,
but not followed by an address, the psect will be loaded
after the previous psect.
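
The three cases (no slash, slash with an address, slash alone) can be captured in one function. This is a sketch under stated assumptions: `addr_pair`, `parse_num` and `resolve_addr` are hypothetical names, and the caller is assumed to know where the previous psect ended in the output file.

```c
#include <stdlib.h>
#include <string.h>

struct addr_pair {
    unsigned long link;  /* address used by the code at execution time */
    unsigned long load;  /* offset of the psect within the output file */
};

/* Parse a number with an optional h/H (hex) or o/O (octal) suffix. */
static unsigned long parse_num(const char *s, size_t len)
{
    char buf[32];
    int base = 10;

    if (len == 0 || len >= sizeof buf)
        return 0;
    if (s[len - 1] == 'h' || s[len - 1] == 'H') {
        base = 16;
        len--;
    } else if (s[len - 1] == 'o' || s[len - 1] == 'O') {
        base = 8;
        len--;
    }
    memcpy(buf, s, len);
    buf[len] = '\0';
    return strtoul(buf, NULL, base);
}

/* Resolve the address part of one psect entry: "link", "link/load"
 * or "link/". prev_load_end is the file offset at which the previous
 * psect ended. */
struct addr_pair resolve_addr(const char *spec, unsigned long prev_load_end)
{
    struct addr_pair r;
    const char *slash = strchr(spec, '/');

    if (slash == NULL) {                  /* no slash: load == link */
        r.link = parse_num(spec, strlen(spec));
        r.load = r.link;
    } else {
        r.link = parse_num(spec, (size_t)(slash - spec));
        if (slash[1] == '\0')             /* "link/": follow previous psect */
            r.load = prev_load_end;
        else                              /* "link/load": explicit offset */
            r.load = parse_num(slash + 1, strlen(slash + 1));
    }
    return r;
}
```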

In order to specify link and load addresses for local
psects, the class name to which the psects belong (as set by
the CLASS flag) may be used in place of a global psect name.
The local psects will then have a link address as specified
in the -P option, and load addresses incrementing upwards
from the specified load address.

The -Mname option requests a link map, containing
symbol table and module load address information, to be
written onto the file name. If name is omitted, the map will
be written to standard output. -W may be used to specify the
desired width of the map.

The -U option allows the specification to the linker of
a symbol which is to be initially entered into the symbol
table as undefined. This is useful when loading entirely
from libraries. More than one -U flag may be used.

If it is desired to use the debugger on the program
being linked, it is useful to produce a symbol file. The
-Dfile option will write such a symbol file onto the named
file, or l.sym if no file is given. The symbol file consists
of a list of addresses and symbols, one per line.

== Examples ==

Here are some examples of using the linker. Note,
however, that in the normal case it is not necessary to
invoke the linker explicitly, since it is invoked
automatically by the C command.

LINK -MMAP -C100H START.OBJ MAIN.OBJ A:LIBC.LIB


This command links the files start.obj and main.obj
with the library a:libc.lib. Only those modules that are
required from the library will in fact be linked in. The
output is to be in .COM format, placed in the default file
l.bin. A map is to be written to the file named map.
Note that the file start.obj should contain startup code,
and in fact the lowest address code in that file will be
executed when the program is run, since it will be at 100H.

LINK -X -R -OX.OBJ FILE1.OBJ FILE2.OBJ A:LIBC.LIB


The files file1.obj and file2.obj will be linked with
any necessary routines from a:libc.lib and left in the file
x.obj. This file will remain relocatable. Undefined symbols
will not cause an error. The file x.obj will probably later
be the object of another link invocation. All local symbols
will be stripped from the output file, thus saving space.

== Invoking the Linker ==

The linker is called LINK, and normally resides on the
A: drive, under CP/M, or in the directory A:\HITECH\ under
MS-DOS. It may be invoked with no arguments, in which case
it will prompt for input from standard input. If the
standard input is a file, no prompts will be printed. The
input supplied in this manner may contain lower case, whereas
CP/M converts the entire command line to upper case by
default. This is useful with the -U and -P options. This
manner of invocation is also useful if the number of
arguments to LINK is large. If the list of files is too long
to fit on one line, continuation lines may be included by
leaving a backslash ('\') at the end of the preceding line.
In this fashion, LINK commands of almost unlimited length may
be issued.
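
The continuation rule can be sketched as a routine that folds backslash-terminated lines into one logical command. How LINK itself treats the joined text is not documented here, so replacing the backslash and newline with a single space is an assumption, as is the name `join_continuations`.

```c
#include <stddef.h>
#include <string.h>

/* Copy `in` to `out` (capacity `out_cap` bytes, including the
 * terminating NUL), replacing every backslash that is immediately
 * followed by a newline, together with that newline, by one space.
 * Returns 0 on success, or -1 if `out` is too small. */
int join_continuations(const char *in, char *out, size_t out_cap)
{
    size_t o = 0;

    for (size_t i = 0; in[i] != '\0'; i++) {
        char c = in[i];

        if (c == '\\' && in[i + 1] == '\n') {
            c = ' ';
            i++;                 /* skip the newline as well */
        }
        if (o + 1 >= out_cap)
            return -1;           /* no room for c plus the NUL */
        out[o++] = c;
    }
    if (out_cap == 0)
        return -1;
    out[o] = '\0';
    return 0;
}
```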
