Specific Features

From HI-TECH C for CP/M Fan WIKI(EN)

The HI-TECH C compiler has a number of features which, while largely compatible with other C compilers, contribute to more reliable programming methods.

ANSI C Standard Compatibility

At the time of writing the Draft ANSI Standard for the C Language was at an advanced stage, though not yet an official standard. Accordingly it is not possible to claim compliance with that standard; however, HI-TECH C includes the majority of the new and altered features in the draft ANSI standard. Thus it is "ANSI compatible" in the sense that most people understand the term.

Type Checking

Previous C compilers have adopted a lax approach to type checking. This is typified by the Unix C compiler, which allows almost arbitrary mixing of types in expressions. The HI-TECH C compiler performs much stricter type checking, although in most cases only warning messages are issued, allowing compilation to proceed if the user knows that the errors are harmless. This would occur, for example, when an integer value was assigned to a pointer variable. The generated code would almost certainly be what the user intended; however, if in fact it represented an error in the source code, the user is prompted to check and correct it where necessary.

Member Names

In early C compilers, member names in different structures were required to be distinct except under certain circumstances. HI-TECH C, like most recent implementations of C, allows member names in different structures and unions to overlap. A member name is recognized only in the context of an expression whose type is that of the structure in which the member is defined. In practice this means that a member name will be recognized only to the right of a '.' or '->' operator, where the expression to the left of the operator is of type structure or pointer to structure the same as that in which the member name was declared. This not only allows member names to be re-used without conflict in more than one structure, it also permits strict checking of the usage of members; a common error with other C compilers is the use of a member name with a structure pointer of the wrong type, or worse, with a variable which is a pointer to a simple type.

There is however an escape from this, where the user desires to use as a structure pointer something which is not declared as such. This is via the use of a typecast. For example, suppose it is desired to access a memory-mapped i/o device, consisting of several registers. The declarations and use may look something like the code fragment in fig. 2.

struct io_dev
{
        short   io_status;    /* status */
        char    io_rxdata;    /* rx data */
        char    io_txdata;    /* tx data */
};

#define RXRDY   01           /* rx ready */
#define TXRDY   02           /* tx ready */

/* define the (absolute) device address */

#define DEVICE  ((struct io_dev *)0xFF00)

char c;
/* wait till transmitter ready */
while(!(DEVICE->io_status & TXRDY))
        continue;
/* send the data byte */
DEVICE->io_txdata = c;
Fig. 2. Use of Typecast on an Absolute Address

In this example, the device in question has a 16 bit status port, and two 8 bit data ports. The address of the device (i.e. the address of its status port) is given as (hex)0FF00. This address is typecast to the required structure pointer type to enable use of the structure member names. The code generated by this will use absolute memory references to access the device, as required.

Some examples of right and wrong usage of member names are shown in fig. 3.

Unsigned Types

HI-TECH C implements unsigned versions of all integral types, i.e. unsigned char, unsigned short, unsigned int and unsigned long. If an unsigned quantity is shifted right, the shift will be performed as a logical shift, i.e. bringing zeros into the leftmost (most significant) bits. Similarly, a right shift of a signed quantity will sign extend into the leftmost bits.
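
The two kinds of right shift can be sketched as follows. This is a minimal illustration assuming an 8 bit char; the function names are hypothetical and are not part of the HI-TECH C library. Note that standard C leaves the result of right shifting a negative signed value to the implementation, so the sign extension mentioned in the second comment is the behaviour described here for HI-TECH C rather than a portable guarantee.

```c
/* Logical right shift: zeros enter at the most significant end. */
unsigned char shift_unsigned(unsigned char v, int n)
{
    return (unsigned char)(v >> n);
}

/* Right shift of a signed value. HI-TECH C is described as sign
   extending; other compilers may differ for negative values. */
signed char shift_signed(signed char v, int n)
{
    return (signed char)(v >> n);
}
```

For instance, shift_unsigned(0x80, 1) yields 0x40, since a zero is brought into the top bit.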

Arithmetic Operations

On machines where arithmetic operations may be performed more efficiently in lengths shorter than int, operands shorter than int will not be extended to int length unless necessary.

For example, if two characters are added and the result stored into another character, it is only necessary to perform arithmetic in 8 bits, since any overflow into the top 8 bits will be lost. However, if the sum of two characters is stored into an int, the addition should be done in 16 bits to ensure the correct result.

struct fred
{
        char      a;
        int       b;
}     s1, * s2;

struct bill
{
        float   c;
        long    b;
}       x1, * x2;

/* wrong - c is not a member of fred */
s1.c = 2;

/* correct */
s1.a = 2;

/* wrong - s2 is a pointer */
s2.a = 2;

/* correct */
x2->b = 24L;

/* right, but note type conversion
   from long to int */
s2->b = x2->b;
Fig. 3. Examples of Member Usage

In accordance with the draft ANSI standard, operations on float rather than double quantities will be performed in the shorter precision rather than being converted to double precision then back again.
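
The difference between storing a character sum into a char and into an int can be sketched as below. This is an illustrative fragment only, assuming an 8 bit char; the function names are hypothetical. The observable results are the same on any conforming compiler, the point made above being that the first case can be computed entirely in 8 bits without changing the answer.

```c
/* Sum stored into a char: only the low 8 bits survive,
   so an 8 bit addition is sufficient. */
unsigned char add_to_char(unsigned char a, unsigned char b)
{
    return (unsigned char)(a + b);
}

/* Sum stored into an int: the carry out of bit 7 must be kept,
   so the addition needs the full int width. */
int add_to_int(unsigned char a, unsigned char b)
{
    return a + b;
}
```

With the arguments 200 and 100, add_to_char() gives 44 (300 modulo 256) while add_to_int() gives 300.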

Structure Operations

HI-TECH C implements structure assignments, structure arguments and structure-valued functions in their full generality. The example in fig. 4 is a function returning a structure. Some legal (and illegal) uses of the function are also shown.

struct bill
{
        char    a;
        int     b;
};

struct bill
afunc()
{
        struct bill     x;

        return x;
}

main()
{
        struct bill     a;

        a = afunc();            /* ok */
        pf("%d", afunc().a);    /* ok */

        /* illegal, afunc() cannot be assigned
           to, therefore neither can
           afunc().a */
        afunc().a = 1;

        /* illegal, same reason */
}
Fig. 4. Example of a Function Returning a Structure
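
As a further sketch of structure arguments and structure assignment, the fragment below passes a structure by value and copies one structure to another. The type struct point and both function names are invented for illustration and do not come from the HI-TECH C documentation.

```c
struct point
{
    int x;
    int y;
};

/* Structure argument and structure return value:
   p is a private copy of the caller's structure. */
struct point moved(struct point p, int dx, int dy)
{
    p.x += dx;
    p.y += dy;
    return p;
}

/* Structure assignment copies every member. */
struct point copy_of(struct point src)
{
    struct point dst;

    dst = src;
    return dst;
}
```

Because the argument is copied, a call such as moved(a, 10, 20) leaves the caller's a unchanged.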

Enumerated Types

HI-TECH C supports enumerated types; these provide a structured way of defining named constants.

The uses of enumerated types are more restricted than those allowed by the Unix C compiler, yet more flexible than permitted by LINT. In particular, an expression of enumerated type may be used to dimension arrays, as an array index or as the operand of a switch statement. Arithmetic may be performed on enumerated types, and enumerated type expressions may be compared, both for equality and with the relational operators. An example of the use of an enumerated type is given in fig. 5.

/* a represents 0, b -> 1 */
enum fred { a, b, c = 4 };

enum fred       x, y, z;

x = z;
if(x < z)
        x = (enum fred)3;
switch(z) {
case a:
case b:
        break;
}
Fig. 5. Use of an Enumerated Type

Initialization Syntax

Kernighan and Ritchie in "The C Programming Language" state that pairs of braces may be omitted from an initializer in certain contexts; the draft ANSI standard provides that a conforming C program must either include all braces in an initializer, or leave them all out. HI-TECH C allows any pairs of braces to be omitted provided that the front end of the compiler can determine the size of any arrays being initialized, and provided that there is no ambiguity as to which braces have been omitted. To avoid ambiguity, if any pairs of braces are present then any braces which would enclose those braces must also be present. The compiler will complain ("initialization syntax") if any ambiguity is present.
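
The two unambiguous forms can be sketched as follows: one initializer with every brace present and one with all inner braces omitted. The array and function names are illustrative only. The array bounds are given explicitly, so the size of each row is known to the compiler in both cases.

```c
/* Every pair of braces present. */
int full[2][3] = { { 1, 2, 3 }, { 4, 5, 6 } };

/* All inner braces omitted; the explicit bounds tell the
   compiler where each row ends. */
int flat[2][3] = { 1, 2, 3, 4, 5, 6 };

/* Returns 1 if both arrays hold the same values, 0 otherwise. */
int same_contents(void)
{
    int i, j;

    for (i = 0; i < 2; i++)
        for (j = 0; j < 3; j++)
            if (full[i][j] != flat[i][j])
                return 0;
    return 1;
}
```

Both declarations produce the same contents; some compilers may issue a warning about the missing braces in the second form.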

Function Prototypes

A new feature of C included in the proposed ANSI standard for C, known as "function prototypes", provides C with an argument checking facility, i.e. it allows the compiler to check at compile time that actual arguments supplied to a function invocation are consistent with the formal parameters expected by the function. The feature allows the programmer to include in a function declaration (either an external declaration or an actual definition) the types of the parameters to that function. For example, the code fragment in fig. 6 shows two function prototypes.

void fred(int, long, char *);

char *
bill(int a, short b, ...)
{
        return a;
}
Fig. 6. Function Prototypes

The first prototype is an external declaration of the function fred(), which accepts one integer argument, one long argument, and one argument which is a pointer to char. Any usage of fred() while the prototype declaration is in scope will cause the actual parameters to be checked for number and type against the prototype, e.g. if only two arguments were supplied or an integral value was supplied for the third argument the compiler would report an error.

In the second example, the function bill() expects two or more arguments. The first and second will be converted to int and short respectively, while the remainder (if present) may be of any type. The ellipsis symbol (...) indicates to the compiler that zero or more arguments of any type may follow the other arguments. The ellipsis symbol must be last in the argument list, and may not appear as the only argument in a prototype.

All prototypes for a function must agree exactly. However, it is legal for a definition of a function in the old style, i.e. with just the parameter names inside the parentheses, to follow a prototype declaration provided the number and type of the arguments agree. In this case it is essential that the function definition is in scope of the prototype declaration.

Access to unspecified arguments (i.e. arguments supplied where an ellipsis appeared in a prototype) must be via the macros defined in the header file <stdarg.h>. This defines the macros va_start, va_arg and va_end. See va_start in the library function listing for more information.
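
A minimal sketch of access to unspecified arguments is shown below, using the macros in the form standardised by ANSI C. The function sum_ints() is hypothetical, and the library function listing should be consulted for the exact form of the macros in a given HI-TECH C release.

```c
#include <stdarg.h>

/* Adds up `count` int arguments supplied after the fixed parameter. */
int sum_ints(int count, ...)
{
    va_list ap;
    int     total = 0;

    va_start(ap, count);        /* position ap at the first unnamed argument */
    while (count-- > 0)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) returns 6. The function has no way to discover how many arguments were actually passed, which is why the count is supplied as the fixed parameter.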

NOTE that it is a grave error to use a function which has an associated prototype unless that prototype is in scope, i.e. the prototype MUST be declared (possibly in a header file) before the function is invoked. Failure to comply with this rule may result in strange behaviour of the program. HI-TECH C will issue a warning message ("func() declared implicit int") whenever a function is called without an explicit declaration. It is good practice to declare all functions and global variables in one or more header files which are included wherever the functions are defined or referenced.

Void and Pointer to Void

The void type may be used to indicate to the compiler that a function does not return a value. Any usage of the return value from a void function will be flagged as an error.

The type void *, i.e. pointer to void, may be used as a "universal" pointer type. This is intended to assist in the writing of general purpose storage allocators and the like, where a pointer is returned which may be assigned to another variable of some other pointer type. The compiler permits without typecasting and without reporting an error the conversion of void * to any other pointer type and vice versa. The programmer is advised to use this facility carefully and ensure that any void * value is usable as a pointer to any other type, e.g. the alignment of any such pointer should be suitable for storage of any object.
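
The use of void * as a universal pointer can be sketched as follows. The allocator alloc_zeroed() is a hypothetical wrapper around the standard calloc() function, chosen here because calloc() returns storage suitably aligned for any object; make_counters() shows the result being assigned to an int pointer without a typecast.

```c
#include <stdlib.h>

/* Hypothetical general purpose allocator: returns zeroed storage
   as a void *, suitably aligned for any object type. */
void *alloc_zeroed(size_t count, size_t size)
{
    return calloc(count, size);
}

/* The void * result is converted to int * without a typecast. */
int *make_counters(size_t n)
{
    int *counters = alloc_zeroed(n, sizeof(int));

    return counters;
}
```

The conversion also works in the other direction: an int * may be assigned to a void * variable, or passed to free(), without a typecast.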

Type Qualifiers

The ANSI C standard introduced the concept of type qualifiers to C; these are keywords that qualify the type to which they are applied. The type qualifiers defined by ANSI C are const and volatile. HI-TECH C also implements several other type qualifiers. The extra qualifiers include:

fast interrupt

Not all versions of the compilers implement all of the extra qualifiers. See the machine dependent section for further information.

When constructing declarations using type qualifiers, it is very easy to be confused as to the exact semantics of the declaration. A couple of rules-of-thumb will make this easier. Firstly, where a type qualifier appears at the left of a declaration it may appear with any storage class specifier and the basic type in any order, e.g.

static void interrupt   func();

is semantically the same as

interrupt static void   func();

Where a qualifier appears in this context, it applies to the basic type of the declaration. Where a qualifier appears to the right of one or more '*' (star) pointer modifiers, then you should read the declaration from right to left, e.g.

char * far fred;

should be read as "fred is a far pointer to char". This means that fred is qualified by far, not the char to which it points. On the other hand,

char far * bill;

should be read as "bill is a pointer to a far char", i.e. the char to which bill points is located in the far address space. In the context of the 8086 compiler this will mean that bill is a 32 bit pointer while fred is a 16 bit pointer. You will hear bill referred to as a "far pointer"; however, the terminology "pointer to far" is preferred.
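
The same right-to-left reading applies to the standard qualifier const, which is used in the sketch below because far is not available on most other compilers. The function names are hypothetical. In the first function the qualifier sits to the left of the '*' and so applies to the char pointed to; in the second it sits to the right of the '*' and so applies to the pointer itself.

```c
/* "text is a pointer to const char": the chars are read-only,
   but the pointer itself may be changed. */
const char *skip_one(const char *text)
{
    text = text + 1;            /* allowed: the pointer is not const */
    return text;
}

/* "slot is a const pointer to char": the pointer is fixed,
   but the char it points at may be changed. */
void store_at(char * const slot, char value)
{
    *slot = value;              /* allowed: the pointee is not const */
}
```

Swapping the two operations, i.e. writing through text or reassigning slot, would be rejected by the compiler.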


In-line Assembler

There are two methods provided for in-line assembler code in C programs. The first allows several lines of assembler to be placed anywhere in a program, via the #asm and #endasm preprocessor directives. Any lines between these two directives will be copied straight through to the assembler file produced by the compiler. Alternatively, you can use the asm("string"); construct anywhere a C statement is expected. The string will be copied through to the assembler file. Care should be taken when using in-line assembler, since it may interact with compiler-generated code.

Pragma Directives

The draft ANSI C standard provides for a #pragma preprocessor directive that allows compiler implementations to control various aspects of the compilation process. Currently HI-TECH C only supports one pragma, the pack directive. This allows control over the manner in which members are allocated inside a structure. By default, some of the compilers (especially the 8086 and 68000 compilers) will align structure members onto even boundaries to optimize machine accesses. It is sometimes desired to override this to achieve a particular layout inside a structure. The pack pragma allows specification of a maximum packing factor. For example, #pragma pack(1) will instruct the compiler that no additional padding be inserted between structure members, i.e. that all members should be aligned on boundaries divisible by 1. Similarly #pragma pack(2) will allow alignment on boundaries divisible by 2. In no case will use of the pack pragma force a greater alignment than would have been used for that data type anyway.

More than one pack pragma may be used in a program. Any use will remain in force until changed by another pack or until the end of the file. Do not use a pack pragma before include files such as <stdio.h> as this will cause incorrect declarations of run-time library data structures.
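
The effect of the pack pragma can be sketched as below. The structure and function names are hypothetical. The header is included before the pragma, in line with the warning above. The empty #pragma pack() used to restore the default is an assumption that holds for several current compilers (for example gcc, clang and MSVC); it is not taken from the HI-TECH C documentation.

```c
#include <stddef.h>

struct natural_rec          /* laid out with the default alignment */
{
    char tag;
    long value;
};

#pragma pack(1)             /* no padding between members from here on */
struct packed_rec
{
    char tag;
    long value;
};
#pragma pack()              /* assumption: restores the default packing */

/* Offset of `value` within the packed layout. */
size_t packed_value_offset(void)
{
    return offsetof(struct packed_rec, value);
}
```

With a packing factor of 1 the member value immediately follows tag, so the packed structure occupies one byte more than a long, while the default layout may include padding after tag.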