Compiler Implementation Notes
This page contains implementation notes that may be of interest to compiler buffs who have downloaded the source code to play with. If you're just using the compiler to compile and run your own IMP programs, you can probably skip this, and just read the User Guide page instead.
The Intel IMP compiler is based on Peter Robertson's IMP-77 compiler, which uses three passes.
The first two passes are written in IMP, and were bootstrapped using a simplified IMP compiler (SKIMP) written in C. The third pass is written in C since it needs to be portable between different targets, all of which have C compilers, but not IMP compilers.
For the 16 bit compiler, the "small" memory model is used - that is, there is a single code segment of up to 64 kilobytes, and a single data segment, also of up to 64 kilobytes, containing all the logical data segments and the stack. This has the important advantage that all pointers are 16 bits, the same as the native word size. For the 32 bit compiler the "flat" memory model is used - that is, a single 4 Gbyte address space contains the code, data and stack areas. Again all pointers are the same size as native words.
The compiler uses logical segments within the physical address space so that the linker can keep related items together, and so that (where available) appropriate memory protection can be applied.
The logical segments are:
There is also some vestigial provision in the compiler for two further logical segments which are not used by the current implementation - BSS, for uninitialised static data and DISPLAY, which was used in the initial 8086 SKIMP compiler to hold the virtual display registers. Unititialised variables are actually initialised by the current Pass 2 to an unassigned pattern, and the display is held in the local stack frame.
The stack segment is allocated at run-time by the operating system. In UNIX systems, using the 32 bit flat memory model, the stack pointer is set to the top of the user address space and the stack segment is grown automatically by the operating system in response to page faults as the stack pointer moves down the store. Unfortunately both DOS and Windows do not offer an automatic stack segment, and instead a fixed size must be requested at link time - a source of some frustration, since over-running the fixed size leads to a memory fault and abrupt termination.
Adding run-time stack probes so that violation of the requested size can be detected is on my to-do list. It will not stop the program failing, but might give a better error message!
The biggest difference between this Intel compiler and a classic Robertson compiler is that this compiler defers very little. That is, when it receives an iCode directive it goes ahead and plants code - in a Robertson compiler Pass 2 collects pending operators on the Pass 2 operand stack and thus defers planting code until it knows what is coming next. This makes the Intel compiler much simpler, both in stack data structure and in internal operation. The primary disadvantage in working this way is that a Robertson compiler generally knows the destination for a CPU operation before planting the code, whereas the Intel compiler can get caught out, typically by having an intermediate result in the "wrong" register.
Partly mitigating against this, the compiler has a simple look-ahead to pick up obvious improvements (such as a = a + b - most math instructions in the Intel architecture can act directly on a memory location) and in practice the code generated has surprisingly few wasted MOV instructions to fix an earlier poor decision.
The Intel architecture has 8 registers - AX, BX, CX, DX, SI, DI, BP and SP. (In the 32 bit implementation, Intel call these EAX, EBX and so on to avoid ambiguity. In the compiler, there is no ambiguity, since the 32 bit compiler never uses the 16 bit form of the register set, and in the source code they are still referred to as AX, BX, etc. for commonality with the 16 bit compiler). Of these 8 registers, BP and SP have fixed roles (local name space base pointer, and stack pointer respectively) and are not available as general registers, leaving 6 general registers.
One of the key differences between the 16 bit and 32 bit architectures is in the combinations of registers that may be used in addressing memory. In the 16 bit architecture, only BX, SI and DI can be used as pointers to memory. The compiler therefore distinguishes between register allocations that will be used as pointers, and register allocations that will be used for math operations, and selects from the appropriate set accordingly. This incidentally leads to the classic "wrong register" problem identified earlier, where the array access A(X+Y+Z) will generally evaluate X+Y+Z in an accumulator (AX) and only then discover that it needs to move the result to a pointer register to complete the array access.
Although not strictly necessary, the 32 bit compiler maintains the same pointer register separation for compatibility, but unlike the 16 bit compiler it knows that it can use any register as a memory pointer - in the example above it will happily use the result in AX to index into the array.
Primitive support routines
IMP relies on support subroutines for many of the more complex operations, such as array access, string copy and concatenation, and certain math operations not supported by the CPU directly. In this implementation all the support routines are written in C for greater portability and maintainability. Although this means that all calls to primitive routines must use "proper" C-style stack based parameter passing rather than dedicated registers, performance seems more than adequate.
One of the language features that took some thought was the event mechanism. When a procedure invokes a signal, the primitive called must walk the stack to see if any procedure (including this one) is interested in catching this event. I was very reluctant to add any overhead in the procedure entry and exit code to maintain a stack of event handlers, and therefore the run-time library needs to search for them when an event is signalled.
By tracing back the stack, the event mechanism finds a list of program counter values of all the calling locations. It needs to map these into procedures, and then determine whether each procedure has a handler for this event type, and if so, what the entry point to the handler is. The solution is to include in the run-time data area a table of structures containing:
A word about bootstrapping
There have been many successful IMP compilers written over the years, and when embarking on this project a port of an existing compiler seemed the obvious way forward. Unfortunately there is a snag with this strategy - most IMP compilers were written in IMP. The original environment in which these compilers were developed was rich in IMP tools, making that a reasonable choice, but today none of these tools are readily accessible.
Nevertheless, the motivation to produce a "historically accurate" IMP compiler was strong, so some lateral thinking was required. The solution was to write a cheap and cheerful compiler for a subset of the IMP language which could be used to bootstrap a full IMP compiler. This simple compiler was based on the SKIMP compiler by D. J. Rees, but rewritten in C.
Why Skimp? I wanted something that would be quick to implement, and that would not require too much original thought. I still have the Skimp manual from my university days, which gives a pretty thorough description of how to construct a simple compiler, and more importantly I have listings of my own student work on Skimp. Of course Skimp is written in IMP, and the whole point of the exercise is that I don't have an IMP compiler, so I have re-implemented Skimp in C - hence "C Skimp".
The next step in the strategy has been to take that SKIMP compiler (which produced 16 bit 8086 code) and use it to develop a code generator (pass 2) for an IMP-77 compiler. Because the IMP-77 compiler was being developed in the 16 bit SKIMP environment, it made sense to re-use the same tools and libraries and target the IMP-77 compiler at the same 16 bit environment.
The final stage was to take that 16 bit Intel compiler and retarget it to the 32 bit architecture. From a code generation point of view that was relatively simple; the challenge was in producing object files compatible with a quite different set of tools.
There were also a few tricks needed to hand-craft intermediate versions of the compiler that would run on a 16 bit machine but output 32 bit literals, needed for the actual bootstrap of up-compiling the compiler.
The compiler is working in both 16 bit and 32 bit versions. The object file support for 16 bit OMF and 32 bit COFF is complete, and the ELF object file support is substantially complete.
There are still some areas that need to be tidied up. These include: