V++: An Instruction-Restructurable Processor Architecture

T. Arita, H. Takagi and M. Sowa

Proceedings of the Twenty-Seventh Annual Hawaii International Conference on System Sciences, Vol. I, pp. 398-407, 1994
Abstract
Microprocessor performance has increased dramatically through shorter cycle times and fewer average cycles per instruction. Further gains require extracting fine-grain parallelism. The advantage of VLIW (Very Long Instruction Word) processors is that the hardware does not need to check precedence relations or issue restrictions, because the compiler takes responsibility for finding the operations that can be issued together and packing them into a single instruction. This paper investigates an extension of the VLIW architecture called V++, which retains the ability of VLIW to exploit fine-grain parallelism effectively while introducing facilities for restructuring very long instruction words dynamically. V++ adopts two restructuring methods: predetermined restructuring, which delays certain operations on the basis of information generated by the compiler, and adaptive restructuring, which is controlled by a high-speed synchronization mechanism called the Ultimate barrier. As a result, unlike conventional VLIW architectures, V++ can markedly reduce code size, which tends to grow considerably on VLIW machines, and is robust against dynamic variation in operation latency. V++ nevertheless maintains the strengths of conventional VLIW processors: hardware simplicity, owing to its single VLIW instruction stream, and low execution-time overhead, owing to optimizing compilers. This paper describes the principle of instruction restructuring and sketches a processor architecture that implements the restructuring facilities. It also illustrates a compiler design for V++, describes barrier scheduling, and presents a preliminary analysis of the effects of instruction restructuring.