From: anton@mips.complang.tuwien.ac.at   
      
   EricP writes:   
   >I see the difference between CISC and RISC as in the micro-architecture,   
      
   But the microarchitecture is not an architectural criterion.   
      
   >changing from a single sequential state machine view to multiple concurrent   
   >machines view, and from Clocks Per Instruction to Instructions Per Clock.   
      
   People changed from talking about CPI to talking about IPC when CPI
   started to go below 1. That's mainly a distinction between
   single-issue and superscalar CPUs.
      
   >The monolithic microcoded machine, which covers 360, 370, PDP-11, VAX,   
   >386, 486 and Pentium, is like a single threaded program which   
   >operates sequentially on a single global set of state variables.   
   >While there is some variation and fuzziness around the edges,   
   >the heart of each of these are single sequential execution engines.   
      
   The same holds true for the MIPS R2000, the ARM1/2 (and probably many
   successors), probably early SPARCs and early HPPA CPUs, all of which
   are considered RISCs. Documents about them also talk about CPI.
      
   And the 486 is already pipelined and can execute straight-line code at
   1 CPI; the Pentium is superscalar, and can reach up to 2 IPC (in
   straight-line code).
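
   To make the CPI/IPC relation concrete, here is a small sketch. The
   function names are made up for illustration, and the cycle count is
   only a lower bound for ideal straight-line code (no stalls, no
   misses), not a model of any real 486 or Pentium.

```python
from fractions import Fraction

def ipc_from_cpi(cpi):
    """IPC is simply the reciprocal of CPI."""
    return 1 / Fraction(cpi)

def min_cycles(instructions, peak_ipc):
    """Lower bound on cycles for straight-line code at a given peak IPC
    (ceiling division; ignores stalls, cache misses and dependences)."""
    return -(-instructions // peak_ipc)

# Single-issue pipeline at 1 CPI vs. a 2-wide superscalar at 0.5 CPI.
print(ipc_from_cpi(1))                # 1
print(ipc_from_cpi(Fraction(1, 2)))   # 2
print(min_cycles(101, 1))             # 101
print(min_cycles(101, 2))             # 51
```

   Once the best case drops below one clock per instruction, quoting
   the reciprocal is just more convenient.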
      
   >One can take an Alpha ISA and implement it with a microcoded sequencer   
   >but that should not be called RISC   
      
   Alpha is a RISC architecture. So this hypothetical implementation   
   would certainly be an implementation of a RISC architecture.   
      
   >RISC changes that design to one like a multi-threaded program with   
   >messages passing between them called uOps, where the dynamic state   
   >of each instruction is mostly carried with the uOp message,   
   >and each thread does something very simple and passes the uOp on.   
   >Where global resources are required, they are temporarily dynamically   
   >allocated to the uOp by the various threads, carried with the uOp,   
   >and returned later when the uOp message is passed to the Retire thread.   
   >The Retire thread is the only one which updates the visible global state.   
      
   This does not sound like RISC vs. non-RISC at all, but like OoO   
   microarchitecture, and the contrast would be an in-order execution   
   microarchitecture. Both RISCs and non-RISCs can make use of OoO   
   microarchitectures, and have done so.   
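
   A minimal sketch of the in-order retirement idea from the quoted
   text, to show that nothing in it depends on the instruction set.
   The function name, the (dest, value) instruction format and the
   completion_order argument are hypothetical; register renaming,
   exceptions and branch misprediction are left out.

```python
from collections import deque

def run_with_in_order_retire(program, completion_order):
    """program: list of (dest_reg, value) in program order.
    completion_order: instruction indices in the order they finish executing.
    Returns (architectural registers, retirement order of instruction indices)."""
    # Entries are allocated in program order, like a reorder buffer.
    entries = [{"idx": i, "dest": d, "value": v, "done": False}
               for i, (d, v) in enumerate(program)]
    rob = deque(entries)
    regs = {}          # architecturally visible state
    retired = []
    for idx in completion_order:
        entries[idx]["done"] = True          # execution may finish out of order
        # Only the retire step updates visible state, and only from the head.
        while rob and rob[0]["done"]:
            e = rob.popleft()
            regs[e["dest"]] = e["value"]
            retired.append(e["idx"])
    return regs, retired

program = [("r1", 10), ("r2", 20), ("r1", 30)]
regs, retired = run_with_in_order_retire(program, [2, 0, 1])
print(retired)   # [0, 1, 2]
print(regs)      # {'r1': 30, 'r2': 20}
```

   The instructions here could just as well be uOps cracked from a
   non-RISC instruction set; the retirement logic does not care.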
      
   >The RISC design guidelines described by various papers, rather than   
   >go/no-go decisions, are mostly engineering compromises for consideration   
   >of things which would make an MST-MPA more expensive to implement or   
   >otherwise interfere with maximizing the active concurrency of all threads.   
      
   The interesting aspect is that RISCs are easier to implement in simple
   pipelines like the ones of early ARM, HPPA, MIPS and SPARC
   implementations, but can also be implemented as in-order superscalar
   or OoO superscalar microarchitectures; you can also implement them as
   a sequentially executed microcode engine. Wolfgang Kleinert
   implemented a microcoded RISC in the 1980s, but I think that it was
   pipelined.
      
   The advantages from the instruction set diminish with the more complex
   implementation techniques. There are also a number of instruction set
   design decisions in early RISCs that turned out to be not so great and
   that were eliminated in later RISCs (if not from the start), most
   notably delayed branches. Still, many of the recent instruction sets
   (ARM A64, RISC-V) make many of the same design decisions as the RISC
   architectures of the 1980s (load/store, register architecture, etc.;
   see John Mashey's criteria and recent discussions about this topic),
   whereas many non-RISCs deviate from this design style.
      
   >This is why I think it would have been possible to build a risc-style   
   >PDP-11 in 1975 TTL, or a VAX if they had just left the instructions of   
   >the same complexity as PDP-11 ISA (53 opcodes, max one immediate,   
   >max one mem op per instruction).   
      
   The PDP-11 instruction set is not RISC, and you paint a picture that   
   is too rosy: It has up to two mem ops per instruction, and IIRC even   
   memory-indirect addressing modes. Not a problem for the   
   physically-addressed first implementations, nasty as soon as you add   
   virtual memory.   
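
   To illustrate why this gets nasty with virtual memory, here is a
   rough sketch that counts how many distinct memory locations a
   two-operand PDP-11 instruction can touch, depending on the
   addressing modes. The mode numbers follow the PDP-11 encoding; the
   table layout and the function name are made up for this example.
   It is a simplified model: it ignores the PC-based special cases
   (e.g. immediate is mode 2 with R7), alignment, and the fact that
   the destination may be both read and written.

```python
# mode -> (name, data locations touched, extra instruction words fetched)
MODES = {
    0: ("register",               0, 0),
    1: ("register deferred",      1, 0),
    2: ("autoincrement",          1, 0),
    3: ("autoincrement deferred", 2, 0),  # pointer in memory, then operand
    4: ("autodecrement",          1, 0),
    5: ("autodecrement deferred", 2, 0),  # pointer in memory, then operand
    6: ("index",                  1, 1),  # index word follows the instruction
    7: ("index deferred",         2, 1),  # index word, pointer, then operand
}

def locations_touched(src_mode, dst_mode):
    """Return (data locations, instruction words) for a two-operand
    instruction such as ADD src,dst under the simplified model above."""
    data = MODES[src_mode][1] + MODES[dst_mode][1]
    insn_words = 1 + MODES[src_mode][2] + MODES[dst_mode][2]
    return data, insn_words

print(locations_touched(0, 0))   # (0, 1)  register to register
print(locations_touched(1, 1))   # (2, 1)  two memory operands
print(locations_touched(7, 7))   # (4, 3)  both operands index deferred
```

   Each of those locations can sit on a different page, so each can
   fault, and the autoincrement/autodecrement side effects have to be
   undone or tracked when restarting the instruction.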
      
   Building a pipelined implementation of the PDP-11 (like the 486 was
   for IA-32) would have been quite a bit harder than building the 486
   (admittedly the 486 has to deal with 16-bit modes and other legacy
   features, so it's not the easiest target, either).
      
   For the VAX I would go for a RISC instead of a cleaned-up IA-32-like
   instruction set, and then implement pipelining. I would rather put
   the effort into implementing compressed instructions than into
   load-and-op or RMW instructions.
      
   - anton   
   --   
   'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'   
    Mitch Alsup,    
      