From: terje.mathisen@tmsw.no   
      
   Thomas Koenig wrote:   
   > Terje Mathisen wrote:
   >> Anton Ertl wrote:   
   >>> EricP writes:   
   >>>> I see the difference between CISC and RISC as in the micro-architecture,   
   >>>   
   >>> But the microarchitecture is not an architectural criterion.   
   >>>   
   >>>> changing from a single sequential state machine view to multiple
   >>>> concurrent machines view, and from Clocks Per Instruction to
   >>>> Instructions Per Clock.
   >>>   
   >>> People changed from talking CPI to IPC when CPI started to go below 1.   
   >>> That's mainly a distinction between single-issue and superscalar CPUs.   
   >>>   
   >>>> The monolithic microcoded machine, which covers 360, 370, PDP-11, VAX,   
   >>>> 386, 486 and Pentium, is like a single threaded program which   
   >>>> operates sequentially on a single global set of state variables.   
   >>>> While there is some variation and fuzziness around the edges,   
   >>>> the heart of each of these are single sequential execution engines.   
   >>>   
   >>> The same holds true for the MIPS R2000, the ARM1/2 (and probably many   
   >>> successors), probably early SPARCs and early HPPA CPUs, all of which   
   >>> are considered as RISCs. Documents about them also talk about CPI.   
   >>>   
   >>> And the 486 is already pipelined and can perform straight-line code at   
   >>> 1 CPI; the Pentium is superscalar, and can have up to 2 IPC (in   
   >>> straight-line code).   
   >>   
   >> Maybe relevant:   
   >>   
   >> Performance optimizers writing asm regularly hit that 1 IPC on the 486   
   >> and (with more difficulty) 2 IPC on the Pentium.   
   >>   
   >> When we did get there, the final performance was typically 3X compiled C   
   >> code.   
   >>   
   >> That 3X gap almost went away (maybe 1.2 to 1.5X for many algorithms) on   
   >> the PPro and later OoO CPUs.   
   >   
   > And then came back with SIMD, I presume? :-)   
      
   Sure!   
      
   I typically got 3X SIMD speedup from 4-way processing, years before any   
   compilers were able to autovectorize to again partly close the gap.   
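
   As a rough illustration of the figures above (not from the original
   thread): a 3X speedup from 4-way SIMD is 75% of the ideal. The function
   names below are hypothetical, and using Amdahl's law to explain the
   shortfall is an assumed model, not something measured here.

```python
def lane_efficiency(speedup, lanes):
    """Fraction of the ideal `lanes`-way speedup actually achieved."""
    return speedup / lanes


def amdahl_vector_fraction(speedup, lanes):
    """Fraction f of the original run time that would have to vectorize
    perfectly for Amdahl's law, 1 / ((1 - f) + f / lanes), to yield
    the observed `speedup`. Assumed model, for illustration only."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / lanes)


def amdahl_speedup(f, lanes):
    """Amdahl's law: overall speedup when a fraction f runs `lanes` times faster."""
    return 1.0 / ((1.0 - f) + f / lanes)


if __name__ == "__main__":
    # Figures from the post: 3X speedup from 4-way processing.
    print(lane_efficiency(3.0, 4))
    print(amdahl_vector_fraction(3.0, 4))
```

   Under that assumption, about 8/9 of the original run time would have to
   be perfectly vectorized to end up at 3X, with the remainder spent on
   scalar setup, shuffles, and loop overhead.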
      
   Terje   
      
   --   
   "almost all programming can be viewed as an exercise in caching"   
      