comp.arch
Anton Ertl to Waldek Hebisch
Re: VAX
12 Aug 25 15:59:32
   From: anton@mips.complang.tuwien.ac.at   
      
   antispam@fricas.org (Waldek Hebisch) writes:   
   >The basic question is if VAX could afford the pipeline.   
      
   VAX 11/780 only performed instruction fetching concurrently with the   
   rest (a two-stage pipeline, if you want).  The 8600, 8700/8800 and   
   NVAX applied more pipelining, but CPI remained high.   
      
   VUPs  MHz   CPI    Machine
     1    5    10     11/780
     4   12.5   6.25  8600
     6   22.2   7.4   8700
    35   90.9   5.1   NVAX+
      
   SPEC92   MHz  VAX CPI  Machine
     1/1      5  10/10    VAX 11/780
   133/200  200   3/2     Alpha 21064 (DEC 7000 model 610)
      
   VUPs and SPEC numbers from   
   .   
      
   The 10 CPI (cycles per instruction) of the VAX 11/780 is anecdotal.
   The other CPIs are computed from VUP/SPEC and MHz numbers; all of that
   is probably somewhat off (due to the anecdotal base being off), but if
   you relate them to each other, the error cancels out.
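
   A minimal sketch of how the CPI column can be derived, assuming
   (as above) that the 11/780 runs at 5 MHz with 10 CPI and counts as
   1 VUP.  The function and variable names are made up for
   illustration; the results match the table to within rounding.

```python
# Assumed baseline: VAX 11/780 at 5 MHz, 1 VUP, anecdotal 10 CPI.
BASE_MHZ = 5.0
BASE_CPI = 10.0

def derived_cpi(vups, mhz):
    """CPI implied by a clock rate and a performance relative to the 11/780.

    Instructions/second = MHz / CPI, so for a machine that is `vups` times
    as fast as the baseline: CPI = BASE_CPI * (mhz / BASE_MHZ) / vups.
    """
    return BASE_CPI * (mhz / BASE_MHZ) / vups

# (machine, VUPs, MHz) taken from the table above
machines = [("11/780", 1, 5.0), ("8600", 4, 12.5),
            ("8700", 6, 22.2), ("NVAX+", 35, 90.9)]

for name, vups, mhz in machines:
    print(f"{name:8s} {derived_cpi(vups, mhz):6.2f}")
```

   Because every row is scaled by the same baseline, a wrong
   baseline CPI changes all rows by the same factor and leaves their
   ratios untouched.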
      
   Note that the NVAX+ was made in the same process as the 21064, the
   21064 has about twice the clock rate, and has 4-6 times the performance,
   resulting not just in a lower native CPI, but also in a lower "VAX
   CPI" (the CPI a VAX would have needed to achieve the same performance
   at this clock rate).
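
   The same arithmetic gives the NVAX+ vs. 21064 comparison.  This is
   only a sketch: it treats VUPs and the SPEC92 figures as the same
   kind of "times an 11/780" number, as the tables do, and the names
   are hypothetical.

```python
# Figures from the tables above; the 11/780 baseline (5 MHz, 10 CPI) is anecdotal.
NVAX_MHZ, NVAX_VUPS = 90.9, 35
ALPHA_MHZ = 200.0
ALPHA_SPEC = (133, 200)   # the two SPEC92 figures listed as 133/200

def vax_equivalent_cpi(perf, mhz, base_mhz=5.0, base_cpi=10.0):
    """CPI a VAX would need at `mhz` to be `perf` times as fast as an 11/780."""
    return base_cpi * (mhz / base_mhz) / perf

clock_ratio = ALPHA_MHZ / NVAX_MHZ
perf_ratios = [s / NVAX_VUPS for s in ALPHA_SPEC]
alpha_vax_cpis = [vax_equivalent_cpi(s, ALPHA_MHZ) for s in ALPHA_SPEC]

print(f"clock ratio 21064/NVAX+: {clock_ratio:.1f}")
print("performance ratios:     ", [round(r, 1) for r in perf_ratios])
print("21064 'VAX CPI':        ", [round(c, 1) for c in alpha_vax_cpis])
```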
      
   >I doubt that they could afford 1-cycle multiply   
      
   Yes, one might do a multiplier and divider with its own sequencer (and
   more sophisticated in later implementations), with any user of the
   result stalling the pipeline until that is complete, and any
   following user of the multiplier or divider stalling the pipeline
   until it is free again.
      
   The idea of providing multiply-step instructions and using a bunch of
   them was short-lived: the MIPS R2000 already included a multiply
   instruction (with its own sequencer), and HPPA has had multiply-step
   as well as an FPU-based multiply from the start.  The idea of avoiding
   divide instructions had a longer life.  MIPS has had divide right from
   the start, but Alpha and even IA-64 avoided it.  RISC-V includes divide
   in the M extension that also gives multiply.
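
   To illustrate the difference, here is a generic radix-2
   shift-and-add step, one bit per step.  It is an illustrative
   sketch, not the exact semantics of any particular ISA's
   multiply-step instruction; mul_step and multiply32 are made-up
   names.  With multiply-step instructions the software issues the
   steps itself; a multiplier with its own sequencer runs the same
   kind of loop in hardware while the pipeline stalls on the result.

```python
MASK32 = 0xFFFFFFFF

def mul_step(acc, multiplicand, multiplier):
    """One hypothetical multiply step: conditionally add, then shift both operands."""
    if multiplier & 1:
        acc = (acc + multiplicand) & MASK32
    return acc, (multiplicand << 1) & MASK32, multiplier >> 1

def multiply32(a, b):
    """Low 32 bits of the unsigned product a*b, using 32 multiply steps."""
    acc = 0
    a &= MASK32
    b &= MASK32
    for _ in range(32):          # 32 steps: one per multiplier bit
        acc, a, b = mul_step(acc, a, b)
    return acc

print(multiply32(6, 7))
```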
      
   >or   
   >even a barrel shifter.   
      
   Five levels of 32-bit 2->1 muxes might be doable, but would that be
   cost-effective?
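
   For reference, a software model of the five-level structure: level
   k either passes its input through or shifts it by 2^k, selected by
   bit k of the shift amount.  This is a sketch of a left
   logical shifter only; the function name is made up.

```python
MASK32 = 0xFFFFFFFF

def barrel_shift_left(x, amount):
    """32-bit logical left shift built from five levels of 2->1 muxes."""
    x &= MASK32
    for level in range(5):                     # shifts by 1, 2, 4, 8, 16
        shifted = (x << (1 << level)) & MASK32
        select = (amount >> level) & 1         # mux control: bit `level` of amount
        x = shifted if select else x           # models 32 parallel 2->1 muxes
    return x

print(hex(barrel_shift_left(0x00000001, 31)))
```

   Each level corresponds to 32 two-input muxes, so the whole shifter
   is 160 of them plus wiring.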
      
   >It is accepted in this era that using more hardware could   
   >give substantial speedup.  IIUC IBM used quadratic rule:   
   >performance was supposed to be proportional to square of   
   >CPU price.  That was partly marketing, but partly due to   
   >compromises needed in smaller machines.   
      
   That's more of a 1960s thing, probably because low-end S/360   
   implementations used all (slow) tricks to minimize hardware.  In the   
   VAX 11/780 environment, I very much doubt that it is true.  Looking at   
   the early VAXen, you get the 11/730 with 0.3 VUPs up to the 11/784   
   with 3.5 VUPs (from 4 11/780 CPUs).  sqrt(3.5/0.3)=3.4.  I very much   
   doubt that you could get an 11/784 for 3.4 times the price of an   
   11/730.   
      
   Searching a little, I find   
      
   |[11/730 is] to be a quarter the price and a quarter the performance of   
   |a grown-up VAX (11/780)   
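
   The arithmetic behind this comparison, as a small sketch.  It
   assumes the quadratic rule means performance proportional to
   price squared; the function names are hypothetical.

```python
import math

def price_ratio_under_quadratic_rule(perf_low, perf_high):
    """If performance ~ price**2, the price ratio is the square root of the performance ratio."""
    return math.sqrt(perf_high / perf_low)

def perf_ratio_under_quadratic_rule(price_ratio):
    """Performance ratio the quadratic rule predicts for a given price ratio."""
    return price_ratio ** 2

# 11/730 (0.3 VUPs) vs. 11/784 (3.5 VUPs), figures from the text above
print(round(price_ratio_under_quadratic_rule(0.3, 3.5), 1))

# A quarter of the price would predict 1/16 of the performance,
# whereas the quoted claim is a quarter of the performance.
print(perf_ratio_under_quadratic_rule(0.25))
```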
      
      
   - anton   
   --   
   'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'   
     Mitch Alsup,    
      