From: already5chosen@yahoo.com   
      
   On Tue, 06 Jan 2026 19:35:34 GMT   
   MitchAlsup wrote:   
      
   > Michael S posted:   
   >   
   > > On Tue, 06 Jan 2026 17:59:33 GMT   
   > > MitchAlsup wrote:   
   > >   
   > > > Terje Mathisen posted:   
   > > >   
   > > > > Michael S wrote:   
   > > > > > On Tue, 6 Jan 2026 12:35:20 +0100   
   > > > > > Terje Mathisen wrote:   
   > > > > >   
   > > > > >> Thomas Koenig wrote:   
   > > > > >>> Michael S schrieb:   
   > > > > >>>> On Sun, 4 Jan 2026 00:21:31 -0000 (UTC)   
   > > > > >>>> Thomas Koenig wrote:   
   > > > > >>>>   
   > > > > >>>>>   
   > > > > >>>>> And two decimal flavors, as well, with binary and densely   
   > > > > >>>>> packed decimal encoding of the significand... it's a bit   
   > > > > >>>>> of a mess.   
   > > > > >>>>   
   > > > > >>>> Since both formats have exactly identical semantics, in   
   > > > > >>>> theory the mess is not worse (and not better) than two   
   > > > > >>>> bytes orders of IEEE binary FP.   
   > > > > >>>   
   > > > > >>> Almost.   
   > > > > >>>   
   > > > > >>> IIRC, there is no restriction on the binary mantissa, so its   
   > > > > >>> range is slightly larger for the same number of bits   
   > > > > >>> (1000/1024)**(n/3).   
   > > > > >>>   
   > > > > >> Sorry, that's wrong:   
   > > > > >>   
   > > > > >> Just like the 24 "spare" DPD patterns are illegal,   
   > > > > >   
   > > > > > Non-canonical, which is not the same as illegal   
   > > > > > Silently accepted as input operands but never produced as   
   > > > > > result.   
   > > > > >> any mantissa   
   > > > > >> corresponding to a number greater than the maximum allowed   
   > > > > >> (1e34 afair) is also illegal, and there are rules for how to   
   > > > > >> handle both cases (without checking, i seem to remember that   
   > > > > >> they should be treated as zero?)   
   > > > > >>   
   > > > > >   
   > > > > > BID significand extension > max is indeed treated as zeros.   
   > > > > > Non-canonical DPD declets have non-zero values.   
   > > > > > They are forms of (8+c)*100 + (8+f)*10 + (8+i), where c, f,   
   > > > > > and i are in range [0:1].   
   > > > >   
   > > > > OK, that is probably because allowing them on input is   
   > > > > significantly faster/cheaper than having to detect and   
   > > > > modify/trap/erase.   
   > > >   
   > > > With the calculation latencies of IBM Z-series, modify/trap/erase   
   > > > is of no problem.   
   > > >   
   > >   
   > > How do you know calculation latencies of IBM Z-series?   
   > > Did they made an information public?   
   >   
   > 9-15 months ago there was a presentation of their latest mainframe   
   > showing the pipeline lengths.   
   >   
   > Decode was on the order of 20 cycles, down from the top left;   
   > execute was horizontal across the middle;   
   > Retire was on the order of 12 cycles, down from the top right;   
   >   
      
   Twenty years ago Intel had pipelines of comparable depth
   (IIRC, ~35 cycles in the 3rd and 4th generations of Pentium 4). Despite
   that, the latency of simple ALU ops was 1 clock. The latency of an L1D
   hit was 4 clocks: long for 2005, but standard today. Latencies of FMUL
   and FADD were 7 and 5 clocks, respectively: long, but not
   extraordinary.
      
   IBM's own POWER6, 18 years ago, had an integer pipeline close to 30
   stages and an FP pipeline of around 35 stages. However, FP MUL/ADD/FMA
   latency was 6 or 7 clocks.
      
   I would expect similar or shorter latency figures for BFP on modern IBM
   z. Likely shorter, because today they have far more silicon to throw
   at various bypasses.
   In the case of DFP I don't want to guess, because I have no basis for
   guessing.
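
For anyone who wants to check the claim quoted above about non-canonical DPD declets, here is a minimal Python sketch. It assumes the usual declet decoding table as I recall it from IEEE 754-2008, with the ten bits named p q r s t u v w x y from most to least significant. The function names decode_declet and is_canonical are my own, not from any library.

```python
def decode_declet(bits: int) -> int:
    """Decode one 10-bit DPD declet into an integer in 0..999."""
    if not 0 <= bits < 1024:
        raise ValueError("a declet is exactly 10 bits")
    # Bit names follow the common description: p is bit 9, y is bit 0.
    p, q, r, s, t, u, v, w, x, y = ((bits >> k) & 1 for k in range(9, -1, -1))
    pq, st, wx = 4 * p + 2 * q, 4 * s + 2 * t, 4 * w + 2 * x
    if v == 0:                    # three small digits (0..7 each)
        d2, d1, d0 = pq + r, st + u, wx + y
    elif (w, x) == (0, 0):        # only the right digit is 8 or 9
        d2, d1, d0 = pq + r, st + u, 8 + y
    elif (w, x) == (0, 1):        # only the middle digit is 8 or 9
        d2, d1, d0 = pq + r, 8 + u, st + y
    elif (w, x) == (1, 0):        # only the left digit is 8 or 9
        d2, d1, d0 = 8 + r, st + u, pq + y
    elif (s, t) == (0, 0):        # left and middle digits are 8 or 9
        d2, d1, d0 = 8 + r, 8 + u, pq + y
    elif (s, t) == (0, 1):        # left and right digits are 8 or 9
        d2, d1, d0 = 8 + r, pq + u, 8 + y
    elif (s, t) == (1, 0):        # middle and right digits are 8 or 9
        d2, d1, d0 = pq + r, 8 + u, 8 + y
    else:                         # all three digits are 8 or 9; p, q are ignored
        d2, d1, d0 = 8 + r, 8 + u, 8 + y
    return 100 * d2 + 10 * d1 + d0


def is_canonical(bits: int) -> bool:
    """A declet is non-canonical when s, t, v, w, x are all 1 and (p, q) != (0, 0)."""
    all_large = (bits & 0b0001101110) == 0b0001101110
    return not (all_large and (bits >> 8) != 0)


non_canonical = [b for b in range(1024) if not is_canonical(b)]
print(len(non_canonical))
print(sorted({decode_declet(b) for b in non_canonical}))
```

Under that table the non-canonical patterns are the ones where all three digits are 8 or 9 and the two top bits are not both zero. They decode to the same values as the canonical patterns with those two bits cleared, which is consistent with "accepted as input, never produced as result".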
      