   comp.arch      Apparently more than just beeps & boops      131,241 messages   

   Message 129,481 of 131,241   
   MitchAlsup to All   
   Re: What I did on my summer vacation   
   23 Aug 25 22:44:42   
   
   From: user5857@newsgrouper.org.invalid   
      
   BGB  posted:   
      
   > On 8/23/2025 10:11 AM, Terje Mathisen wrote:   
   > > BGB wrote:   
   -------------   
   > >   
   > > Mitch and I have repeated this too many times already:   
   > >   
   > > If you are implementing a current-standards FPU, including FMAC support,   
   > > then you already have the very wide normalizer which is the only   
   > > expensive item needed to allow zero-cycle denorm cost.   
   > >   
   >   
   > Errm, no single-rounded FMA in my case, as single rounded FMA (for   
   > Binary64) would also require Trap-and-Emulate...   
   >   
   > But, yeah, Free if you have FMA, is not the same as FMA being free.   
   >   
   > Partial issue is that single rounded FMA would effectively itself have   
   > too high of cost (and an FMA unit would require higher latency than   
   > separate FMUL and FADD units).   
      
   FMA latency < (FMUL + FADD) latency   
   FMA latency >= FMUL latency   
   FMA latency >= FADD latency   
      
   > Ironically, what FMA operations exist tend to be slower for Binary32 ops   
   > than using separate MUL and ADD ops in the default (non-IEEE) mode.   
   > Though for Binary64, it would be slightly faster, though still   
   > double-rounded-ish. They can mimic Single-Rounded behavior with Binary32   
   > and Binary16 though mostly for sake of internally operating on Binary64.   
      
   You must accept that::   
      
        FMA   Rd,Rs1,Rs2,Rs3   
        FSUB  Re,Rd,Rs3   
      
   leaves all the proper bits in Re; whereas you cannot even argue::   
      
       FMUL   Rd,Rs1,Rs2   
       FADD   Re,Rd,Rs3   
    FSUB   Re,Re,Rs3   
      
   leaves all the proper bits in Re !! in all cases !!   
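      
   A minimal sketch of the rounding difference behind this point, written in
   Python rather than for any particular ISA. It models a single-rounded FMA
   with exact rational arithmetic and compares it against a separately rounded
   multiply and add on Binary64. The helper names `fma_exact` and
   `mul_then_add` are made up for illustration, and the operands are chosen so
   that the exact product is 1 - 2^-60, which a standalone FMUL rounds to 1.0.
   It does not reproduce the instruction sequences above, only the single
   versus double rounding they rest on.

```python
from fractions import Fraction


def fma_exact(a: float, b: float, c: float) -> float:
    """Model of a single-rounded FMA (finite inputs only).

    a*b + c is formed exactly as a rational number and rounded once
    when converted back to a Binary64 float.
    """
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


def mul_then_add(a: float, b: float, c: float) -> float:
    """Separate FMUL and FADD: the product is rounded before the add."""
    product = a * b      # first rounding
    return product + c   # second rounding


# Operands picked so the exact product is 1 - 2**-60.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

fused = fma_exact(a, b, c)
split = mul_then_add(a, b, c)

print("fused:", fused.hex())   # -2**-60 survives the single rounding
print("split:", split.hex())   # the rounded product was 1.0, so this is 0.0
```

   With these inputs the fused result keeps the -2^-60 term, while the split
   sequence returns 0.0 because the low bits of the product were already gone
   before the add.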
      

