From: user5857@newsgrouper.org.invalid   
      
   BGB posted:   
      
   > Well, idea here is that sometimes one wants to be able to do   
   > floating-point math where accuracy is a very low priority.   
   >   
   > Say, the sort of stuff people might use FP8 or BF16 or maybe Binary16   
   > for (though, what I am thinking of here is low-precision even by   
   > Binary16 standards).   
      
   For 8-bit formats, just use 5 lookup tables in memory [256×256 entries each].
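
   A minimal sketch of what one such table could look like. The post does
   not say which 8-bit format or which five operations are meant, so the
   1-4-3 layout (sign, exponent, mantissa), the bias of 7, the absence of
   Inf/NaN codes, and the names decode, encode and MUL are all assumptions
   made for illustration.

```python
import bisect

def decode(code):
    """Decode a hypothetical 1-4-3 minifloat (bias 7, no Inf/NaN codes)."""
    sign = -1.0 if code & 0x80 else 1.0
    exp = (code >> 3) & 0xF
    man = code & 0x7
    if exp == 0:
        return sign * (man / 8.0) * 2.0 ** -6      # subnormal range
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 7)

# Positive codes 0..127 decode to strictly increasing values.
POS = [decode(c) for c in range(128)]

def encode(x):
    """Round to the nearest representable value, saturating at the maximum."""
    sign = 0x80 if x < 0 else 0
    mag = abs(x)
    i = bisect.bisect_left(POS, mag)
    if i >= 128:
        return sign | 127                          # saturate instead of overflowing
    if i > 0 and mag - POS[i - 1] <= POS[i] - mag:
        i -= 1                                     # ties go to the smaller magnitude
    return sign | i

# One 256x256 table per operation; multiplication shown here.
MUL = [[encode(decode(a) * decode(b)) for b in range(256)] for a in range(256)]

print(decode(MUL[encode(2.0)][encode(1.5)]))       # 3.0
```

   At run time the whole operation is a single indexed load, MUL[a][b],
   and the table costs 64 KiB per operation.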
      
   > But, will use Binary16 and BF16 as the example formats.   
   >   
   > So, can note that one can approximate some ops with modified integer   
   > ADD/SUB (excluding sign-bit handling):   
   > a*b : A+B-0x3C00 (0x3F80 for BF16)   
   > a/b : A-B+0x3C00   
   > sqrt(a): (A>>1)+0x1E00   
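
   For reference, the quoted Binary16 formulas can be tried out directly on
   the bit patterns. This is only a sketch: the helper names are made up
   here, inputs are assumed positive and normal, and there is no handling
   of sign, overflow, underflow, Inf or NaN.

```python
import struct

def f16_bits(x):
    """Bit pattern of x as an IEEE 754 Binary16 value."""
    return struct.unpack("<H", struct.pack("<e", x))[0]

def bits_f16(b):
    """Binary16 value for a 16-bit pattern (extra bits are masked off)."""
    return struct.unpack("<e", struct.pack("<H", b & 0xFFFF))[0]

ONE = 0x3C00  # bit pattern of 1.0 in Binary16

def approx_mul(a, b):
    # a*b : A+B-0x3C00
    return bits_f16(f16_bits(a) + f16_bits(b) - ONE)

def approx_div(a, b):
    # a/b : A-B+0x3C00
    return bits_f16(f16_bits(a) - f16_bits(b) + ONE)

def approx_sqrt(a):
    # sqrt(a): (A>>1)+0x1E00
    return bits_f16((f16_bits(a) >> 1) + 0x1E00)

print(approx_mul(2.0, 3.0))   # 6.0  (exact here)
print(approx_mul(1.5, 1.5))   # 2.0  (true value 2.25)
print(approx_sqrt(2.0))       # 1.5  (true value about 1.414)
```

   The results are exact when the mantissas involved are zero and drift by
   several percent otherwise, which is the accuracy level being discussed.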
      
   You are aware that GPUs perform elementary transcendental functions
   (32-bit) in 5 cycles {sin(), cos(), tan(), exp(), ln(), ...}.
   These functions get within 1.5-2 ULP. See authors Oberman, Pierno, and
   Matula, circa 2000-2005, for relevant data. I took a crack at this
   (patented: Samsung) that got within 0.7 to 1.2 ULP using a three-term
   polynomial instead of a two-term polynomial.
   Standard GPU FP math (32-bit and 16-bit) takes 4 cycles and is now
   IEEE 754 accurate (except for a couple of outlying cases).
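
   To illustrate why the extra polynomial term matters, here is a generic
   sketch of table-driven piecewise approximation of 2**x on [0, 1). It is
   not the patented method and not any GPU's actual hardware: the 64-entry
   table, the Taylor coefficients about each segment midpoint, and all the
   names are assumptions chosen only to compare two terms against three.

```python
import math

SEGMENTS = 64  # hypothetical table size

def build_table(terms):
    """Per-segment Taylor coefficients of 2**x about the segment midpoint."""
    ln2 = math.log(2.0)
    table = []
    for i in range(SEGMENTS):
        m = (i + 0.5) / SEGMENTS
        f = 2.0 ** m
        c2 = f * ln2 * ln2 / 2.0 if terms == 3 else 0.0
        table.append((m, f, f * ln2, c2))
    return table

def eval_exp2(table, x):
    """Look up the segment for x in [0, 1) and evaluate its polynomial."""
    i = min(int(x * SEGMENTS), SEGMENTS - 1)
    m, c0, c1, c2 = table[i]
    d = x - m
    return c0 + d * (c1 + d * c2)

def max_error(table, samples=4096):
    """Largest absolute error over evenly spaced sample points."""
    return max(abs(eval_exp2(table, k / samples) - 2.0 ** (k / samples))
               for k in range(samples))

two_term = max_error(build_table(2))
three_term = max_error(build_table(3))
print(two_term, three_term)
```

   With the same table size, the three-term version should come out
   several hundred times more accurate than the two-term one.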
      
   So, I don't see this suggestion bringing value to the table.   
      
   > The harder ones though, are ADD/SUB.   
   >   
   > A partial ADD seems to be:   
   > a+b: A+((B-A)>>1)+0x0400   
   >   
   > But, this simple case seems not to hold up when either doing subtract,   
   > or when A and B are far apart.   
   >   
   > So, it would appear either that there is a 4th term or the bias is   
   > variable (depending on the B-A term; and for ADD/SUB).   
   >   
   > Seems like the high bits (exponent and operator) could be used to drive
   > a lookup table, but this is lame. The magic bias appears to have
   > non-linear properties, so it isn't as easily represented with basic integer
   > operations.
   >   
   > Then again, probably other people know about all of this and might know   
   > what I am missing.   
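
   The behaviour of the quoted partial ADD can be reproduced with a few
   lines. Again only a sketch: the helper names are made up, and inputs are
   assumed positive and normal.

```python
import struct

def f16_bits(x):
    """Bit pattern of x as an IEEE 754 Binary16 value."""
    return struct.unpack("<H", struct.pack("<e", x))[0]

def bits_f16(b):
    """Binary16 value for a 16-bit pattern (extra bits are masked off)."""
    return struct.unpack("<e", struct.pack("<H", b & 0xFFFF))[0]

def approx_add(a, b):
    # a+b: A+((B-A)>>1)+0x0400
    # Python's >> is an arithmetic shift, so a negative B-A is handled.
    A, B = f16_bits(a), f16_bits(b)
    return bits_f16(A + ((B - A) >> 1) + 0x0400)

print(approx_add(1.0, 1.0))     # 2.0   (exact)
print(approx_add(1.0, 2.0))     # 3.0   (exact)
print(approx_add(1.0, 4.0))     # 4.0   (true value 5.0)
print(approx_add(1.0, 1024.0))  # 64.0  (true value 1025.0)
```

   Since A+((B-A)>>1) is the midpoint of the two bit patterns, and the bit
   pattern behaves roughly like a scaled logarithm, the formula lands near
   2*sqrt(a*b). That matches a+b only when a and b are close, which is
   consistent with the observation that it breaks down as they move apart.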
      
   I still recommend getting the right answer over getting a close but wrong   
   answer a couple cycles earlier.   
      