From: anton@mips.complang.tuwien.ac.at   
      
   Krishna Myneni writes:   
   >On 5/21/24 04:03, mhx wrote:   
   >> Anton Ertl wrote:   
   >>   
   >> [..]   
   >>> It seems to me that this can be solved by sorting the three factors   
   >>> into a>b>c. Then you can avoid the intermediate overflow by   
   >>> performing the computation as (a*c)*b.   
   ...   
   >Remember that you will also have to deal with IEEE 754 special values   
   >like Inf and NaN.   
      
   Not a problem. If any operand is a NaN, the result will be NaN no   
   matter how the operations are associated. For infinities (and 0 as   
   divisor), I would analyse it by looking at all cases, but I don't see   
   that it makes any difference:   
      
   Variable names here represent finite non-zero values:   
      
   (inf*y)/z=inf/z=inf   
   inf*(y/z)=inf*finite=inf   
   y*(inf/z)=y*inf=inf   
      
   Likewise if x is finite and y is infinite   
      
   (x*y)/inf=finite/inf=0   
   x*(y/inf)=x*0=0   
   y*(x/inf)=y*0=0   
      
   (x*y)/0=finite/0=inf   
   x*(y/0)=x*inf=inf   
   y*(x/0)=y*inf=inf   
      
   Signs in all these cases follow the same rules whether infinities are   
   involved or not.   
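
The case analysis above can be checked mechanically. The following is only a sketch: it assumes IEEE 754 floats and a system that does not trap on floating-point division by zero or on invalid operations (gforth, as far as I know, does not trap by default). The names my-inf, my-nan, opx, opy, opz and 3ways are made up for this illustration.

```forth
\ Sketch only: assumes IEEE 754 floats and no trapping on
\ floating-point division by zero or invalid operations.
\ All names defined here are made up for this illustration.
1e 0e f/ fconstant my-inf    \ +infinity
0e 0e f/ fconstant my-nan    \ a NaN

fvariable opx  fvariable opy  fvariable opz

: 3ways ( r1 r2 r3 -- )  \ print x*y/z under all three associations
  opz f!  opy f!  opx f!
  cr opx f@ opy f@ f*  opz f@ f/  f.     \ (x*y)/z
     opx f@  opy f@ opz f@ f/  f*  f.    \ x*(y/z)
     opy f@  opx f@ opz f@ f/  f*  f. ;  \ y*(x/z)

my-inf 2e 3e 3ways    \ infinite factor
2e 3e my-inf 3ways    \ infinite divisor
2e 3e 0e 3ways        \ zero divisor
my-nan 2e 3e 3ways    \ NaN operand
```

Under those assumptions, each call should print the same value three times; how infinities and NaNs are displayed depends on the Forth system.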
      
   >It will be interesting to compare the efficiency of   
   >both my approach and your sorting approach. I'm skeptical that the   
   >additional sorting will make the equivalent calculation faster.   
      
   Actually sorting is overkill:   
      
   : fsort2 ( r1 r2 -- r3 r4 )
    \ |r3|>=|r4|
    fover fabs fover fabs f< if   \ |r1|<|r2|?
    fswap                         \ yes: put the larger magnitude first
    then ;
      
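   : f*/ ( r1 r2 r3 -- r )
    \ computes r1*r2/r3; the order of operations depends on |r3|
    fdup fabs 1e f>       \ data stack: flag, true if |r3|>1
    fswap frot fsort2     \ FP stack: r3 ra rb with |ra|>=|rb|
    if fswap then         \ |r3|>1: bring the larger factor to the top
    frot f/ f* ;          \ divide the top factor by r3, multiply by the other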
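
To see what the reordering buys, here is a small comparison sketch. It assumes that fsort2 and f*/ from above are already loaded and that floats are 64-bit IEEE doubles without trapping (as in gforth); on a system that keeps 80-bit x87 values on the FP stack the intermediate results have more range, so these particular numbers would not overflow or underflow there. naive-f*/ is a made-up name for the straightforward version.

```forth
\ Sketch only: load the definitions of FSORT2 and F*/ from the post first.
\ NAIVE-F*/ is a made-up name for the straightforward version.
: naive-f*/ ( r1 r2 r3 -- r )  frot frot f* fswap f/ ;

cr 1e200 1e200 1e300 naive-f*/ fs.    \ r1*r2 overflows before the division
cr 1e200 1e200 1e300 f*/       fs.    \ 1e200/1e300 first, then the multiply
cr 1e-200 1e-200 1e-300 naive-f*/ fs. \ r1*r2 underflows to zero
cr 1e-200 1e-200 1e-300 f*/       fs. \ 1e-200/1e-300 first, then the multiply
```

With 64-bit doubles I would expect the naive version to give an infinity and a zero, while f*/ gives roughly 1e100 and 1e-100.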
      
   I have tested this with your tests from   
   , but needed to change rel-near (I   
   changed it to 1e-16) for gforth to pass your tests. I leave   
   performance testing to you. Here's what vfx64 produces for this F*/:   
      
   see f*/   
   F*/   
   ( 0050A310 D9C0 ) FLD ST   
   ( 0050A312 D9E1 ) FABS   
   ( 0050A314 D9E8 ) FLD1   
   ( 0050A316 E8F5BEFFFF ) CALL 00506210 F>   
   ( 0050A31B D9C9 ) FXCH ST(1)   
   ( 0050A31D D9C9 ) FXCH ST(1)   
   ( 0050A31F D9CA ) FXCH ST(2)   
   ( 0050A321 E88AFFFFFF ) CALL 0050A2B0 FSORT2   
   ( 0050A326 4885DB ) TEST RBX, RBX   
   ( 0050A329 488B5D00 ) MOV RBX, [RBP]   
   ( 0050A32D 488D6D08 ) LEA RBP, [RBP+08]   
   ( 0050A331 0F8402000000 ) JZ/E 0050A339   
   ( 0050A337 D9C9 ) FXCH ST(1)   
   ( 0050A339 D9C9 ) FXCH ST(1)   
   ( 0050A33B D9CA ) FXCH ST(2)   
   ( 0050A33D DEF9 ) FDIVP ST(1), ST   
   ( 0050A33F DEC9 ) FMULP ST(1), ST   
   ( 0050A341 C3 ) RET/NEXT   
   ( 50 bytes, 18 instructions )   
    ok   
   see fsort2   
   FSORT2   
   ( 0050A2B0 D9C1 ) FLD ST(1)   
   ( 0050A2B2 D9E1 ) FABS   
   ( 0050A2B4 D9C1 ) FLD ST(1)   
   ( 0050A2B6 D9E1 ) FABS   
   ( 0050A2B8 E863BEFFFF ) CALL 00506120 F<   
   ( 0050A2BD 4885DB ) TEST RBX, RBX   
   ( 0050A2C0 488B5D00 ) MOV RBX, [RBP]   
   ( 0050A2C4 488D6D08 ) LEA RBP, [RBP+08]   
   ( 0050A2C8 0F8402000000 ) JZ/E 0050A2D0   
   ( 0050A2CE D9C9 ) FXCH ST(1)   
   ( 0050A2D0 C3 ) RET/NEXT   
   ( 33 bytes, 11 instructions )   
    ok   
   see f<   
   F<   
   ( 00506120 E86BFEFFFF ) CALL 00505F90 FCMP2   
   ( 00506125 4881FB00010000 ) CMP RBX, # 00000100   
   ( 0050612C 0F94C3 ) SETZ/E BL   
   ( 0050612F F6DB ) NEG BL   
   ( 00506131 480FBEDB ) MOVSX RBX, BL   
   ( 00506135 C3 ) RET/NEXT   
   ( 22 bytes, 6 instructions )   
    ok   
   see fcmp2   
   FCMP2   
   ( 00505F90 4883ED08 ) SUB RBP, # 08   
   ( 00505F94 48895D00 ) MOV [RBP], RBX   
   ( 00505F98 D9C9 ) FXCH ST(1)   
   ( 00505F9A DED9 ) FCOMPP   
   ( 00505F9C 9B ) FWAIT   
   ( 00505F9D DFE0 ) FSTSW AX   
   ( 00505F9F 66250041 ) AND AX, # 4100   
   ( 00505FA3 480FB7D8 ) MOVZX RBX, AX   
   ( 00505FA7 C3 ) RET/NEXT   
   ( 24 bytes, 9 instructions )   
      
   - anton   
   --   
   M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html   
   comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html   
    New standard: https://forth-standard.org/   
    EuroForth 2023: https://euro.theforth.net/2023   
      