   comp.lang.forth      Forth programmers eat a lot of Bratwurst      117,927 messages   


   Message 116,492 of 117,927   
   Anton Ertl to Krishna Myneni   
   Re: F*/ (f-star-slash)   
   22 May 24 16:31:45   
   
   From: anton@mips.complang.tuwien.ac.at   
      
   Krishna Myneni writes:   
   >On 5/21/24 04:03, mhx wrote:   
   >> Anton Ertl wrote:   
   >>   
   >> [..]   
   >>> It seems to me that this can be solved by sorting the three factors   
   >>> into a>b>c.  Then you can avoid the intermediate overflow by   
   >>> performing the computation as (a*c)*b.   
   ...   
   >Remember that you will also have to deal with IEEE 754 special values   
   >like Inf and NaN.   
      
   Not a problem.  If any operand is a NaN, the result will be NaN no   
   matter how the operations are associated.  For infinities (and 0 as   
   divisor), I would analyse it by looking at all cases, but I don't see   
   that it makes any difference:   
      
   Variable names here represent finite non-zero values:   
      
   (inf*y)/z=inf/z=inf   
   inf*(y/z)=inf*finite=inf   
   y*(inf/z)=y*inf=inf   
      
   Likewise if x is finite and y is infinite   
      
   (x*y)/inf=finite/inf=0   
   x*(y/inf)=x*0=0   
   y*(x/inf)=y*0=0   
      
   (x*y)/0=finite/0=inf   
   x*(y/0)=x*inf=inf   
   y*(x/0)=y*inf=inf   
      
   Signs in all these cases follow the same rules whether infinities are   
   involved or not.   
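      
   The case analysis above can be checked mechanically. The following is
   only a sketch: it assumes a system where 1e 0e f/ returns +Inf rather
   than throwing (Gforth normally behaves this way), and the names +inf,
   assoc-a, assoc-b, assoc-c and show are made up for the illustration.
      
   ```forth
   \ Sketch only; all names here are hypothetical helpers.
   \ Assumes division by zero yields +Inf instead of throwing.
   1e 0e f/ fconstant +inf

   : assoc-a ( F: x y z -- r )  frot frot f* fswap f/ ;  \ (x*y)/z
   : assoc-b ( F: x y z -- r )  f/ f* ;                  \ x*(y/z)
   : assoc-c ( F: x y z -- r )  frot fswap f/ f* ;       \ y*(x/z)

   fvariable x  fvariable y  fvariable z
   : xyz ( F: -- x y z )  x f@ y f@ z f@ ;

   \ Print the result of all three associations on one line.
   : show ( F: x y z -- )
       z f!  y f!  x f!
       cr xyz assoc-a f.  xyz assoc-b f.  xyz assoc-c f. ;

   +inf 2e 3e show     \ infinite factor
   2e 3e +inf show     \ infinite divisor
   2e 3e 0e show       \ zero divisor
   -2e 3e 0e show      \ same, with a negative factor
   ```
      
   Each line of output should show the same value three times if the
   associations agree for that case.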
      
   >It will be interesting to compare the efficiency of   
   >both my approach and your sorting approach. I'm skeptical that the   
   >additional sorting will make the equivalent calculation faster.   
      
   Actually sorting is overkill:   
      
   : fsort2 ( r1 r2 -- r3 r4 )   
       \ |r3|>=|r4|   
       fover fabs fover fabs f< if   
           fswap   
       then ;   
      
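   : f*/ ( r1 r2 r3 -- r )   
       \ r = r1*r2/r3; divide the larger factor by r3 if |r3|>1, else the smaller
       fdup fabs 1e f>         \ flag: |r3|>1
       fswap frot fsort2       \ F: r3 big small
       if fswap then           \ F: r3 a b; b is big if |r3|>1, else small
       frot f/ f* ;            \ r = a*(b/r3)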
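      
   To see what the reordering buys, one can compare against the
   straightforward version. This is a sketch under assumptions: floats
   are 64-bit IEEE doubles (as in Gforth on x86-64; with 80-bit x87
   floats the exponents would have to be much larger to show the
   effect), fsort2 and f*/ from this post are already loaded, and
   naive-f*/ is a hypothetical name.
      
   ```forth
   \ Assumes fsort2 and f*/ as defined in this post are loaded.
   \ naive-f*/ is a made-up name for the direct (r1*r2)/r3.
   : naive-f*/ ( F: r1 r2 r3 -- r )  frot frot f* fswap f/ ;

   \ Large operands: the intermediate product 1e400 does not fit a double.
   cr 1e200 1e200 1e300 naive-f*/ f.
   cr 1e200 1e200 1e300 f*/ f.         \ evaluated as 1e200*(1e200/1e300)

   \ Small operands: the intermediate product 1e-400 underflows.
   cr 1e-200 1e-200 1e-300 naive-f*/ f.
   cr 1e-200 1e-200 1e-300 f*/ f.      \ evaluated as 1e-200*(1e-200/1e-300)
   ```
      
   In both pairs the mathematically exact result (1e100 and 1e-100) is
   well within double range; only the naive intermediate product is not.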
      
   I have tested this with your tests from   
   , but needed to change rel-near (I   
   changed it to 1e-16) for gforth to pass your tests.  I leave   
   performance testing to you.  Here's what vfx64 produces for this F*/:   
      
   see f*/   
   F*/   
   ( 0050A310    D9C0 )                  FLD     ST   
   ( 0050A312    D9E1 )                  FABS   
   ( 0050A314    D9E8 )                  FLD1   
   ( 0050A316    E8F5BEFFFF )            CALL    00506210  F>   
   ( 0050A31B    D9C9 )                  FXCH    ST(1)   
   ( 0050A31D    D9C9 )                  FXCH    ST(1)   
   ( 0050A31F    D9CA )                  FXCH    ST(2)   
   ( 0050A321    E88AFFFFFF )            CALL    0050A2B0  FSORT2   
   ( 0050A326    4885DB )                TEST    RBX, RBX   
   ( 0050A329    488B5D00 )              MOV     RBX, [RBP]   
   ( 0050A32D    488D6D08 )              LEA     RBP, [RBP+08]   
   ( 0050A331    0F8402000000 )          JZ/E    0050A339   
   ( 0050A337    D9C9 )                  FXCH    ST(1)   
   ( 0050A339    D9C9 )                  FXCH    ST(1)   
   ( 0050A33B    D9CA )                  FXCH    ST(2)   
   ( 0050A33D    DEF9 )                  FDIVP   ST(1), ST   
   ( 0050A33F    DEC9 )                  FMULP   ST(1), ST   
   ( 0050A341    C3 )                    RET/NEXT   
   ( 50 bytes, 18 instructions )   
    ok   
   see fsort2   
   FSORT2   
   ( 0050A2B0    D9C1 )                  FLD     ST(1)   
   ( 0050A2B2    D9E1 )                  FABS   
   ( 0050A2B4    D9C1 )                  FLD     ST(1)   
   ( 0050A2B6    D9E1 )                  FABS   
   ( 0050A2B8    E863BEFFFF )            CALL    00506120  F<   
   ( 0050A2BD    4885DB )                TEST    RBX, RBX   
   ( 0050A2C0    488B5D00 )              MOV     RBX, [RBP]   
   ( 0050A2C4    488D6D08 )              LEA     RBP, [RBP+08]   
   ( 0050A2C8    0F8402000000 )          JZ/E    0050A2D0   
   ( 0050A2CE    D9C9 )                  FXCH    ST(1)   
   ( 0050A2D0    C3 )                    RET/NEXT   
   ( 33 bytes, 11 instructions )   
    ok   
   see f<   
   F<   
   ( 00506120    E86BFEFFFF )            CALL    00505F90  FCMP2   
   ( 00506125    4881FB00010000 )        CMP     RBX, # 00000100   
   ( 0050612C    0F94C3 )                SETZ/E   BL   
   ( 0050612F    F6DB )                  NEG     BL   
   ( 00506131    480FBEDB )              MOVSX   RBX, BL   
   ( 00506135    C3 )                    RET/NEXT   
   ( 22 bytes, 6 instructions )   
    ok   
   see fcmp2   
   FCMP2   
   ( 00505F90    4883ED08 )              SUB     RBP, # 08   
   ( 00505F94    48895D00 )              MOV     [RBP], RBX   
   ( 00505F98    D9C9 )                  FXCH    ST(1)   
   ( 00505F9A    DED9 )                  FCOMPP   
   ( 00505F9C    9B )                    FWAIT   
   ( 00505F9D    DFE0 )                  FSTSW   AX   
   ( 00505F9F    66250041 )              AND     AX, # 4100   
   ( 00505FA3    480FB7D8 )              MOVZX   RBX, AX   
   ( 00505FA7    C3 )                    RET/NEXT   
   ( 24 bytes, 9 instructions )   
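      
   For the performance comparison left open above, a rough timing loop
   might look like the following. It is a sketch, not a rigorous
   benchmark: it assumes Gforth's utime ( -- ud ), which returns the
   time in microseconds as a double-cell number, assumes f*/ is already
   loaded, and uses the hypothetical name time-f*/. The operands are
   constant, so the branch in f*/ is perfectly predictable; realistic
   measurements would need varied inputs.
      
   ```forth
   \ Rough timing sketch; time-f*/ is a hypothetical name.
   \ Assumes Gforth's utime ( -- ud ) and a loaded f*/.
   : time-f*/ ( n -- )
       utime 2>r                          \ start time kept on the return stack
       0 ?do  1e10 2e10 3e f*/ fdrop  loop
       utime 2r> d-                       \ elapsed microseconds (double)
       d. ." microseconds" ;

   cr 10000000 time-f*/
   ```
      
   Timing another F*/ candidate the same way only requires swapping the
   word called inside the loop.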
      
   - anton   
   --   
   M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html   
   comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html   
        New standard: https://forth-standard.org/   
      EuroForth 2023: https://euro.theforth.net/2023   
      