... darkrealms ...

Forums before death by AOL, social media and spammers... "We can't have nice things"
comp.lang.forth
Forth programmers eat a lot of Bratwurst
117,927 messages
[ << oldest | < older | list | newer > | newest >> ]
Message 116,499 of 117,927
Krishna Myneni to Anton Ertl
Re: F*/ (f-star-slash)
27 May 24 18:27:20
   From: krishna.myneni@ccreweb.org   
      
   On 5/22/24 11:31, Anton Ertl wrote:   
   > Krishna Myneni  writes:   
   >> On 5/21/24 04:03, mhx wrote:   
   >>> Anton Ertl wrote:   
   >>>   
   >>> [..]   
   >>>> It seems to me that this can be solved by sorting the three factors   
   >>>> into a>b>c.  Then you can avoid the intermediate overflow by   
   >>>> performing the computation as (a*c)*b.   
   > ...   
   >> Remember that you will also have to deal with IEEE 754 special values   
   >> like Inf and NaN.   
   >   
   > Not a problem.  If any operand is a NaN, the result will be NaN no   
   > matter how the operations are associated.  For infinities (and 0 as   
   > divisor), I would analyse it by looking at all cases, but I don't see   
   > that it makes any difference:   
   >   
   > Variable names here represent finite non-zero values:   
   >   
   > (inf*y)/z=inf/z=inf   
   > inf*(y/z)=inf*finite=inf   
   > y*(inf/z)=y*inf=inf   
   >   
   > Likewise if x is finite and y is infinite   
   >   
   > (x*y)/inf=finite/inf=0   
   > x*(y/inf)=x*0=0   
   > y*(x/inf)=y*0=0   
   >   
   > (x*y)/0=finite/0=inf   
   > x*(y/0)=x*inf=inf   
   > y*(x/0)=y*inf=inf   
   >   
   > Signs in all these cases follow the same rules whether infinities are   
   > involved or not.   
   >   
   >> It will be interesting to compare the efficiency of   
   >> both my approach and your sorting approach. I'm skeptical that the   
   >> additional sorting will make the equivalent calculation faster.   
   >   
   > Actually sorting is overkill:   
   >   
   > : fsort2 ( r1 r2 -- r3 r4 )   
   >      \ |r3|>=|r4|   
   >      fover fabs fover fabs f< if   
   >          fswap   
   >      then ;   
   >   
   > : f*/ ( r1 r2 r3 -- r )   
   >      fdup fabs 1e f> fswap frot fsort2 if   
   >          fswap then   
   >      frot f/ f* ;   
   >   
   > I have tested this with your tests from   
   > , but needed to change rel-near (I   
   > changed it to 1e-16) for gforth to pass your tests.  I leave   
   > performance testing to you.  ...   
      
   I put together a benchmark test. My implementation of F*/ hangs during   
   the test, so I have some debugging to do on that. Your implementation   
   above does execute.   
      
   --   
   Krishna   
      
      
   === begin bench-fstarslash.4th ===   
   \ bench-fstarslash.4th   
   \   
   \ Benchmark implementation of double-precision F*/ using   
   \ a large sample of triplet dp floating-point numbers,   
   \ which would overflow/underflow with F* F/ sequence.   
   \   
      
   include ans-words \ only for kForth   
   include random   
   include fsl/fsl-util   
   include fsl/dynmem   
      
   1 [IF]   
      
   \ Anton Ertl's implementation of F*/ per discussion on   
   \ comp.lang.forth: 5/22/2024, "Re: F*/ (f-star-slash)"   
      
   : fsort2 ( F: r1 r2 -- r3 r4 )   
        \ |r3|>=|r4|   
        fover fabs fover fabs f< if   
            fswap   
        then ;   
      
   : f*/ ( F: r1 r2 r3 -- r )   
        fdup fabs 1e f> fswap frot fsort2 if   
            fswap then   
        frot f/ f* ;   
      
   [ELSE]   
   include fstarslash  \ other implementation   
   [THEN]   
      
   \ Generate triplet:   
   \   
   \ 1. Use a integer random number generator to generate   
   \    binary signed exponents, e1, e2, e3, in the interval   
   \   
   \    [-DP_EXPONENT_BIAS, MAX_DP_EXPONENT-DP_EXPONENT_BIAS)   
   \   
   \ 2. Obtain the triplet, 2^{e1--e3}, such that it is in   
   \    the normal range for double-precision fp.   
   \   
   \ 3. Store in matrix and repeat from 1 for N_SAMPLES.   
   \   
   \ 4. Benchmark results of F*/ on triplets; store in   
   \    results array.   
   \   
      
   1014678321 seed !   
      
   [UNDEFINED] DP_EXPONENT_MAX [IF]   
   2047 constant DP_EXPONENT_MAX   
   [THEN]   
      
   DP_EXPONENT_MAX constant RAND_EXP_MASK   
      
   [UNDEFINED] DP_EXPONENT_BIAS [IF]   
   1023 constant DP_EXPONENT_BIAS   
   [THEN]   
      
   1000000 constant N_SAMPLES   
      
   FLOAT DMATRIX triplets{{   
   FLOAT DARRAY  results{   
      
   : alloc-mem ( u -- bFail )   
        & triplets{{ over 3 }}malloc   
        & results{ swap }malloc   
        malloc-fail? ;   
      
   : free-mem ( -- )   
        & results{ }free   
        & triplets{{ }}free   
      
      
   \ Return signed binary exponent   
   : rand-exp ( -- e )   
        random2 RAND_EXP_MASK and  \ ensure e stays within range   
        DP_EXPONENT_BIAS - ;   
      
   : normal-exponent? ( e -- bNormal )   
        DP_EXPONENT_BIAS negate DP_EXPONENT_BIAS within ;   
      
   : random-exponents ( -- e1 e2 e3 )   
        BEGIN   
          rand-exp rand-exp   
          2dup + normal-exponent?   
        WHILE   
          2drop   
        REPEAT   
        \ -- e1 e2   
        \ (2^e1)*(2^e2) will underflow or overflow double fp under F*   
        BEGIN   
          2dup +   
          rand-exp 2dup -   
          normal-exponent? 0=   
        WHILE   
          2drop   
        REPEAT   
        nip   
      
      
   : randomize-sign ( F: r -- +/-r )   
        random2 1 and IF fnegate THEN ;   
      
   : gen-triplet ( F: -- r1 r2 r3 )   
        random-exponents   
        2.0e0 s>f f** randomize-sign   
        2.0e0 s>f f** randomize-sign   
        2.0e0 s>f f** randomize-sign ;   
      
   : gen-samples ( -- )   
        N_SAMPLES alloc-mem ABORT" Unable to allocate memory!"   
        N_SAMPLES 0 DO   
          gen-triplet   
          triplets{{ I 2 }} f! triplets{{ I 1 }} f! triplets{{ I 0 }} f!   
          key? IF key drop I . cr unloop EXIT THEN   
        LOOP ;   
      
   \ For Gforth, include kforth-compat.fs, or replace MS@ with UTIME   
   \ and double arithmetic to obtain elapsed time.   
      
   : bench-f*/ ( -- )   
        ms@   
        results{ 0 }   
        N_SAMPLES 0 DO   
          triplets{{ I 0 }}   
          dup f@   
          float+ dup f@   
          float+ f@   
          f*/   
          dup f! float+   
       LOOP   
       drop   
       ms@ swap - . ." ms" cr ;   
      
   cr   
   cr .( Type 'gen-samples' to obtain arithmetic cases for F*/' )   
   cr .( Type bench-f*/ to execute the benchmark. )   
   cr   
   cr   
   === end bench-fstarslash.4th ===   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)
[ << oldest | < older | list | newer > | newest >> ]