From: krishna.myneni@ccreweb.org   
      
   On 5/27/24 18:27, Krishna Myneni wrote:   
   > On 5/22/24 11:31, Anton Ertl wrote:   
   >> Krishna Myneni writes:   
   >>> On 5/21/24 04:03, mhx wrote:   
   >>>> Anton Ertl wrote:   
   >>>>   
   >>>> [..]   
   >>>>> It seems to me that this can be solved by sorting the three factors   
   >>>>> into a>b>c. Then you can avoid the intermediate overflow by   
   >>>>> performing the computation as (a*c)*b.   
   >> ...   
   >>> Remember that you will also have to deal with IEEE 754 special values   
   >>> like Inf and NaN.   
   >>   
   >> Not a problem. If any operand is a NaN, the result will be NaN no   
   >> matter how the operations are associated. For infinities (and 0 as   
   >> divisor), I would analyse it by looking at all cases, but I don't see   
   >> that it makes any difference:   
   >>   
   >> Variable names here represent finite non-zero values:   
   >>   
   >> (inf*y)/z=inf/z=inf   
   >> inf*(y/z)=inf*finite=inf   
   >> y*(inf/z)=y*inf=inf   
   >>   
   >> Likewise if x is finite and y is infinite   
   >>   
   >> (x*y)/inf=finite/inf=0   
   >> x*(y/inf)=x*0=0   
   >> y*(x/inf)=y*0=0   
   >>   
   >> (x*y)/0=finite/0=inf   
   >> x*(y/0)=x*inf=inf   
   >> y*(x/0)=y*inf=inf   
   >>   
   >> Signs in all these cases follow the same rules whether infinities are   
   >> involved or not.   
   >>   
   >>> It will be interesting to compare the efficiency of   
   >>> both my approach and your sorting approach. I'm skeptical that the   
   >>> additional sorting will make the equivalent calculation faster.   
   >>   
   >> Actually sorting is overkill:   
   >>   
   >> : fsort2 ( r1 r2 -- r3 r4 )   
   >> \ |r3|>=|r4|   
   >> fover fabs fover fabs f< if   
   >> fswap   
   >> then ;   
   >>   
   >> : f*/ ( r1 r2 r3 -- r )   
   >> fdup fabs 1e f> fswap frot fsort2 if   
   >> fswap then   
   >> frot f/ f* ;   
   >>   
   >> I have tested this with your tests from   
   >> , but needed to change rel-near (I   
   >> changed it to 1e-16) for gforth to pass your tests. I leave   
   >> performance testing to you. ...   
   >   
   > I put together a benchmark test. My implementation of F*/ hangs during   
   > the test, so I have some debugging to do on that. Your implementation   
   > above does execute.   
   >   
   > --   
   > Krishna   
   >   
   >   
   > === begin bench-fstarslash.4th ===   
   > \ bench-fstarslash.4th   
   > \   
   > \ Benchmark implementation of double-precision F*/ using   
   > \ a large sample of triplet dp floating-point numbers,   
   > \ which would overflow/underflow with F* F/ sequence.   
   > \   
      
   Updated benchmark code also allows for validating the arithmetic of F*/   
   for all of the 1 million test cases. At the moment, all arithmetic cases   
   use only "normal" fp numbers i.e. they do not include zero, infinity, or   
   subnormal numbers -- special tests are needed for such cases.   
      
   --   
   Krishna   
      
   === begin bench-fstarslash.4th ===   
   \ bench-fstarslash.4th   
   \   
   \ Benchmark implementation of double-precision F*/ using   
   \ a large sample of triplet dp floating-point numbers,   
   \ which would overflow/underflow with F* F/ sequence.   
   \   
      
   include ans-words \ only for kForth   
   include random   
   include fsl/fsl-util   
   include fsl/dynmem   
      
   1 [IF]   
      
   \ Anton Ertl's implementation of F*/ per discussion on   
   \ comp.lang.forth: 5/22/2024, "Re: F*/ (f-star-slash)"   
      
   : fsort2 ( F: r1 r2 -- r3 r4 )   
    \ |r3|>=|r4|   
    fover fabs fover fabs f< if   
    fswap   
    then ;   
      
   : f*/ ( F: r1 r2 r3 -- r )   
    fdup fabs 1e f> fswap frot fsort2 if   
    fswap then   
    frot f/ f* ;   
      
   [ELSE]   
   include fstarslash \ other implementation   
   [THEN]   
      
   \ Generate triplet:   
   \   
   \ 1. Use a integer random number generator to generate   
   \ binary signed exponents, e1, e2, e3, in the interval   
   \   
   \ [-DP_EXPONENT_BIAS, MAX_DP_EXPONENT-DP_EXPONENT_BIAS)   
   \   
   \ 2. Obtain the triplet, 2^{e1--e3}, such that it is in   
   \ the normal range for double-precision fp.   
   \   
   \ 3. Store in matrix and repeat from 1 for N_SAMPLES.   
   \   
   \ 4. Benchmark results of F*/ on triplets; store in   
   \ results array.   
   \   
      
   1014678321 seed !   
      
   [UNDEFINED] DP_EXPONENT_MAX [IF]   
   2047 constant DP_EXPONENT_MAX   
   [THEN]   
      
   DP_EXPONENT_MAX constant RAND_EXP_MASK   
      
   [UNDEFINED] DP_EXPONENT_BIAS [IF]   
   1023 constant DP_EXPONENT_BIAS   
   [THEN]   
      
   1000000 constant N_SAMPLES   
      
   FLOAT DMATRIX triplets{{   
   FLOAT DARRAY results{   
   INTEGER DMATRIX refexp{{   
      
   : 3dup ( n1 n2 n3 -- n1 n2 n3 n1 n2 n3 )   
    2 pick 2 pick 2 pick ;   
      
   : alloc-mem ( u -- bFail )   
    & triplets{{ over 3 }}malloc   
    & results{ over }malloc   
    & refexp{{ swap 3 }}malloc   
    malloc-fail? ;   
      
   : free-mem ( -- )   
    & refexp{{ }}free   
    & results{ }free   
    & triplets{{ }}free   
      
      
   : normal-exponent? ( e -- bNormal )   
    DP_EXPONENT_BIAS negate 1+ DP_EXPONENT_BIAS 1+ within ;   
      
   \ Return signed binary exponent in normal range   
   : rand-exp ( -- e )   
    random2 RAND_EXP_MASK and \ ensure e stays within range   
    DP_EXPONENT_BIAS -   
    DP_EXPONENT_BIAS min ;   
      
   : random-exponents ( -- e1 e2 e3 )   
    BEGIN   
    rand-exp rand-exp   
    2dup + normal-exponent?   
    WHILE   
    2drop   
    REPEAT   
    \ -- e1 e2   
    \ (2^e1)*(2^e2) will underflow or overflow double fp under F*   
    BEGIN   
    2dup +   
    rand-exp 2dup -   
    normal-exponent? 0=   
    WHILE   
    2drop   
    REPEAT   
    nip   
      
      
   : combine-exponents ( e1 e2 e3 -- e1 e2 e3 etot )   
    2dup - 3 pick + ;   
      
   : randomize-sign ( F: r -- +/-r )   
    random2 1 and IF fnegate THEN ;   
      
   : etriple>floats ( e1 e2 e3 -- ) ( F: -- r1 r2 r3 )   
    rot 2.0e0 s>f f**   
    swap 2.0e0 s>f f**   
    2.0e0 s>f f** ;   
      
   \ GEN-SAMPLES does not generate cases with zeros or   
   \ infinities.   
   : gen-samples ( -- )   
    N_SAMPLES alloc-mem ABORT" Unable to allocate memory!"   
    N_SAMPLES 0 DO   
    random-exponents   
    3dup refexp{{ I 2 }} ! refexp{{ I 1 }} ! refexp{{ I 0 }} !   
    etriple>floats   
    triplets{{ I 2 }} randomize-sign f!   
    triplets{{ I 1 }} randomize-sign f!   
    triplets{{ I 0 }} randomize-sign f!   
    LOOP ;   
      
   \ For Gforth, include kforth-compat.fs, or replace MS@ with UTIME   
   \ and double arithmetic to obtain elapsed time.   
      
   : bench-f*/ ( -- )   
    ms@   
    results{ 0 }   
    N_SAMPLES 0 DO   
    triplets{{ I 0 }}   
    dup f@   
    float+ dup f@   
    float+ f@   
    f*/   
    dup f! float+   
    LOOP   
    drop   
    ms@ swap - . ." ms" cr ;   
      
      
   \ Check the calculation performed by F*/   
   : relative-error ( F: r1 r2 -- relerror )   
    fswap fover f- fswap f/ fabs ;   
      
   1e-17 fconstant tol   
   variable tol-errors   
   variable sign-errors   
   fvariable max-rel-error   
      
   : check-results ( -- )   
    cr N_SAMPLES 7 .r ." samples" cr   
    0 tol-errors !   
    0 sign-errors !   
    0.0e0 max-rel-error f!   
    N_SAMPLES 0 DO   
    results{ I } f@ fabs   
    2.0e0 refexp{{ I 0 }} @ refexp{{ I 1 }} @ +   
    refexp{{ I 2 }} @ - s>f f**   
    relative-error fdup tol f> IF   
    max-rel-error f@ fmax max-rel-error f!   
    1 tol-errors +!   
    ELSE   
      
   [continued in next message]   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   
|