... darkrealms ...

Forums before death by AOL, social media and spammers... "We can't have nice things"
comp.lang.forth
Forth programmers eat a lot of Bratwurst
117,927 messages
[ << oldest | < older | list | newer > | newest >> ]
Message 116,517 of 117,927
Krishna Myneni to Krishna Myneni
Re: F*/ (f-star-slash) (1/2)
31 May 24 22:57:18
   From: krishna.myneni@ccreweb.org   
      
   On 5/29/24 07:27, Krishna Myneni wrote:   
   > On 5/28/24 06:11, Krishna Myneni wrote:   
   > ...   
   ...   
   > Now the benchmark code runs. It shows two things (see below):   
   >   
   > 1. The execution time is poor compared to Anton's implementation, about   
   > 8x slower.   
   >   
   > 2. Half of the arithmetic results are wrong, with the max relative error   
   > being as high as 2/3.   
   >   
   > So, I clearly have some work to do to fix the arithmetic. It was   
   > fortuitous that it passed my test cases, but clearly the test cases in   
   > my original implementation of F*/ were insufficient.   
   >   
   > --   
   > Krishna   
   >   
   >   
   > bench-f*/d   
   > 603 ms   
   >   ok   
   > check-results   
   >   
   > 1000000 samples   
   >   
   > # tolerance errors: 501374   
   > # sign errors:           0   
   > Max relative error: 0.666667   
   >   ok   
   >   
      
   Ok. I found and fixed the errors in my Forth implementation of F*/ . It   
   runs the benchmark test and now there are no arithmetic errors to within   
   the specified tolerance of 1E-17.   
      
   My Forth implementation is much slower than Anton's, which uses the FPU   
   floating point operations. However, my Forth implementation of F*/ makes   
   no use of floating point arithmetic, and will likely be considerably   
   faster in assembly code than in Forth due to the low level bit operations.   
      
   There is also a restriction currently on passing subnormal numbers to my   
   version of F*/ -- as it stands, it will give incorrect results for   
   subnormal numbers and won't detect this condition.   
      
   I also found an error in the benchmark code (bench-fstarslash.4th), in   
   the word,   
      
   RAND-EXP ( -- e )  \ Return signed binary exponent in normal range   
      
   It was allowing return of biased binary exponent of -1023 which is not   
   in the normal range. Anton's implementation works with subnormal   
   numbers, so it passed the results verification check.   
      
   The updated definition of the random exponent generator for   
   bench-fstarslash.4th is   
      
   \ Return signed binary exponent in normal range   
   : rand-exp ( -- e )   
        random2 RAND_EXP_MASK and   
        1 max  DP_EXPONENT_MAX 1- min \ ensure e within normal range   
        DP_EXPONENT_BIAS - ;   
      
      
   My current implementation, which passes the results verification test in   
   bench-fstarslash.4th (with the above revised def of RAND-EXP), is listed   
   below, along with a sample run.   
      
   Cheers,   
   Krishna   
      
   --   
      
   === begin fstarslash.4th ===   
   \ fstarslash.4th   
   \   
   \ Implementation of F*/ on 64-bit Forth   
   \   
   \ Krishna Myneni, 2024-05-19   
   \ Revised: 2024-05-31   
   \   
   \ F*/ ( F: r1 r2 r3 -- r )   
   \ Multiply two IEEE-754 double precision floating point   
   \ numbers r1 and r2 and divide by r3 to obtain the result,   
   \ r. The intermediate product significand, r1*r2, is   
   \ represented with at least IEEE quad precision for both   
   \ its significand and exponent.   
   \   
   \ F*/ uses an intermediate significand represented by   
   \   128 bits   
   \   
   \ Notes:   
   \  1. Does not work with subnormal IEEE 754 double-   
   \     precision numbers. Needs tests for subnormal   
   \     inputs.   
      
   \ include ans-words   
      
   1 cells 8 < [IF]   
      cr .( Requires 64-bit Forth system ) cr   
      ABORT   
   [THEN]   
      
   fvariable temp   
   : >fpstack ( u -- ) ( F: -- r )   
        temp ! temp f@ ;   
      
   base @ hex   
   8000000000000000  constant  DP_SIGNBIT_MASK   
   000FFFFFFFFFFFFF  constant  DP_FRACTION_MASK   
   7FF0000000000000  constant  DP_EXPONENT_MASK   
   \ DP special value constants   
   7FF0000000000001  >fpstack fconstant DP_NAN   
   FFF0000000000000  >fpstack fconstant DP_-INF   
   7FF0000000000000  >fpstack fconstant DP_+INF   
   base !   
      
      52 constant DP_FRACTION_BITS   
   1023 constant DP_EXPONENT_BIAS   
   2047 constant DP_EXPONENT_MAX   
      
   1 DP_FRACTION_BITS LSHIFT constant DP_IMPLIED_BIT   
   63 constant DP_SIGN_BIT   
      
   : dp-fraction ( F: r -- r ) ( -- u )   
       FP@ @ DP_FRACTION_MASK and ;   
   : dp-exponent ( F: r -- r ) ( -- e )   
       FP@ @ DP_EXPONENT_MASK and DP_FRACTION_BITS rshift ;   
   : dp-sign ( F: r -- r ) ( -- sign )   
       FP@ @ DP_SIGN_BIT rshift ;   
      
   : _isZero? ( ufrac uexp -- b )  0= swap 0= and ;   
   : _isInf?  ( ufrac uexp -- b )  DP_EXPONENT_MAX = swap 0= and ;   
   : _isNaN?  ( ufrac uexp -- b )  DP_EXPONENT_MAX = swap 0<> and ;   
   : _isSubNormal? ( ufrac uexp -- b ) 0= swap 0<> and ;   
      
   : frac>significand ( u -- u' ) DP_IMPLIED_BIT or ;   
      
   \ Show the significand, unbiased base-2 exponent, and sign bit   
   \ for a double-precision floating point number. To recover the   
   \ dp floating point number, r, from these fields,   
   \   
   \   r = (-1)^sign * significand * 2^(exponent - 52)   
   \   
   : dp-components. ( F: r -- )   
        dp-sign   
        dp-exponent   
        dp-fraction   
        2dup D0= invert IF  frac>significand THEN   
        swap DP_EXPONENT_BIAS - swap   
        fdrop   
        ." significand = " 18 u.r 2 spaces   
        ." exponent = "     4  .r 2 spaces   
        ." sign = " .   
      
      
   : dp-normal-significand ( u -- expinc u' )   
       dup DP_FRACTION_BITS 1+ rshift   
       dup IF   
         tuck rshift   
       ELSE   
         drop 0 swap \ 0 u   
         BEGIN   
           dup DP_IMPLIED_BIT and 0=   
         WHILE   
           1 lshift   
           swap 1- swap   
         REPEAT   
       THEN   
      
      
     0 value r1_sign   
     0 value r2_sign   
     0 value r3_sign   
     0 value num_sign   
   variable r1_exp   
   variable r2_exp   
   variable r3_exp   
   variable r_exp   
   variable r1_frac   
   variable r2_frac   
   variable r3_frac   
      
   : get-arg-components ( F: r1 r2 r3 -- )   
        dp-fraction r3_frac !   
        dp-exponent r3_exp !   
        dp-sign to r3_sign   
        fdrop   
        dp-fraction r2_frac !   
        dp-exponent r2_exp !   
        dp-sign to r2_sign   
        fdrop   
        dp-fraction r1_frac !   
        dp-exponent r1_exp !   
        dp-sign to r1_sign   
        fdrop ;   
      
   0 value b_num_zero   
   0 value b_den_zero   
   0 value b_num_Inf   
   0 value b_den_Inf   
      
   variable remainder   
      
   : _F*/ ( F: r1 r2 r3 -- ) ( -- usign uexponent udquot true )   
       \ or ( F: -- DP_NAN | +/- DP_INF )  ( -- false )   
        get-arg-components   
      
        \ Check for NaN in args   
        r1_frac @ r1_exp @ _IsNaN?   
        r2_frac @ r2_exp @ _IsNaN? or   
        r3_frac @ r3_exp @ _IsNaN? or IF DP_NAN FALSE EXIT THEN   
      
        r1_frac @ r1_exp @ _IsZero?   
        r2_frac @ r2_exp @ _IsZero? or to b_num_zero   
        r3_frac @ r3_exp @ _IsZero? to b_den_zero   
      
        b_num_zero b_den_zero and IF DP_NAN FALSE EXIT THEN   
      
        r1_sign r2_sign xor to num_sign   
      
        r1_frac @ r1_exp @ _IsInf?   
        r2_frac @ r2_exp @ _IsInf? or to b_num_Inf   
        r3_frac @ r3_exp @ _IsInf? to b_den_Inf   
      
        b_num_Inf b_den_Inf and IF DP_NAN FALSE EXIT THEN   
      
        b_num_zero b_den_Inf or IF  \ return signed zero   
          num_sign r3_sign xor   
          0   
          0 s>d TRUE EXIT   
        THEN   
      
        b_den_zero b_num_Inf or IF  \ return signed Inf   
          DP_+INF   
          num_sign r3_sign xor IF FNEGATE THEN   
          FALSE EXIT   
        THEN   
      
        num_sign r3_sign xor   
        r1_exp @ r2_exp @ + r3_exp @ -   
        r1_frac @ frac>significand   
        r2_frac @ frac>significand um*   
        r3_frac @ frac>significand ud/mod   
        rot remainder !   
        TRUE   
      
      
   variable uhigh   
      
   : F*/ ( F: r1 r2 r3 -- r )   
      
   [continued in next message]   
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)
[ << oldest | < older | list | newer > | newest >> ]