Forums before death by AOL, social media and spammers... "We can't have nice things"
|    comp.lang.forth    |    Forth programmers eat a lot of Bratwurst    |    117,927 messages    |
[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]
|    Message 116,517 of 117,927    |
|    Krishna Myneni to Krishna Myneni    |
|    Re: F*/ (f-star-slash) (1/2)    |
|    31 May 24 22:57:18    |
      From: krishna.myneni@ccreweb.org              On 5/29/24 07:27, Krishna Myneni wrote:       > On 5/28/24 06:11, Krishna Myneni wrote:       > ...       ...       > Now the benchmark code runs. It shows two things (see below):       >       > 1. The execution time is poor compared to Anton's implementation, about       > 8x slower.       >       > 2. Half of the arithmetic results are wrong, with the max relative error       > being as high as 2/3.       >       > So, I clearly have some work to do to fix the arithmetic. It was       > fortuitous that it passed my test cases, but clearly the test cases in       > my original implementation of F*/ were insufficient.       >       > --       > Krishna       >       >       > bench-f*/d       > 603 ms       > ok       > check-results       >       > 1000000 samples       >       > # tolerance errors: 501374       > # sign errors: 0       > Max relative error: 0.666667       > ok       >              Ok. I found and fixed the errors in my Forth implementation of F*/ . It       runs the benchmark test and now there are no arithmetic errors to within       the specified tolerance of 1E-17.              My Forth implementation is much slower than Anton's, which uses the FPU       floating point operations. However, my Forth implementation of F*/ makes       no use of floating point arithmetic, and will likely be considerably       faster in assembly code than in Forth due to the low level bit operations.              There is also a restriction currently on passing subnormal numbers to my       version of F*/ -- as it stands, it will give incorrect results for       subnormal numbers and won't detect this condition.              I also found an error in the benchmark code (bench-fstarslash.4th), in       the word,              RAND-EXP ( -- e ) \ Return signed binary exponent in normal range              It was allowing return of biased binary exponent of -1023 which is not       in the normal range. Anton's implementation works with subnormal       numbers, so it passed the results verification check.              The updated definition of the random exponent generator for       bench-fstarslash.4th is              \ Return signed binary exponent in normal range       : rand-exp ( -- e )        random2 RAND_EXP_MASK and        1 max DP_EXPONENT_MAX 1- min \ ensure e within normal range        DP_EXPONENT_BIAS - ;                     My current implementation, which passes the results verification test in       bench-fstarslash.4th (with the above revised def of RAND-EXP), is listed       below, along with a sample run.              Cheers,       Krishna              --              === begin fstarslash.4th ===       \ fstarslash.4th       \       \ Implementation of F*/ on 64-bit Forth       \       \ Krishna Myneni, 2024-05-19       \ Revised: 2024-05-31       \       \ F*/ ( F: r1 r2 r3 -- r )       \ Multiply two IEEE-754 double precision floating point       \ numbers r1 and r2 and divide by r3 to obtain the result,       \ r. The intermediate product significand, r1*r2, is       \ represented with at least IEEE quad precision for both       \ its significand and exponent.       \       \ F*/ uses an intermediate significand represented by       \ 128 bits       \       \ Notes:       \ 1. Does not work with subnormal IEEE 754 double-       \ precision numbers. Needs tests for subnormal       \ inputs.              \ include ans-words              1 cells 8 < [IF]        cr .( Requires 64-bit Forth system ) cr        ABORT       [THEN]              fvariable temp       : >fpstack ( u -- ) ( F: -- r )        temp ! temp f@ ;              base @ hex       8000000000000000 constant DP_SIGNBIT_MASK       000FFFFFFFFFFFFF constant DP_FRACTION_MASK       7FF0000000000000 constant DP_EXPONENT_MASK       \ DP special value constants       7FF0000000000001 >fpstack fconstant DP_NAN       FFF0000000000000 >fpstack fconstant DP_-INF       7FF0000000000000 >fpstack fconstant DP_+INF       base !               52 constant DP_FRACTION_BITS       1023 constant DP_EXPONENT_BIAS       2047 constant DP_EXPONENT_MAX              1 DP_FRACTION_BITS LSHIFT constant DP_IMPLIED_BIT       63 constant DP_SIGN_BIT              : dp-fraction ( F: r -- r ) ( -- u )        FP@ @ DP_FRACTION_MASK and ;       : dp-exponent ( F: r -- r ) ( -- e )        FP@ @ DP_EXPONENT_MASK and DP_FRACTION_BITS rshift ;       : dp-sign ( F: r -- r ) ( -- sign )        FP@ @ DP_SIGN_BIT rshift ;              : _isZero? ( ufrac uexp -- b ) 0= swap 0= and ;       : _isInf? ( ufrac uexp -- b ) DP_EXPONENT_MAX = swap 0= and ;       : _isNaN? ( ufrac uexp -- b ) DP_EXPONENT_MAX = swap 0<> and ;       : _isSubNormal? ( ufrac uexp -- b ) 0= swap 0<> and ;              : frac>significand ( u -- u' ) DP_IMPLIED_BIT or ;              \ Show the significand, unbiased base-2 exponent, and sign bit       \ for a double-precision floating point number. To recover the       \ dp floating point number, r, from these fields,       \       \ r = (-1)^sign * significand * 2^(exponent - 52)       \       : dp-components. ( F: r -- )        dp-sign        dp-exponent        dp-fraction        2dup D0= invert IF frac>significand THEN        swap DP_EXPONENT_BIAS - swap        fdrop        ." significand = " 18 u.r 2 spaces        ." exponent = " 4 .r 2 spaces        ." sign = " .                     : dp-normal-significand ( u -- expinc u' )        dup DP_FRACTION_BITS 1+ rshift        dup IF        tuck rshift        ELSE        drop 0 swap \ 0 u        BEGIN        dup DP_IMPLIED_BIT and 0=        WHILE        1 lshift        swap 1- swap        REPEAT        THEN                      0 value r1_sign        0 value r2_sign        0 value r3_sign        0 value num_sign       variable r1_exp       variable r2_exp       variable r3_exp       variable r_exp       variable r1_frac       variable r2_frac       variable r3_frac              : get-arg-components ( F: r1 r2 r3 -- )        dp-fraction r3_frac !        dp-exponent r3_exp !        dp-sign to r3_sign        fdrop        dp-fraction r2_frac !        dp-exponent r2_exp !        dp-sign to r2_sign        fdrop        dp-fraction r1_frac !        dp-exponent r1_exp !        dp-sign to r1_sign        fdrop ;              0 value b_num_zero       0 value b_den_zero       0 value b_num_Inf       0 value b_den_Inf              variable remainder              : _F*/ ( F: r1 r2 r3 -- ) ( -- usign uexponent udquot true )        \ or ( F: -- DP_NAN | +/- DP_INF ) ( -- false )        get-arg-components               \ Check for NaN in args        r1_frac @ r1_exp @ _IsNaN?        r2_frac @ r2_exp @ _IsNaN? or        r3_frac @ r3_exp @ _IsNaN? or IF DP_NAN FALSE EXIT THEN               r1_frac @ r1_exp @ _IsZero?        r2_frac @ r2_exp @ _IsZero? or to b_num_zero        r3_frac @ r3_exp @ _IsZero? to b_den_zero               b_num_zero b_den_zero and IF DP_NAN FALSE EXIT THEN               r1_sign r2_sign xor to num_sign               r1_frac @ r1_exp @ _IsInf?        r2_frac @ r2_exp @ _IsInf? or to b_num_Inf        r3_frac @ r3_exp @ _IsInf? to b_den_Inf               b_num_Inf b_den_Inf and IF DP_NAN FALSE EXIT THEN               b_num_zero b_den_Inf or IF \ return signed zero        num_sign r3_sign xor        0        0 s>d TRUE EXIT        THEN               b_den_zero b_num_Inf or IF \ return signed Inf        DP_+INF        num_sign r3_sign xor IF FNEGATE THEN        FALSE EXIT        THEN               num_sign r3_sign xor        r1_exp @ r2_exp @ + r3_exp @ -        r1_frac @ frac>significand        r2_frac @ frac>significand um*        r3_frac @ frac>significand ud/mod        rot remainder !        TRUE                     variable uhigh              : F*/ ( F: r1 r2 r3 -- r )              [continued in next message]              --- SoupGate-Win32 v1.05        * Origin: you cannot sedate... all the things you hate (1:229/2)    |
[   << oldest   |   < older   |   list   |   newer >   |   newest >>   ]
(c) 1994, bbs@darkrealms.ca