Gecko/20101207 Thunderbird/3.1.7
comp.lang.pascal.misc:164
From: Robert AH Prins
On 2011-01-16 00:37, Robert AH Prins wrote:
> On 2011-01-15 16:54, Wolfgang Ehrhardt wrote:
>> Virtual Pascal and Borland Pascal with Robert Prins' bpl70v20 have an
>> enhanced exp routine based on a code snippet from Norbert Juffa. This
>> is more accurate than Borland's standard Pascal/Delphi exp or older VP
>> routines.
>>
>> The key idea is to calculate exp(x) = 2^(log2(e)*x - 1) + 1 with extra
>> precision for frac(log2(e)*x) and then use the x87 instruction f2xm1
>> to compute 2^frac(log2(e)*x)-1. f2xm1 requires that the abs value of
>> its argument must be<= 1.
>>
>> Unfortunately the Juffa's code miserably fails for some x values if
>> the "round up toward +infinity" is used. Example:
>>
>> program t_expbug;
>> {$N+}
>> const
>> rmUp: word = $1B72;
>> var
>> x,y: extended;
>> begin
>> asm
>> fldcw [rmUp]
>> end;
>> x := 20.0 + 0.7944154167983592825;
>> y := exp(x);
>> writeln(x:18:14, y:27, 1073741824.0:27);
>> x := 2*x;
>> y := exp(x);
>> writeln(x:18:14, y:27, 1152921504606846976.0:27);
>> end.
>>
>> The output is for both Virtual Pascal 2.1.279 and BP7/bpl70v20:
>>
>> 20.79441541679836 -2.32830643653869629E-10 1.07374182400000000E+09
>> 41.58883083359672 -5.00000000000000000E-01 1.15292150460684698E+18
>>
>> A fix is to test frac() against 1 and correct. I guess this is
>> faster than temporarily change the rounding control to "round to
>> nearest".
>>
>> Here is a corrected version (suppressing most comments in order to
>> display properly in newsreaders)
>>
>> const
>> ln2_hi: array[0..7] of byte = ($00,$00,$E0,$FE,$42,$2E,$E6,$3F);
>> ln2_lo: array[0..7] of byte = ($76,$3C,$79,$35,$EF,$39,$EA,$3D);
>>
>> function exp(x: extended): extended;assembler;{&Frame-}{&Uses none}
>> {-Accurate and debugged exp}
>> asm
>> {Based on Norbert Juffa's exp}
>> fld [x]
>> fldl2e
>> fmul st,st(1)
>> frndint
>> fld qword ptr [ln2_hi]
>> fmul st,st(1)
>> fsubp st(2),st
>> fld qword ptr [ln2_lo]
>> fmul st, st(1)
>> fsubp st(2),st
>> fxch st(1)
>> fldl2e
>> fmulp st(1),st
>>
>> {It may happen (especially for rounding modes other than
>> "round to nearest") that |frac(z)|> 1. In this case the
>> result of f2xm1 is undefined. The next lines will test
>> frac(z) and truncate it to +1 or -1 if necessary.}
>>
>> fld st
>> fabs
>> fld1
>> fcompp
>> fstsw ax
>> sahf
>> jae @@1
>> fldz
>> fcompp
>> sahf
>> fld1
>> jae @@1
>> fchs
>> @@1:
>>
>> f2xm1
>> fld1
>> faddp st(1),st
>> fscale
>> fstp st(1)
>> fwait
>> end;
>>
>>
>> Any comments or improvements?
>
> I've forwarded your code to the grandmaster of everything FPU, let's
> wait and see.
His initial reply:
Robert:
Math library functions targetting x87 were typically designed with the
assumption that the dynamic rounding control of the x87 FPU at entry is
set to round-to-nearest-or-even. That is certainly the specification to
which I designed this code, so the results observed are a consequence of
operating my code outside its design specifications.
Does documentation for Borland Pascal (or Virtual Pascal) state anywhere
that common math library routines are guaranteed to work correctly if a
user changes the x87 rounding mode away from the default setting, prior
to invoking such a function? If so, what is the stated definition of the
correct behavior in such a case?
For example, the specification may state that the result returned must
be the same for all rounding modes, which may require each function to
use a wrapper that saves the current rounding mode, sets it to round to
nearest, then restores it at the end. Obviously this would be slow.
I will take a quick look to see if there is a small modification to the
code that gets results closer to what the users expects, but no promises.
I'll get back to you by Sunday night.
-- Norbert
--
Robert AH Prins
spamtrap(a)prino(d)org
--- Internet Rex 2.31
* Origin: The gateway at Omicron Theta (1:261/20.999)
|