From: cr88192@gmail.com   
      
   On 8/24/2025 8:41 AM, Waldek Hebisch wrote:   
   > Terje Mathisen wrote:   
   >>>> This would have simplified all sorts of array/matrix sw where both   
   >>>> errors (NaN) and missing (None) items are possible.   
   >>>   
   >>> In what ways would None behave differently from SNaN?   
   >>   
   >> It would be transparently ignored in reductions, with zero overhead.   
   >   
   > In matrix calculations I simply padded matrices with zeros.   
   >   
      
   Yes, this is fairly standard.   
      
      
   But, yeah, in most normal uses, apart from edge cases, values outside
   the normal range don't actually come up all that much in practice.
      
      
      
   Subnormal numbers exist, but are usually infrequent.   
   NaN and Inf rarely appear outside of error conditions.   
      
   NaN could make sense for things like uninitialized values, except:
    Languages like Java use 0.0 here;
    Languages like C and C++ give you whatever garbage is already there;
    And, if you malloc something, you get some mix of 0s and garbage.
      
   NaN is sometimes used as a value encoding scheme in dynamically typed   
   languages, but this is independent of what the FPU does with NaN.   
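   To illustrate the NaN-boxing idea mentioned above, here is a minimal
   sketch (the helper names and the 48-bit payload split are illustrative,
   not taken from any particular VM): a small integer is hidden in the
   payload bits of a quiet NaN, and the FPU just sees "some NaN".

```python
import struct

QNAN = 0x7FF8_0000_0000_0000  # binary64: exponent all-ones + quiet bit

def box_int(i: int) -> float:
    """Hide a small non-negative integer in a quiet NaN's payload bits."""
    assert 0 <= i < (1 << 48)
    return struct.unpack("<d", struct.pack("<Q", QNAN | i))[0]

def unbox_int(f: float) -> int:
    bits = struct.unpack("<Q", struct.pack("<d", f))[0]
    assert (bits & QNAN) == QNAN, "not a boxed value"
    return bits & ((1 << 48) - 1)

v = box_int(12345)
assert v != v                  # it is a NaN: compares unequal to itself
assert unbox_int(v) == 12345   # payload survives the round trip
```

   The point being: none of this depends on what the FPU does with NaN
   beyond leaving the payload bits alone on loads and stores.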
      
      
      
   Looking around, it seems that some compilers and targets use or specify
   DAZ/FTZ as the default.
    Eg: ICC, and apparently Apple's OSes on ARM, default to DAZ/FTZ;
    GCC enables it when compiling with "-ffast-math";
    ...   
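   As a sketch of what DAZ/FTZ semantics mean (modeled in plain Python
   rather than real FPU control-register state; the helper name is made
   up): any subnormal is replaced with a zero of the same sign.

```python
import math
import sys

MIN_NORMAL = sys.float_info.min  # 2**-1022 for binary64

def flush_to_zero(x: float) -> float:
    """Model FTZ: subnormals become 0.0, preserving the sign."""
    if x != 0.0 and math.isfinite(x) and abs(x) < MIN_NORMAL:
        return math.copysign(0.0, x)
    return x

tiny = 2.0 ** -1074              # smallest positive binary64 subnormal
assert flush_to_zero(tiny) == 0.0
assert flush_to_zero(1.5) == 1.5
assert math.copysign(1.0, flush_to_zero(-tiny)) == -1.0  # sign kept
```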
      
   And, on the other side, apparently lots of (not exactly hobbyist grade)
   CPUs still handle subnormal numbers in firmware or via traps (hence
   why anyone has reason to care). If subnormals were nearly free in terms
   of performance on mainline CPUs, no one would have reason to care;
   people have reason to care because the hidden traps are slow.
      
      
   Looking around, it would appear that at least my handling of FP8 and   
   similar (FP8S, FP8U, and FP8A/A-Law) is similar to DEC floating point.   
      
   Apparently, DEC (PDP-11) used a scheme where:
    All zeroes was understood as 0, everything else was normal range;
    Overflows saturated at the maximum value;
    The bias was 128 rather than 127;
    Double was basically just float with a larger mantissa.
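   A sketch of a decoder for that style of scheme (not DEC's exact bit and
   word layout; real F_floating stores its 16-bit words in a swapped order,
   and the hidden bit gives a mantissa in [0.5, 1.0)):

```python
def decode_dec_style(bits: int) -> float:
    """Decode a 32-bit DEC-style float: excess-128 exponent, hidden-bit
    mantissa in [0.5, 1.0), all-zero bits meaning exactly 0, and no
    Inf/NaN/subnormal encodings."""
    if bits == 0:
        return 0.0                      # only all-zeroes encodes zero
    sign = -1.0 if (bits >> 31) & 1 else 1.0
    exp  = (bits >> 23) & 0xFF
    frac = bits & 0x7FFFFF
    mantissa = 0.5 + frac / (1 << 24)   # hidden bit: 0.1ffff... binary
    return sign * mantissa * 2.0 ** (exp - 128)

assert decode_dec_style(0) == 0.0
assert decode_dec_style(129 << 23) == 1.0   # 0.5 * 2^(129-128)
```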
      
   This differs from, say:   
    Entire 0 exponent range understood as 0;   
    Inf/NaN range still exists, but the behavior may differ.   
      
   Apparently ARM had used a DEC-like approach for Binary16/Half support,
   vs handling it like the other IEEE types.
      
   It is a tradeoff, as having values between 65536.0 and 131008.0 could be
   nice. As-is, the maximum value is 65504.0, as the next value up is Inf.
   I had gone with the more IEEE-like handling of Binary16.
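   The arithmetic behind those numbers: IEEE binary16 reserves its top
   exponent (all ones) for Inf/NaN, so the largest finite value sits at
   2^15; a DEC-like format with no Inf/NaN gets to use 2^16 as well.

```python
# binary16 has 10 fraction bits, so the largest mantissa is 1.1111111111b.
max_mantissa = 2.0 - 2.0 ** -10

ieee_max = max_mantissa * 2.0 ** 15      # top exponent reserved for Inf/NaN
dec_like_max = max_mantissa * 2.0 ** 16  # top exponent usable as normal range

assert ieee_max == 65504.0
assert dec_like_max == 131008.0
```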
      
   ...   
      
      
   Some other FP8 variants still have denormals, but leave off Inf/NaN in
   favor of slightly more dynamic range (eg, NVIDIA).
      
   I had made the slightly non-standard feature of (sometimes) using -0 as   
   a NaN placeholder. Ironically, this is partly because -0 seems to be   
   rare in practice for other reasons. It is possible -0 as NaN could be   
   used more, at least allowing for some level of error detection.   
      
   In this case, it is possible that, for converters:
    Everything other than +/- 0 is normal range;
    0 is a special case, mapping to 0 on widening;
    -0 maps to NaN on widening;
    Both Inf and NaN map to -0/NaN on narrowing;
    Overflow still clamps to 0x7F/0xFF on narrowing.
      
   Or, basically leaving the NaN scenario for "something has gone wrong on   
   the Binary16 side of things".   
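   The widening half of those rules could be sketched like this, using a
   hypothetical DEC-like FP8 layout (1 sign, 4 exponent, 3 fraction bits,
   bias 8; the layout and bias here are assumptions for illustration, not
   the actual FP8S/FP8U encodings):

```python
import math

def fp8_widen(bits: int) -> float:
    """Widen a hypothetical DEC-like FP8 value, per the rules above."""
    if bits == 0x00:
        return 0.0                    # +0 maps to 0 on widening
    if bits == 0x80:
        return math.nan               # -0 doubles as the NaN placeholder
    sign = -1.0 if bits & 0x80 else 1.0
    exp  = (bits >> 3) & 0xF
    frac = bits & 0x7
    return sign * (1.0 + frac / 8.0) * 2.0 ** (exp - 8)

assert fp8_widen(0x00) == 0.0
assert math.isnan(fp8_widen(0x80))    # error propagates out as real NaN
assert fp8_widen(0x40) == 1.0         # exp=8, frac=0 -> 1.0 * 2^0
```

   The narrowing direction would then map Inf/NaN to 0x80 and clamp
   overflow to 0x7F/0xFF, matching the list above.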
      
   --- SoupGate-Win32 v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   