From: anton@mips.complang.tuwien.ac.at   
      
   MitchAlsup writes:   
   >Single rounding for 2^n FPs   
      
   Given that usual FP arithmetic does not satisfy the associative law,
   one cannot transform easily-written reductions like   
      
   double r=0.0;
   for (...)   
    r += a[i];   
      
   into any more efficient form (e.g., one with several computation   
   strands or one that uses a complete tree evaluation). Compilers use   
   the allowance given by -ffast-math to reassociate the computations and   
   to vectorize such code. Of course, if you implement FP operations in   
   general without inexact results, your FP operations are associative,   
   but I fail to see how that can be achieved in general.   
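
To make the reassociation concrete, here is a minimal sketch in C of the loop above next to a version with four computation strands. The function names sum_seq and sum_strands4 are made up for illustration, and the four-strand split is only one of many orders a compiler might pick under -ffast-math; this is not what any particular compiler emits.

```c
#include <stddef.h>

/* Sequential reduction: a single dependency chain, additions happen
   strictly in array order. */
double sum_seq(const double *a, size_t n)
{
    double r = 0.0;
    for (size_t i = 0; i < n; i++)
        r += a[i];
    return r;
}

/* Reassociated reduction (hypothetical example): four independent
   strands that are combined pairwise at the end.  This changes the
   order of the additions and therefore possibly the rounded result. */
double sum_strands4(const double *a, size_t n)
{
    double r0 = 0.0, r1 = 0.0, r2 = 0.0, r3 = 0.0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        r0 += a[i];
        r1 += a[i + 1];
        r2 += a[i + 2];
        r3 += a[i + 3];
    }
    for (; i < n; i++)          /* leftover elements go to strand 0 */
        r0 += a[i];
    return (r0 + r1) + (r2 + r3);
}
```

Assuming IEEE 754 binary64 with round-to-nearest-even and no extended intermediate precision, the input {1e16, 1.0, -1e16, 1.0} gives 1.0 from sum_seq and 0.0 from sum_strands4, while the exact sum is 2.0. That is why the transformation is not allowed without the programmer's permission.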
      
   In the above, the programmer could use "accumulator" instead of   
   "double" for the type of r, indicating to the compiler that all the   
   additions are exact, and only when r is assigned to a double does the   
   rounding happen. The compiler would compile the + to an   
   add-to-accumulator instruction, and the VVM hardware, upon seeing such   
   instructions, would know that they follow the associative law and can   
   be reassociated.   
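
As a toy illustration of why exact additions can be reordered freely, here is a sketch in C that stands in for the accumulator with a 64-bit fixed-point integer. Everything in it is an assumption for the sake of the example: the type name accumulator, the helpers acc_add and acc_round, and the choice of 4 fractional bits. It is only exact for inputs that are multiples of 1/16 and for sums well below 2^59 in magnitude; a real accumulator would have to cover the full double exponent range, and this says nothing about how the My 66000 hardware does it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy accumulator: fixed point with 4 fractional bits.
   Assumes every input is a multiple of 1/16 and that the running sum
   stays far away from the int64_t limits. */
#define ACC_SCALE 16.0

typedef struct { int64_t fixed; } accumulator;

/* Exact under the stated assumptions: scaling by a power of two is
   exact, and the integer addition involves no rounding. */
static void acc_add(accumulator *acc, double x)
{
    acc->fixed += (int64_t)(x * ACC_SCALE);
}

/* The only rounding step: the integer-to-double conversion. */
static double acc_round(const accumulator *acc)
{
    return (double)acc->fixed / ACC_SCALE;
}

double sum_exact(const double *a, size_t n)
{
    accumulator acc = { 0 };
    for (size_t i = 0; i < n; i++)
        acc_add(&acc, a[i]);
    return acc_round(&acc);
}
```

With this sketch, {1e16, 1.0, -1e16, 1.0} sums to 2.0 in any order, because integer addition is associative as long as it does not overflow. The same would hold if the elements were spread over several partial accumulators that are added together before the final rounding.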
      
   However, my impression has been that you only intend to have one   
   physical accumulator, which raises the question of how several
   array elements per cycle could be added to this accumulator. Also, if   
   several strands of recurrences are appropriate, how is that done?   
      
   Also, what happens if the loop performs two reductions that would need   
   the accumulator?   
      
   Finally, most code will not be my66000-specific and will use the type   
   "double" in code as above; ok, this code will be compiled with   
   -ffast-math if the programmer intends it to be vectorized, but how is
   that compiler flag encoded in the machine code such that VVM knows
   that it can reassociate these additions?   
      
   - anton   
   --   
   'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'   
    Mitch Alsup,    
      