From: terje.mathisen@tmsw.no   
      
   Anton Ertl wrote:   
   > MitchAlsup writes:   
   >> Anton, I am surprised you have not heard of Quires !!! where the   
   >> accumulator is the full exponent and fraction width.   
   > ...   
   >> Also, given a Quire, the compiler becomes FREE to reorder arithmetic   
   >> terms in a reduction.   
   >   
   > Am I surprised that you did not read all of what I wrote?   
   > Unfortunately not.   
   >   
   > Anyway, maybe this time you will read it, and I will spell it out more   
   > clearly:   
   >   
   > With two 8-wide SIMD DP FP adders and three cycles of latency, you can   
   > use six strands of SIMD addition (48 strands of scalar addition) to   
   > make full use of the SIMD units.   
   >   
   > How many accumulators or (if you really need to use that term) quires   
   > do you have in your architecture and in your microarchitecture; how   
   > many DP FP additions to one accumulator/quire can be started per   
   > cycle? And how long does the next bunch of additions then have to   
   > wait before being started?   
   >   
   > If the end result cannot compete with SIMD units on the machines where   
   > the programs run, I fear that the accumulator/quire will stay a   
   > feature for a certain niche of users, while another group of users   
   > will ignore it for performance reasons.   
      
   Mitch has already come back with a unit that can do 32 float
   aggregations per cycle, so it is obviously quite fast.
      
   If I had to design such a beast, I would be very tempted to use a
   carry-save format, i.e. each of the ~1080 bits in the accumulator is
   stored as a bit value plus a possible carry from the bit below.
      
   This way each double or float value to be aggregated just needs to be
   aligned (per exponent) before it is added; the per-bit full adder
   would then require about 5 gates and just 2 gate delays?
      
   This should be sufficient to handle at least 6-8 new values per cycle,   
   right?   
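
   A minimal software sketch of that carry-save idea, just to show the
   invariant (sum word + carry word == running total). The names
   to_fixed, csa_add and csa_sum are made up for illustration, and WIDTH
   is an assumption: it is chosen wide enough that every finite double
   plus some guard bits fits, and is not the ~1080 figure above. Real
   hardware would do the alignment with a shifter, not with Fraction.

```python
from fractions import Fraction

FRAC_BITS = 1074              # 2**-1074 is the smallest subnormal double
WIDTH = 2200                  # assumed accumulator width (illustrative only)
MASK = (1 << WIDTH) - 1


def to_fixed(x):
    """Align a finite double onto the fixed-point grid, exactly."""
    scaled = Fraction(x) * (1 << FRAC_BITS)
    assert scaled.denominator == 1      # every finite double is a multiple of 2**-1074
    return scaled.numerator & MASK      # two's complement, modulo 2**WIDTH


def csa_add(s, c, x):
    """One 3:2 compressor step: a full adder per bit, no carry propagation."""
    new_s = s ^ c ^ x                                     # per-bit sum
    new_c = (((s & c) | (s & x) | (c & x)) << 1) & MASK   # per-bit carry, moved up one bit
    return new_s, new_c


def csa_sum(values):
    """Accumulate in carry-save form, propagate carries only once at the end."""
    s = c = 0
    for v in values:
        s, c = csa_add(s, c, to_fixed(v))
    total = (s + c) & MASK              # the single carry-propagate addition
    if total >> (WIDTH - 1):            # top bit set: negative in two's complement
        total -= 1 << WIDTH
    return float(Fraction(total, 1 << FRAC_BITS))   # one rounding, at the very end
```

   Since the accumulated total is exact, the order of the inputs does not
   matter: csa_sum([1e100, 1.0, -1e100]) returns 1.0, where a plain
   left-to-right double sum of the same list returns 0.0.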
      
   Terje   
      
   --   
   "almost all programming can be viewed as an exercise in caching"   
      