From: robfi680@gmail.com   
      
   On 2025-10-29 2:33 p.m., MitchAlsup wrote:   
   >   
   > Robert Finch posted:   
   >   
   >> Started working on yet another CPU – Qupls4. Fixed 40-bit instructions,   
   >> 64 GPRs. GPRs may be used in pairs for 128-bit ops. Registers are named   
   >> as if there were 32 GPRs, A0 (arg 0 register is r1) and A0H (arg 0 high   
   >> is r33). Same for other registers. GPRs may contain either integer or   
   >> floating-point values.   
   >>   
   >> Going with a bit result vector in any GPR for compares, then a branch on   
   >> bit-set/clear for conditional branches. Might also include branch true /   
   >> false.   
   >   
   > I have both the bit-vector compare and branch, but also a compare to zero   
   > and branch as a single instruction. I suggest you should too, if for no   
   > other reason than:   
   >   
   > if( p && p->next )   
   >   
      
   Yes, I was going to have at least branch on register 0 (false) / 1 (true),   
   as there is encoding room to support it. It does add more cases in the   
   branch eval, but is probably well worth it.   
   >> Using operand routing for immediate constants and an operation size for   
   >> the instruction. Constants and operation size may be specified   
   >> independently. With 40-bit instruction words, constants may be 10, 50, 90,   
   >> or 130 bits.   
   >   
   > My 66000 allows for occasional use of 128-bit values but is designed mainly   
   > for 64-bit and smaller.   
   >   
      
   Following the same philosophy. Expecting only some use for 128-bit   
   floats. Integers are limited to 8, 16, 32, or 64 bits.   
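
The constant sizes quoted above fit the pattern 10 + 40n. A minimal sketch of that arithmetic, assuming (the post only lists the sizes) that a 10-bit field in the base instruction is extended by whole 40-bit words; `INLINE_BITS`, `WORD_BITS`, and `constant_bits` are hypothetical names used only for illustration:

```python
# Assumption: 10 bits of constant live in the base instruction and each
# extension contributes one full 40-bit instruction word.
INLINE_BITS = 10
WORD_BITS = 40


def constant_bits(extra_words: int) -> int:
    """Total constant width when `extra_words` 40-bit words are appended."""
    return INLINE_BITS + WORD_BITS * extra_words


# Zero to three extension words give the four sizes mentioned in the post.
sizes = [constant_bits(n) for n in range(4)]
print(sizes)
```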
      
   > With 32-bit instructions, I provide, {5, 16, 32, and 64}-bit constants.   
   >   
   > Just last week we discovered a case where HW can do a better job than SW.   
   > Previously, the compiler would emit:   
   >   
   > CVTfd Rt,Rf   
   > FMUL Rt,Rt,#1.425D0   
   > CVTdf Rd,Rt   
   >   
   > Which is subject to double rounding once at the FMUL and again at the   
   > down conversion. I thought about the problem and it seems fairly easy   
   > to gate the 24-bit fraction into the multiplier tree along with the   
   > 53-bit fraction of the constant, and then normalize and round the   
   > result dropping out of the tree--avoiding the double rounding case.   
   >   
   > Now, the compiler emits:   
   >   
   > FMULf Rd,Rf,#1.425D0   
   >   
   > saving 2 instructions along with the higher precision.   
      
   Improves the accuracy? of algorithms, but seems a bit specific to me.   
   Are there other instruction sequences where double-rounding would be good   
   to avoid? Seems like HW could detect the sequence and fuse the instructions.   
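
To make the double-rounding effect concrete, here is a small sketch using exact rational arithmetic. It is not the FMUL case itself: the value `x` is hand-picked to sit just above a single-precision midpoint, and `round_to_precision` is a hypothetical helper that rounds to nearest-even at a given significand width while ignoring exponent range.

```python
from fractions import Fraction


def round_to_precision(x: Fraction, bits: int) -> Fraction:
    """Round positive x to `bits` significant bits, ties to even.

    Exponent range is ignored; this only models significand rounding.
    """
    if x <= 0:
        raise ValueError("sketch handles positive values only")
    # Find e with 2**e <= x < 2**(e + 1).
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** e > x:
        e -= 1
    ulp = Fraction(2) ** (e - bits + 1)
    q = x / ulp
    n = q.numerator // q.denominator  # floor of x / ulp
    rem = q - n
    if rem > Fraction(1, 2) or (rem == Fraction(1, 2) and n % 2 == 1):
        n += 1
    return n * ulp


# Hand-picked value: slightly above the midpoint between 1 and 1 + 2**-23.
x = 1 + Fraction(1, 2**24) + Fraction(1, 2**60)

# One rounding straight to a 24-bit significand (single precision).
once = round_to_precision(x, 24)

# Rounding to a 53-bit significand (double) first loses the 2**-60 term,
# leaving an exact tie that then rounds to even, i.e. down to 1.
twice = round_to_precision(round_to_precision(x, 53), 24)

print(once == 1 + Fraction(1, 2**23), twice == 1)
```

The two results differ by one single-precision ulp. Any sequence that computes in a wider format and then narrows can hit the same kind of tie.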
      