From: user5857@newsgrouper.org.invalid   
      
   Robert Finch posted:   
      
   > On 2025-11-26 7:08 p.m., MitchAlsup wrote:   
   > >   
   > > Robert Finch posted:   
   > >   
   > >> On 2025-11-26 3:57 p.m., MitchAlsup wrote:   
   > >>>   
   > >>> Robert Finch posted:   
   > >>>   
   > >>>>> In this case, put the cause in a container the instruction drags down   
   > >>>>> the pipe, and retrieve it when you do have address access to where it   
   > >>>>> needs to go.   
   > >>>>   
   > >>>> I may change things to pass the address around in the float package.   
   > >>>> Putting the address into the NaN later may cause issues with timing. It   
   > >>>> adds a mux into things. May be better to use the original NaN mux in the   
   > >>>> float modules. May call it a NaN identity field instead of an address.   
   > >>>   
   > >>> For example: when a My 66000 instruction needs to raise an exception   
   > >>> the Inst *I argument contains a field I->raised which is set (1<<...)
   > >>> and at the end of the pipe (at retire), t->raised |= I->raised. Where
   > >>> we have a *t there is also t->ip. So, you don't have to drag Thread *t   
   > >>> through all the subroutine calls, but you can easily access t->raised   
   > >>> at the point you do have access to t->ip.   
   > >>>   
   > >> Had trouble reading that, sounds like goobly-goop. But I believe I   
   > >> figured it out.   
   > >>   
   > >> Sounds like the address is inserted at the end of the pipe which I am   
   > >> sure is not the case.   
   > >>   
   > >> I figured this out: the NaN address must be embedded in the result by   
   > >> the time the result updates the bypass network and registers so that it   
   > >> is available to other instructions.   
   > >>   
   > >> The address is available at the start of the calc from the reservation   
   > >> station entry. Me thinks it must be embedded when the NaN result status   
   > >> is set, provided there is not already a NaN. The existing (first) NaN   
   > >> must propagate through.   
   > >   
   > > See last calculation line in the following::   
   > >   
   > > void RunInst( Chip *chip )
   > > {
   > >     for( uint64_t i = 0; i < chip->cores; i++ )
   > >     {
   > >         ContextStack *cpu = &core[i];
   > >         uint8_t cs = cpu->cs;
   > >         Thread *t;
   > >         Inst *I;
   > >         uint16_t raised;
   > >
   > >         if( cpu->interrupt.raised & ((((signed)1)<<63) >> cpu->priority) )
   > >         { // take an interrupt
   > >             cpu->cs = cpu->interrupt.cs;
   > >             cpu->priority = cpu->interrupt.priority;
   > >             t = context[cpu->cs];
   > >             t->reg[0] = cpu->interrupt.message;
   > >         }
   > >         else if( raised = t->raised & t->enabled )
   > >         { // take an exception
   > >             cpu->cs--;
   > >             t = context[cpu->cs];
   > >             t->reg[0] = FT1( raised ) | EXCPT;
   > >             t->reg[1] = I->inst;
   > >             t->reg[2] = I->src1;
   > >             t->reg[3] = I->src2;
   > >             t->reg[4] = I->src3;
   > >         }
   > >         else
   > >         { // run an instruction
   > >             t = context[cpu->cs];
   > >             memory( FETCH, t->ip, &I->inst );
   > >             t->ip += 4;
   > >             majorTable[ I->inst.major ]( t, I );
   > >             t->raised |= I->raised; // propagate raised here
   > >         }
   > >     }
   > > }
   >   
   > That looks like code for a simulator.   
      
   It is (IS) code for a non-timing simulator {a "right answer" simulator   
   if you please.}   
      
   > How closely does it follow the   
   > operation of the CPU?   
      
   CPUs have a pipeline, I is the quantity that gets dragged down the   
   pipe, *t is the control registers of that CPU.   
      
   > I do not see where 'I' is initialized.   
      
   Call to memory(). Then as I gets dragged down the pipeline, more   
   fields are initialized. I drag the whole structure mostly for   
   debug purposes.   
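   To make the "drag it down the pipe" point concrete, here is a minimal
   sketch in C, cut down from the loop quoted above. The struct layouts,
   the EX_DIVZERO cause number, and the execute()/retire() helpers are
   made up for illustration; they are not the real simulator's
   definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-ins for the simulator's structures. */
typedef struct {
    uint32_t inst;    /* instruction bits as fetched */
    uint16_t raised;  /* causes raised while this instruction executed */
} Inst;

typedef struct {
    uint64_t ip;       /* instruction pointer */
    uint16_t raised;   /* causes pending for the whole thread */
    uint16_t enabled;  /* causes the thread is willing to take */
} Thread;

enum { EX_DIVZERO = 1 };  /* hypothetical cause number */

/* Deep in the execute path only the Inst container is in reach,
   so the cause is recorded there and nowhere else. */
void execute(Inst *I, int divisor_is_zero)
{
    assert(I != NULL);
    if (divisor_is_zero)
        I->raised |= (uint16_t)(1u << EX_DIVZERO);
}

/* At retire both the container and the thread are in reach:
   merge the causes, then report which enabled ones are pending. */
uint16_t retire(Thread *t, const Inst *I)
{
    assert(t != NULL && I != NULL);
    t->raised |= I->raised;
    return (uint16_t)(t->raised & t->enabled);
}
```

   The thread state stays untouched until retire(), which is also the
   one place where t->ip is available to go with the cause.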
      
   > It has been a while since I worked on simulator code.   
   >   
   > The IP value is just muxed in in a five to one mux for the significand.   
   > Had to account for NaN's infinities and overflow anyway. Address gets   
   > propagated with some flops, but flops are inexpensive in an FPGA.
   >   
   > always_comb
   >   casez({aNan5,bNan5,qNaNOutab5,aInf5,bInf5,overab5})
   >   6'b1?????: moab6 <=
   >     {1'b1,1'b1,a5[fp64Pkg::FMSB-1:0],{fp64Pkg::FMSB+1{1'b0}}};
   >   6'b01????: moab6 <=
   >     {1'b1,1'b1,b5[fp64Pkg::FMSB-1:0],{fp64Pkg::FMSB+1{1'b0}}};
   >   // multiply inf * zero
   >   6'b001???: moab6 <=
   >     {1'b1,qNaN|(64'd4 << (fp64Pkg::FMSB-4))|adr5[63:16],{fp64Pkg::FMSB+1{1'b0}}};
   >   6'b0001??: moab6 <= 0; // mul inf's
   >   6'b00001?: moab6 <= 0; // mul inf's
   >   6'b000001: moab6 <= 0; // mul overflow
   >   default: moab6 <= fractab5;
   >   endcase
   >   
   >   
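   For comparison, the same priority selection can be sketched in C on
   raw binary64 bit patterns. This is only one reading of the casez
   above: the helper names are hypothetical, and the payload layout
   (ip[63:16] in the low 48 fraction bits, no cause code) is an
   assumption made to keep the sketch short.

```c
#include <assert.h>
#include <stdint.h>

#define F64_EXP_MASK   0x7FF0000000000000ull  /* all-ones exponent */
#define F64_FRAC_MASK  0x000FFFFFFFFFFFFFull  /* 52 fraction bits  */
#define F64_QUIET_BIT  0x0008000000000000ull  /* fraction MSB      */

/* True when the bit pattern encodes a NaN (quiet or signaling). */
int is_nan_bits(uint64_t x)
{
    return (x & F64_EXP_MASK) == F64_EXP_MASK && (x & F64_FRAC_MASK) != 0;
}

/* Build a fresh quiet NaN whose payload carries ip[63:16].
   Assumed layout: 48 address bits in fraction bits 47..0. */
uint64_t make_nan_with_addr(uint64_t ip)
{
    uint64_t r = F64_EXP_MASK | F64_QUIET_BIT | (ip >> 16);
    assert(is_nan_bits(r));
    return r;
}

/* Priority select: an incoming NaN in a wins, then one in b, and only
   then is a new NaN (with the current ip) generated for an invalid op. */
uint64_t nan_select(uint64_t a, uint64_t b, int invalid,
                    uint64_t ip, uint64_t normal_result)
{
    if (is_nan_bits(a)) return a | F64_QUIET_BIT;  /* first NaN propagates */
    if (is_nan_bits(b)) return b | F64_QUIET_BIT;
    if (invalid)        return make_nan_with_addr(ip);
    return normal_result;
}
```

   Because an existing NaN is passed through unchanged (apart from
   being quieted), the address of the first offending instruction
   survives any later operations that consume it.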
   > >>   
   > >>>> Modified NaN support in the float package to store to the HOBs.   
   > >>>>   
   > >>>> Survey says:   
   > >>>>   
   > >>>> The Qulps PUSH and POP instructions have room for six register fields.   
   > >>>> Should one of the fields be used to identify the stack pointer register   
   > >>>> allowing five registers to be pushed or popped? Or should the stack   
   > >>>> pointer register be assumed so that six registers may be pushed or
   > >>>> popped?
   > >>>   
   > >>> My 66000 ENTER and EXIT instruction use SP == R31 implicitly. But,   
   > >>> instead of giving it a number of registers, there is a start register   
   > >>> and a stop register, so 1-to-32 registers can be saved/restored. The
   > >>> immediate contains how much stack space to allocate/deallocate.   
   > >>>   
   > >>> {{when Safe-Stack is enabled:: Rstart-to-R0 are placed on the
   > >>> inaccessible stack, while R1-to-Rstop are placed on the normal stack.}}
   > >>>   
   > >>> Because the stack is always DoubleWord aligned, the 3-LoBs of the   
   > >>> immediate are used to indicate "special" activities on a couple of   
   > >>> registers {R0, R31, R30}. R31 is rarely saved and reloaded from Stack
   > >>> but just returned to its previous value by integer arithmetic. FP can   
   > >>> be updated or it can be treated like "just another register". R0 can   
   > >>> be loaded directly to t->ip, or loaded into R0 for stack walk-backs.   
   > >>>   
   > >>> The corresponding LDM and STM are seldom used.   
   > >>>   
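   A minimal sketch in C of how such an immediate and register range
   could be decoded. Everything named here (EnterImm, decode_enter_imm,
   reg_count) is hypothetical, and the wrap from R31 around to R0 is an
   assumption drawn from the Safe-Stack note above, not a statement of
   the actual encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical decoded form of an ENTER/EXIT immediate. */
typedef struct {
    uint32_t stack_bytes;  /* space to allocate/deallocate, multiple of 8 */
    uint8_t  flags;        /* 3 low-order bits: "special" handling */
} EnterImm;

/* The stack is DoubleWord aligned, so the low 3 bits of the immediate
   carry no size information and can be reused as flags. */
EnterImm decode_enter_imm(uint32_t imm)
{
    EnterImm d;
    d.stack_bytes = imm & ~(uint32_t)7;
    d.flags       = (uint8_t)(imm & 7u);
    return d;
}

/* Number of registers in the range start..stop inclusive, assuming the
   range wraps from R31 to R0. A start/stop pair always names at least
   one register and at most all 32. */
unsigned reg_count(unsigned start, unsigned stop)
{
    assert(start < 32 && stop < 32);
    return ((stop - start) & 31u) + 1u;
}
```

   With a start/stop pair there is no way to encode "zero registers",
   which is how two 5-bit fields cover the full 1-to-32 range.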
   > >> I ran out of micro-ops for ENTER and EXIT, so they only save the LR and   
   > >> FP (on the safe stack). A separate PUSH/POP on safe stack instruction is   
   > >> used.   
   > >>   
   > >> I figured LDM and STM are not used often enough. PUSH / POP is used in   
   > >> many places LDM / STM might be.   
   > >   
   > > It's a fine line.
   > >   
   > > I found more uses for an instruction that moves a number of registers   
   > > randomly allocated to fixed positions (arguments to a call) than to   
   > > move random string of registers to/from memory.   
   > >   
   > > .   
   > > MOV R1,R10   
   > > MOV R2,R25   
   > > MOV R3,R17   
   > > CALL Subroutine   
   > > . ; deal with any result   
   > >   
   >   
   > My 66000 has an instruction to do that?   
      
   No, but the thought that it could be profitable to have such an   
      
   [continued in next message]   
      