From: user5857@newsgrouper.org.invalid   
      
   EricP posted:   
      
   > John Savard wrote:   
   > > On Sun, 21 Dec 2025 20:32:44 +0000, MitchAlsup wrote:   
   > >>> On Thu, 18 Dec 2025 21:29:00 +0000, MitchAlsup wrote:   
   > >   
   > >>>> Or in other words, if you can decode K-instructions per cycle, you'd   
   > >>>> better be able to execute K-instructions per cycle--or you have a   
   > >>>> serious blockage in your pipeline.   
   > >   
   > >> Not a typo--the part of the pipeline which is narrowest is   
   > >> the part that limits performance. I suggest strongly that you should not   
   > >> make/allow the decoder to play that part.   
   > >   
   > > I agree - and strongly, too - that the decoder ought not to be the part   
   > > that limits performance.   
   > >   
   > > But what I quoted says that the execution unit ought not to be the part   
   > > that limits performance, with the implication that it's OK if the decoder   
   > > does instead. That's why I said it must be a typo.   
   > >   
   > > So I think you need to look a second time at what you wrote; it's natural   
   > > for people to see what they expect to see, and so I think you looked at   
   > > it, and didn't see the typo that was there.   
   > >   
   > > John Savard   
   >   
   > There are two kinds of stalls:   
   > stalls in the serial front end I-cache, Fetch or Decode stages because   
   > of *too little work* (starvation due to input latency),   
   > and stalls in the back end Execute or Writeback stages because   
   > of *too much work* (resource exhaustion).   
      
   DECODE latency increases when:
   a) there are no instructions to decode
   b) there is no address from which to fetch
   c) there is no translation of the fetch address
      
   a) is a cache miss   
   b) is an indirect control transfer   
   c) is a TLB miss   
      
   And there may be additional cases of instruction buffer hiccups.   
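
   A toy starvation model of those three cases, as a sketch (the event
   rates and penalty cycles below are made-up assumptions for
   illustration, not measurements of any real core):

   ```python
   # Toy front-end starvation model. Each event charges its full
   # penalty as front-end bubble cycles, with no overlap assumed.
   DECODE_WIDTH = 4          # instructions decoded per cycle when fed

   # Hypothetical per-instruction event rates and penalties (cycles):
   events = {
       "icache_miss":   (0.02, 12),   # (a) no instructions to decode
       "indirect_jump": (0.01,  8),   # (b) no address to fetch from
       "tlb_miss":      (0.001, 30),  # (c) no translation of the address
   }

   def effective_decode_ipc(width, events):
       """Average decoded instructions per cycle once bubbles are
       charged against the decoder's peak bandwidth."""
       stall_per_insn = sum(rate * pen for rate, pen in events.values())
       # Base cost is 1/width cycles per instruction; stalls add on top.
       cycles_per_insn = 1.0 / width + stall_per_insn
       return 1.0 / cycles_per_insn

   print(round(effective_decode_ipc(DECODE_WIDTH, events), 2))  # 1.67
   ```

   With those (assumed) rates, a 4-wide decoder delivers well under half
   its peak, which is the sense in which front-end latency, not decode
   width, becomes the limiter.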
      
   > The front end stalls inject bubbles into the pipeline,   
   > whereas back end stalls can allow younger bubbles to be compressed out.   
      
   How In-Order your thinking is. GBOoO machines do not inject bubbles.
      
   > If I have to stall, I want it in the back end.   
      
   If I have to stall, I want it based on "realized" latency.
      
   > It has to do with catching up after a stall.   
      
   Which is why you do not inject bubbles...   
      
   > If a core stalls for 3 clocks, then in order to average 1 IPC   
   > it must retire 2 instructions per clock for the next 3 clocks.   
   > And it can only do that if it has a backlog of work ready to execute.   
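
   EricP's catch-up arithmetic generalizes in an obvious way; a minimal
   sketch (the function name and the idea of a fixed recovery window are
   mine, not from the post):

   ```python
   # After stalling for `stall_cycles`, how fast must the core retire
   # during `recovery_cycles` to still average `target_ipc` over the
   # whole stall-plus-recovery window?
   def required_retire_rate(stall_cycles, recovery_cycles, target_ipc=1.0):
       total_cycles = stall_cycles + recovery_cycles
       needed_insns = target_ipc * total_cycles   # instructions owed
       return needed_insns / recovery_cycles      # all retired while recovering

   print(required_retire_rate(3, 3))  # 3-cycle stall, 3 cycles to catch up: 2.0
   ```

   Which reproduces the quoted numbers: 6 instructions owed over 6
   cycles, all retired in 3, so 2 per clock, and only possible if a
   backlog of ready work exists.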
      