From: user5857@newsgrouper.org.invalid   
      
   "Chris M. Thomasson" posted:   
      
   > On 12/13/2025 11:12 AM, MitchAlsup wrote:   
   > >   
   > > anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:   
   > >   
   > >> MitchAlsup writes:   
   > >>> What my solution entails is a modification   
   > >>> to the cache coherence model (NaK) that indicates "Yes I have the line
   > >>> you referenced, but, no you can't have it right now" in order to
   > >>> strengthen the guarantees of forward progress.
   > >>   
   > >> How does it strengthen the guarantees of forward progress?   
   > >   
   > > The allowance of a NaK is only available under somewhat special   
   > > circumstances::   
   > > a) in Careful mode:: when core can see that all STs have write permission   
   > > and data is present, NaKs allow the Modification part to run to   
   > > completion.   
   > > b) In Slow and Methodical mode:: core can NaK any access to any of its   
   > > cache lines--preventing interference.   
   > >   
   > >> My guess:   
   > >> If the requester itself is in an atomic sequence B, it will cancel it.   
   > >   
   > > Yes, the "other guy" takes the hit not the guy who has made more forward   
   > > progress. If B was an innocent accessor of the data, it retires its   
   > > request--this generally takes 100-odd cycles, allowing A to complete   
   > > the event by the time the innocent request shows up again.   
   > >   
   > >> This could help if the atomic sequence A that caused the NaK then   
   > >> tries to get a cache line that would be kept by B.   
   > >>   
   > >> There is still a chance of both sequences canceling each other by   
   > >> sending NaKs at the same time, but it is smaller and with something   
   > >> like exponential backoff eventual forward progress could be achieved.   
   > >   
   > > Instead of some contrived back-off policy--at the failure point one can   
   > > read the WHY register. 0 indicates success; negative indicates spurious,   
   > > positive indicates how far down the line of requestors YOU happen to be.   
   > > So, if you are going after a unit of work, you march down the queue WHY   
   > > units and then YOU are guaranteed that YOU are the only one after that   
   > > unit of work.   
   >   
   > Step one. Make sure that a failure means another thread made progress.   
   > strong CAS does this. Don't let it spuriously fail where nothing makes   
   > progress... ;^o   
      
   Absolutely!
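
   A minimal sketch of that rule with C11 atomics, for illustration only.
   The helper name add_strong and the counter are made up here; the only
   library call is atomic_compare_exchange_strong, which may fail only when
   the stored value differs from the expected one, so every counted failure
   means some other store got in first.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical helper: add delta to *counter with a strong CAS loop.
   Returns how many attempts failed; with the strong form each failure
   means the value changed between our read and our exchange. */
static unsigned add_strong(atomic_long *counter, long delta)
{
    assert(counter != NULL);
    unsigned failures = 0;
    long seen = atomic_load(counter);
    /* On failure the current value is written back into seen,
       so the next attempt starts from fresh data. */
    while (!atomic_compare_exchange_strong(counter, &seen, seen + delta))
        ++failures;
    return failures;
}
```

   With a single thread the loop succeeds on the first attempt and returns 0;
   swapping in atomic_compare_exchange_weak would allow failures that say
   nothing about anyone else's progress.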
      
   WHY is only valid in "Slow and Methodical" mode, which has strong guarantees
   of forward progress--at least 1 thread is making forward progress in S&M.
      
   Spurious has to do with things like "system arbiter buffer overflow" and   
   is not related to exceptions or interrupts.   
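
   The WHY encoding described above (0 = success, negative = spurious,
   positive = position in the line of requestors) can be sketched as a small
   software model. This is not the hardware interface: why_decode, why_step
   and the enum names are hypothetical, WHY is passed in as a plain integer,
   and retrying the same slot after a spurious failure is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Software model only; all names here are made up for illustration. */
typedef enum { WHY_SUCCESS, WHY_SPURIOUS, WHY_CONTENDED } why_kind;

typedef struct {
    why_kind kind;
    size_t   next_index;   /* queue slot to attempt next */
} why_step;

/* Decide what to do after an attempt on queue slot current_index. */
static why_step why_decode(long why, size_t current_index)
{
    why_step s = { WHY_SUCCESS, current_index };
    if (why < 0) {
        /* Spurious: assumed here to mean "retry the same slot". */
        s.kind = WHY_SPURIOUS;
    } else if (why > 0) {
        /* why requestors are ahead of us: march down the queue by
           that many units to reach a slot nobody else is after. */
        assert((size_t)why <= SIZE_MAX - current_index);
        s.kind = WHY_CONTENDED;
        s.next_index = current_index + (size_t)why;
    }
    return s;
}
```

   For example, a requestor that fails on slot 10 with WHY == 3 would move
   on to slot 13 instead of backing off and retrying slot 10.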
      
   > Oh my we got a load on the reservation granule, abort all LL/SC in   
   > progress wrt that granule. Of course this assumes that the user that   
   > created the program for it gets things right.   
      
   This is why I created NaK in the cache coherence protocol--to strengthen   
   the guarantee of forward progress.   
      
   > For a LL/SC on the PPC it   
   > definitely helps where things are aligned and padded up to a reservation   
   > granule, not just a l2 cache line. Helps mitigate false sharing causing   
   > livelock.   
   >   
   > Even in weak CAS, akin to LL/SC. Well, how sensitive is that reservation   
   > granule. Can a simple load cause a failure?   
      
   An innocent LD gets NaKed, causing the innocent thread to waste time while
   allowing the ATOMIC event to make forward progress.
      
   In my case the reservation granule is a cache line {which is the same size
   across the memory hierarchy--but still allows for an implementation-defined
   size}.
      
   For example:: HBM can deliver 1024-bits (soon 2048-bits) in a single beat,   
   so, for main_memory == HBM it makes sense to align the size of the LLcache   
   to the width of HBM. Once in LLC, you can parcel it out any way your system   
   prescribes.   
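
   On the software side, padding shared data out to the granule might look
   like the sketch below. GRANULE_BYTES is an assumption (1024 bits = 128
   bytes, taken from the HBM example); since the size is implementation
   defined, a real program would have to obtain it from the platform. The
   names granule_slot and same_granule are made up for illustration.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define GRANULE_BYTES 128   /* assumption: 1024-bit line = 128 bytes */

/* Each slot occupies a whole granule, so an atomic sequence on one
   slot is not disturbed by accesses to its neighbour. */
typedef struct {
    alignas(GRANULE_BYTES) long value;
} granule_slot;

static_assert(sizeof(granule_slot) == GRANULE_BYTES,
              "one slot per granule");
static_assert(alignof(granule_slot) == GRANULE_BYTES,
              "slot starts on a granule boundary");

/* Nonzero if the two addresses fall inside the same granule. */
static int same_granule(const void *a, const void *b)
{
    return ((uintptr_t)a / GRANULE_BYTES) == ((uintptr_t)b / GRANULE_BYTES);
}
```

   Adjacent elements of an array of granule_slot then never share a granule,
   which is the property the padding is meant to buy.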
      