comp.arch
Message 130,461 of 131,241
Chris M. Thomasson to MitchAlsup
Re: Memory ordering (Re: Multi-precision
07 Dec 25 15:09:15
   From: chris.m.thomasson.1@gmail.com   
      
   On 12/6/2025 9:22 AM, MitchAlsup wrote:   
   >   
   > "Chris M. Thomasson"  posted:   
   >   
   >> On 12/5/2025 12:54 PM, MitchAlsup wrote:   
   >>>   
   >>> David Brown  posted:   
   >>>   
   >>>> On 05/12/2025 18:57, MitchAlsup wrote:   
   >>>>>   
   >>>>> anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:   
   >>>>>   
   >>>>>> David Brown  writes:   
   >>>>>>> "volatile" /does/ provide guarantees - it just doesn't provide enough   
   >>>>>>> guarantees for multi-threaded coding on multi-core systems.  Basically,   
   >>>>>>> it only works at the C abstract machine level - it does nothing that   
   >>>>>>> affects the hardware.  So volatile writes are ordered at the C level,   
   >>>>>>> but that says nothing about how they might progress through storage   
   >>>>>>> queues, caches, inter-processor communication buses, or whatever.   
   >>>>>>   
   >>>>>> You describe in many words and not really to the point what can be   
   >>>>>> explained concisely as: "volatile says nothing about memory ordering   
   >>>>>> on hardware with weaker memory ordering than sequential consistency".   
   >>>>>> If hardware guaranteed sequential consistency, volatile would provide   
   >>>>>> guarantees that are as good on multi-core machines as on single-core   
   >>>>>> machines.   
   >>>>>>   
   >>>>>> However, for concurrent manipulations of data structures, one wants   
   >>>>>> atomic operations beyond load and store (even on single-core systems),   
   >>>>>   
   >>>>> Such as ????   
   >>>>   
   >>>> Atomic increment, compare-and-swap, locks, loads and stores of sizes   
   >>>> bigger than the maximum load/store size of the processor.   
   >>>   
   >>> My 66000 ISA can::   
   >>>   
   >>> LDM/STM can LD/ST up to 32   DWs   as a single ATOMIC instruction.   
   >>> MM      can MOV   up to 8192 bytes as a single ATOMIC instruction.   
   >>>   
   >>> Compare Double, Swap Double::   
   >>>   
   >>> BOOLEAN DCAS( type oldp, type oldq,
   >>>                 type *p,   type *q,
   >>>                 type newp, type newq )
   >>> {   
   >>>        type t = esmLOCKload( *p );   
   >>>        type r = esmLOCKload( *q );   
   >>>        if( t == oldp && r == oldq )   
   >>>        {   
   >>>                           *p = newp;   
   >>>             esmLOCKstore( *q,  newq );   
   >>>             return TRUE;   
   >>>        }   
   >>>        return FALSE;   
   >>> }   
   >>>   
   >>> Move Element from one place to another:   
   >>>   
   >>> BOOLEAN MoveElement( Element *fr, Element *to )   
   >>> {   
   >>>        Element *fn = esmLOCKload( fr->next );   
   >>>        Element *fp = esmLOCKload( fr->prev );   
   >>>        Element *tn = esmLOCKload( to->next );   
   >>>        esmLOCKprefetch( fn );   
   >>>        esmLOCKprefetch( fp );   
   >>>        esmLOCKprefetch( tn );   
   >>>        if( !esmINTERFERENCE() )   
   >>>        {   
   >>>                      fp->next = fn;   
   >>>                      fn->prev = fp;   
   >>>                      to->next = fr;   
   >>>                      tn->prev = fr;   
   >>>                      fr->prev = to;   
   >>>        esmLOCKstore( fr->next,  tn );   
   >>>                      return TRUE;   
   >>>        }   
   >>>        return FALSE;   
   >>> }   
   >>>   
   >>> So, I guess, you are not talking about what My 66000 cannot do, but   
   >>> only what other ISAs cannot do.   
   >>   
   >> Any issues with live lock in here?   
   >   
   > A bit hard to tell because of 2 things::   
   > a) I carry around the thread priority, and when interference occurs
   >     the higher-priority thread wins; on a tie, the thread that
   >     accessed first wins.
   > b) live-lock is resolved or not by the caller to these routines, not   
   >     these routines themselves.   
      
   Hmm... IIRC, I was able to cause damage to a strong CAS. It was around
   20 years ago. A thread was running a strong CAS in a tight loop, and I
   counted successes vs. failures. Then I let some other threads alter the
   target word with random data. The failure rate for the CAS increased.
   I think this held for cmpxchg, cmpxchg8b, cmpxchg16b, and the strange
   one on Itanium. I cannot remember its name right now. cmp8xchg16, or
   something like that.
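
   For anyone who wants to try it, here is a rough sketch of that kind of
   experiment in C11 with pthreads. It is not the original test. Every
   name in it (cas_stats, run_experiment, ATTEMPTS, WRITERS) is made up
   for illustration, and it uses atomic_compare_exchange_strong rather
   than hand-written cmpxchg, so treat it as an approximation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical parameters, chosen only for illustration. */
enum { ATTEMPTS = 1000000, WRITERS = 3 };

static _Atomic uint64_t target;   /* the word under contention */
static atomic_bool stop_writers;  /* set once the CAS loop is done */

typedef struct { uint64_t ok, fail; } cas_stats;

/* Small xorshift PRNG so each writer stores "random" data. */
static uint64_t xorshift64(uint64_t *s)
{
    *s ^= *s << 13;
    *s ^= *s >> 7;
    *s ^= *s << 17;
    return *s;
}

/* Interfering thread: keeps overwriting the target word. */
static void *writer(void *arg)
{
    uint64_t seed = (uint64_t)(uintptr_t)arg;  /* caller passes non-zero */
    while (!atomic_load_explicit(&stop_writers, memory_order_relaxed))
        atomic_store_explicit(&target, xorshift64(&seed),
                              memory_order_relaxed);
    return NULL;
}

/* One strong CAS per iteration: load, then try to install value + 1. */
static cas_stats run_cas_loop(void)
{
    cas_stats s = { 0, 0 };
    for (int i = 0; i < ATTEMPTS; i++) {
        uint64_t expected =
            atomic_load_explicit(&target, memory_order_relaxed);
        if (atomic_compare_exchange_strong(&target, &expected, expected + 1))
            s.ok++;
        else
            s.fail++;  /* the word changed between the load and the CAS */
    }
    return s;
}

/* Run the CAS loop while nwriters threads scribble on the target. */
static cas_stats run_experiment(int nwriters)
{
    pthread_t tid[WRITERS];
    assert(nwriters >= 0 && nwriters <= WRITERS);

    atomic_store(&stop_writers, false);
    for (int i = 0; i < nwriters; i++) {
        int rc = pthread_create(&tid[i], NULL, writer,
                                (void *)(uintptr_t)(i + 1));
        assert(rc == 0);
        (void)rc;
    }

    cas_stats s = run_cas_loop();

    atomic_store(&stop_writers, true);
    for (int i = 0; i < nwriters; i++)
        pthread_join(tid[i], NULL);
    return s;
}
```

   Calling run_experiment(0) gives the uncontended baseline, where a
   strong CAS should never fail. Calling run_experiment(WRITERS) lets you
   compare the failure count under interference. How many failures you
   see depends entirely on the machine and the scheduler.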
      
   Well, they would hit a bus lock if they failed too many times. I think   
   Scott knows about it.   
      