From: chris.m.thomasson.1@gmail.com   
      
   On 12/6/2025 10:07 AM, MitchAlsup wrote:   
   >   
   > scott@slp53.sl.home (Scott Lurndal) posted:   
   >   
   >> MitchAlsup writes:   
   >>>   
   >>> David Brown posted:   
   >>>   
   >>>> On 05/12/2025 18:57, MitchAlsup wrote:   
   >>>>>   
   >>>>> anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:   
   >>>>>   
   >>>>>> David Brown writes:   
   >>>>>>> "volatile" /does/ provide guarantees - it just doesn't provide enough   
   >>>>>>> guarantees for multi-threaded coding on multi-core systems. Basically,   
   >>>>>>> it only works at the C abstract machine level - it does nothing that   
   >>>>>>> affects the hardware. So volatile writes are ordered at the C level,   
   >>>>>>> but that says nothing about how they might progress through storage   
   >>>>>>> queues, caches, inter-processor communication buses, or whatever.   
   >>>>>>   
   >>>>>> You describe in many words and not really to the point what can be   
   >>>>>> explained concisely as: "volatile says nothing about memory ordering   
   >>>>>> on hardware with weaker memory ordering than sequential consistency".   
   >>>>>> If hardware guaranteed sequential consistency, volatile would provide   
   >>>>>> guarantees that are as good on multi-core machines as on single-core   
   >>>>>> machines.   
   >>>>>>   
   >>>>>> However, for concurrent manipulations of data structures, one wants   
   >>>>>> atomic operations beyond load and store (even on single-core systems),   
   >>>>>   
   >>>>> Such as ????   
   >>>>   
   >>>> Atomic increment, compare-and-swap, locks, loads and stores of sizes   
   >>>> bigger than the maximum load/store size of the processor.   
   >>>   
   >>> My 66000 ISA can::   
   >>>   
   >>> LDM/STM can LD/ST up to 32 DWs as a single ATOMIC instruction.   
   >>> MM can MOV up to 8192 bytes as a single ATOMIC instruction.   
   >>>   
   >>> Compare Double, Swap Double::   
   >>>   
   >>> BOOLEAN DCAS( type oldp, type oldq,
   >>>               type *p, type *q,
   >>>               type newp, type newq )
   >>> {
   >>>     type t = esmLOCKload( *p );
   >>>     type r = esmLOCKload( *q );
   >>>     if( t == oldp && r == oldq )
   >>>     {
   >>>         *p = newp;
   >>>         esmLOCKstore( *q, newq );
   >>>         return TRUE;
   >>>     }
   >>>     return FALSE;
   >>> }
   >>>   
   >>> Move Element from one place to another:   
   >>>   
   >>> BOOLEAN MoveElement( Element *fr, Element *to )
   >>> {
   >>>     Element *fn = esmLOCKload( fr->next );
   >>>     Element *fp = esmLOCKload( fr->prev );
   >>>     Element *tn = esmLOCKload( to->next );
   >>>     esmLOCKprefetch( fn );
   >>>     esmLOCKprefetch( fp );
   >>>     esmLOCKprefetch( tn );
   >>>     if( !esmINTERFERENCE() )
   >>>     {
   >>>         fp->next = fn;
   >>>         fn->prev = fp;
   >>>         to->next = fr;
   >>>         tn->prev = fr;
   >>>         fr->prev = to;
   >>>         esmLOCKstore( fr->next, tn );
   >>>         return TRUE;
   >>>     }
   >>>     return FALSE;
   >>> }
   >>>   
   >>> So, I guess, you are not talking about what My 66000 cannot do, but   
   >>> only what other ISAs cannot do.   
   >>   
   >> In my 40 years of SMP OS/HV work, I don't recall a
   >> situation where 'MoveElement' would be useful or
   >> required as a hardware atomic operation.
   >   
   > The question is not would "MoveElement" be useful, but   
   > would it be useful to have a single ATOMIC event be   
   > able to manipulate {5,6,7,8} pointers in one event ??   
   >   
   >> Individual atomic "Remove Element" and "Insert/Append Element"[*], yes.   
   >> Combined? Too inflexible.   
   >   
   > BOOLEAN InsertElement( Element *el, Element *to )
   > {
   >     Element *tn = esmLOCKload( to->next );
   >     esmLOCKprefetch( el );
   >     esmLOCKprefetch( tn );
   >     if( !esmINTERFERENCE() )
   >     {
   >         el->next = tn;
   >         el->prev = to;
   >         to->next = el;
   >         esmLOCKstore( tn->prev, el );
   >         return TRUE;
   >     }
   >     return FALSE;
   > }
   >   
   > BOOLEAN RemoveElement( Element *fr )
   > {
   >     Element *fn = esmLOCKload( fr->next );
   >     Element *fp = esmLOCKload( fr->prev );
   >     esmLOCKprefetch( fn );
   >     esmLOCKprefetch( fp );
   >     if( !esmINTERFERENCE() )
   >     {
   >         fp->next = fn;
   >         fn->prev = fp;
   >         fr->prev = NULL;
   >         esmLOCKstore( fr->next, NULL );
   >         return TRUE;
   >     }
   >     return FALSE;
   > }
   >   
   >>   
   >> [*] For which atomic compare-and-swap or atomic swap is generally
   >> sufficient.
   >>   
   >> Atomic add/sub are useful. The other atomic math operations (min, max, etc)   
   >> may be useful in certain cases as well.   
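
Where there is no native atomic min/max, it is usually emulated with a compare-and-swap loop. A minimal C11 sketch of that idea follows. The name atomic_fetch_max_i64 is made up for illustration (it is not a C11 library function), and the memory orders are just one reasonable choice:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: atomically set *obj to max(*obj, val).
   Returns the value observed before any update. */
static int64_t atomic_fetch_max_i64(_Atomic int64_t *obj, int64_t val)
{
    int64_t old = atomic_load_explicit(obj, memory_order_relaxed);

    /* Only attempt a store while the current value is smaller than val.
       On CAS failure, 'old' is refreshed with the current value, so the
       loop re-checks the condition against up-to-date data. */
    while (old < val &&
           !atomic_compare_exchange_weak_explicit(obj, &old, val,
                                                  memory_order_acq_rel,
                                                  memory_order_relaxed))
    {
        /* retry */
    }
    return old;
}
```

The loop exits without writing when another thread has already stored a value >= val, so contended calls often cost a load and nothing else.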
      
   Have you ever read about KCSS?   
      
   https://groups.google.com/g/comp.arch/c/shshLdF1uqs   
      
   https://patents.google.com/patent/US7293143   
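
For anyone who has not run into it: KCSS is k-compare single-swap. It compares k locations against expected values and, only if all of them match, writes a new value to one of them. The sketch below models just those semantics in C11, with a global spinlock standing in for the atomicity. It is NOT the non-blocking algorithm from the links above. The name kcss_model and the choice of addr[0] as the written location are assumptions for illustration:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A single global spinlock provides the atomicity here. A real KCSS
   achieves the same effect without blocking; this only models what
   the operation means. */
static atomic_flag kcss_lock = ATOMIC_FLAG_INIT;

/* Hypothetical reference model: if *addr[i] == expected[i] for every
   i in [0, k), store new_val into *addr[0] and return true.
   Otherwise change nothing and return false. */
static bool kcss_model(size_t k, uintptr_t *addr[],
                       const uintptr_t expected[], uintptr_t new_val)
{
    bool ok = true;

    while (atomic_flag_test_and_set_explicit(&kcss_lock,
                                             memory_order_acquire))
    {
        /* spin until the lock is free */
    }

    for (size_t i = 0; i < k; i++) {
        if (*addr[i] != expected[i]) {
            ok = false;
            break;
        }
    }
    if (ok && k > 0)
        *addr[0] = new_val;     /* single swap: only one location is written */

    atomic_flag_clear_explicit(&kcss_lock, memory_order_release);
    return ok;
}
```

All accesses to the k locations have to go through kcss_model for this model to hold. Compared with the multi-pointer esm sequences quoted above, the point is that k locations are checked but only one is ever written.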
      