comp.arch
Chris M. Thomasson to MitchAlsup
Re: Memory ordering (Re: Multi-precision
07 Dec 25 16:36:59
   From: chris.m.thomasson.1@gmail.com   
      
   On 12/6/2025 10:07 AM, MitchAlsup wrote:   
   >   
   > scott@slp53.sl.home (Scott Lurndal) posted:   
   >   
   >> MitchAlsup  writes:   
   >>>   
   >>> David Brown  posted:   
   >>>   
   >>>> On 05/12/2025 18:57, MitchAlsup wrote:   
   >>>>>   
   >>>>> anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:   
   >>>>>   
   >>>>>> David Brown  writes:   
   >>>>>>> "volatile" /does/ provide guarantees - it just doesn't provide enough   
   >>>>>>> guarantees for multi-threaded coding on multi-core systems.  Basically,   
   >>>>>>> it only works at the C abstract machine level - it does nothing that   
   >>>>>>> affects the hardware.  So volatile writes are ordered at the C level,   
   >>>>>>> but that says nothing about how they might progress through storage   
   >>>>>>> queues, caches, inter-processor communication buses, or whatever.   
   >>>>>>   
   >>>>>> You describe in many words and not really to the point what can be   
   >>>>>> explained concisely as: "volatile says nothing about memory ordering   
   >>>>>> on hardware with weaker memory ordering than sequential consistency".   
   >>>>>> If hardware guaranteed sequential consistency, volatile would provide   
   >>>>>> guarantees that are as good on multi-core machines as on single-core   
   >>>>>> machines.   
   >>>>>>   
   >>>>>> However, for concurrent manipulations of data structures, one wants   
   >>>>>> atomic operations beyond load and store (even on single-core systems),   
   >>>>>   
   >>>>> Such as ????   
   >>>>   
   >>>> Atomic increment, compare-and-swap, locks, loads and stores of sizes   
   >>>> bigger than the maximum load/store size of the processor.   
   >>>   
   >>> My 66000 ISA can::   
   >>>   
   >>> LDM/STM can LD/ST up to 32   DWs   as a single ATOMIC instruction.   
   >>> MM      can MOV   up to 8192 bytes as a single ATOMIC instruction.   
   >>>   
   >>> Compare Double, Swap Double::   
   >>>   
   >>> BOOLEAN DCAS( type oldp, type oldq,
   >>>               type *p,   type *q,
   >>>               type newp, type newq )
   >>> {   
   >>>      type t = esmLOCKload( *p );   
   >>>      type r = esmLOCKload( *q );   
   >>>      if( t == oldp && r == oldq )   
   >>>      {   
   >>>                         *p = newp;   
   >>>           esmLOCKstore( *q,  newq );   
   >>>           return TRUE;   
   >>>      }   
   >>>      return FALSE;   
   >>> }   
   >>>   
   >>> Move Element from one place to another:   
   >>>   
   >>> BOOLEAN MoveElement( Element *fr, Element *to )   
   >>> {   
   >>>      Element *fn = esmLOCKload( fr->next );   
   >>>      Element *fp = esmLOCKload( fr->prev );   
   >>>      Element *tn = esmLOCKload( to->next );   
   >>>      esmLOCKprefetch( fn );   
   >>>      esmLOCKprefetch( fp );   
   >>>      esmLOCKprefetch( tn );   
   >>>      if( !esmINTERFERENCE() )   
   >>>      {   
   >>>                    fp->next = fn;   
   >>>                    fn->prev = fp;   
   >>>                    to->next = fr;   
   >>>                    tn->prev = fr;   
   >>>                    fr->prev = to;   
   >>>      esmLOCKstore( fr->next,  tn );   
   >>>                    return TRUE;   
   >>>      }   
   >>>      return FALSE;   
   >>> }   
   >>>   
   >>> So, I guess, you are not talking about what My 66000 cannot do, but   
   >>> only what other ISAs cannot do.   
   >>   
   >> In my 40 years of SMP OS/HV work, I don't recall a   
   >> situation where 'MoveElement' would be useful or   
   >> required as a hardware atomic operation.
   >   
   > The question is not whether "MoveElement" would be useful, but
   > whether it would be useful for a single ATOMIC event to
   > manipulate {5,6,7,8} pointers at once ??
   >   
   >> Individual atomic "Remove Element" and "Insert/Append Element"[*], yes.   
   >> Combined?   Too inflexible.   
   >   
   > BOOLEAN InsertElement( Element *el, Element *to )   
   > {   
   >       Element *tn = esmLOCKload( to->next );
   >       esmLOCKprefetch( el );   
   >       esmLOCKprefetch( tn );   
   >       if( !esmINTERFERENCE() )   
   >       {   
   >                     el->next = tn;   
   >                     el->prev = to;   
   >                     to->next = el;   
   >       esmLOCKstore( tn->prev,  el );   
   >                     return TRUE;   
   >       }   
   >       return FALSE;   
   > }   
   >   
   > BOOLEAN RemoveElement( Element *fr )   
   > {   
   >       Element *fn = esmLOCKload( fr->next );
   >       Element *fp = esmLOCKload( fr->prev );
   >       esmLOCKprefetch( fn );   
   >       esmLOCKprefetch( fp );   
   >       if( !esmINTERFERENCE() )   
   >       {   
   >                     fp->next = fn;   
   >                     fn->prev = fp;   
   >                     fr->prev = NULL;   
   >       esmLOCKstore( fr->next,  NULL );   
   >                     return TRUE;   
   >       }   
   >       return FALSE;   
   > }   
   >   
   >>   
   >> [*] For which atomic compare-and-swap or atomic swap is generally
   >> sufficient.
   >>   
   >> Atomic add/sub are useful.  The other atomic math operations (min, max, etc)   
   >> may be useful in certain cases as well.   
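
   For reference, a minimal sketch of those two cases in portable C11,
   assuming <stdatomic.h> is available. The names counter_add and
   fetch_max_long are made up for illustration; C11 has no fetch-max,
   so the max is built from a compare-and-swap loop. Relaxed ordering
   is used only to keep the sketch short; real code may need stronger
   ordering.

```c
#include <stdatomic.h>

/* Atomic add: one read-modify-write; returns the value before the add. */
static long counter_add(atomic_long *ctr, long delta)
{
    return atomic_fetch_add_explicit(ctr, delta, memory_order_relaxed);
}

/* Atomic max built from compare-and-swap (hypothetical helper).
   Returns the value observed before any update. */
static long fetch_max_long(atomic_long *obj, long val)
{
    long cur = atomic_load_explicit(obj, memory_order_relaxed);

    /* A failed CAS writes the current value back into cur, so the
       loop re-checks cur < val before trying again. */
    while (cur < val &&
           !atomic_compare_exchange_weak_explicit(obj, &cur, val,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed)) {
        /* retry */
    }
    return cur;
}
```

   On hardware with a native atomic max the loop would presumably
   collapse to one instruction; the CAS loop is just the fallback.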
      
   Have you ever read about KCSS?   
      
   https://groups.google.com/g/comp.arch/c/shshLdF1uqs   
      
   https://patents.google.com/patent/US7293143   
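
   Roughly, as I recall it, KCSS (k-compare single-swap) compares k
   locations against expected values and, only if all of them match,
   swaps a new value into the first one. Below is a reference model of
   those semantics only. It is NOT the nonblocking algorithm from the
   links above: it uses a global spin flag, so it blocks, and it is
   atomic only with respect to other calls of the same function. The
   names kcss_model and kcss_guard are made up for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Global guard: makes the model atomic w.r.t. other kcss_model calls only. */
static atomic_flag kcss_guard = ATOMIC_FLAG_INIT;

/* Reference model (assumption, not the published algorithm):
   if *addr[i] == expected[i] for all i < k, store new_val into *addr[0]
   and return true; otherwise change nothing and return false. */
static bool kcss_model(uintptr_t *addr[], const uintptr_t expected[],
                       size_t k, uintptr_t new_val)
{
    bool ok = true;

    while (atomic_flag_test_and_set_explicit(&kcss_guard,
                                             memory_order_acquire)) {
        /* spin until the guard is free */
    }

    for (size_t i = 0; i < k; i++) {
        if (*addr[i] != expected[i]) {
            ok = false;          /* one mismatch fails the whole operation */
            break;
        }
    }
    if (ok && k > 0)
        *addr[0] = new_val;      /* single swap: only the first location */

    atomic_flag_clear_explicit(&kcss_guard, memory_order_release);
    return ok;
}
```

   With k == 1 this degenerates to an ordinary compare-and-swap; the
   extra k-1 locations are compared but never written.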
      