comp.arch
Message 130,440 of 131,241
Chris M. Thomasson to David Brown
Re: Memory ordering (Re: Multi-precision
05 Dec 25 15:03:53
   From: chris.m.thomasson.1@gmail.com   
      
   On 12/5/2025 11:10 AM, David Brown wrote:   
   > On 05/12/2025 18:57, MitchAlsup wrote:   
   >>   
   >> anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:   
   >>   
   >>> David Brown  writes:   
   >>>> "volatile" /does/ provide guarantees - it just doesn't provide enough   
   >>>> guarantees for multi-threaded coding on multi-core systems.  Basically,   
   >>>> it only works at the C abstract machine level - it does nothing that   
   >>>> affects the hardware.  So volatile writes are ordered at the C level,   
   >>>> but that says nothing about how they might progress through storage   
   >>>> queues, caches, inter-processor communication buses, or whatever.   
   >>>   
   >>> You describe in many words and not really to the point what can be   
   >>> explained concisely as: "volatile says nothing about memory ordering   
   >>> on hardware with weaker memory ordering than sequential consistency".   
   >>> If hardware guaranteed sequential consistency, volatile would provide   
   >>> guarantees that are as good on multi-core machines as on single-core   
   >>> machines.   
   >>>   
   >>> However, for concurrent manipulations of data structures, one wants   
   >>> atomic operations beyond load and store (even on single-core systems),   
   >>   
   >> Such as ????   
   >   
   > Atomic increment, compare-and-swap, locks, loads and stores of sizes   
   > bigger than the maximum load/store size of the processor.   
      
   Double-word compare-and-swap (DWCAS), where the two words are   
   contiguous, is a strange case. I have seen compilers report that it is   
   not lock-free even on x86. A 32-bit system has cmpxchg8b and a 64-bit   
   system has cmpxchg16b, yet the compiler still reports not lock-free.   
   Strange.   
      
   using cmpxchg instead of xadd:   
   https://forum.pellesc.de/index.php?topic=7167.0   
      
   trying to tell me that a DWCAS is not lock free:   
   https://forum.pellesc.de/index.php?topic=7311.msg27764#msg27764   
      
   This should be lock-free on an x86, even x64:   
      
   struct ct_proxy_dwcas   
   {   
        struct ct_proxy_node* node;   
        intptr_t count;   
   };   
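
A minimal sketch of the layout a DWCAS needs from the struct above: two contiguous machine words with no padding. The `ct_proxy_anchor` wrapper and its over-alignment are assumptions added for illustration, not part of the original code; they reflect that cmpxchg16b requires a 16-byte aligned operand, which the struct's natural alignment does not guarantee on x64.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

struct ct_proxy_node;   /* opaque here; defined elsewhere in the original code */

struct ct_proxy_dwcas
{
    struct ct_proxy_node* node;
    intptr_t count;
};

/* Two contiguous words, no padding: 8 bytes on 32-bit x86, 16 bytes on x64. */
_Static_assert(sizeof(struct ct_proxy_dwcas) == 2 * sizeof(void*),
               "DWCAS target must be exactly two machine words");
_Static_assert(offsetof(struct ct_proxy_dwcas, count) == sizeof(void*),
               "count must directly follow node");

/* Hypothetical wrapper (not in the original): aligns the pair to its own
   size so that a cmpxchg8b/cmpxchg16b operand never straddles a boundary. */
struct ct_proxy_anchor
{
    alignas(2 * sizeof(void*)) struct ct_proxy_dwcas value;
};
```

Whether a given compiler then reports such an object as lock-free is a separate matter; the checks above only confirm that the hardware instruction can be applied to it.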
      
   some of my older code:   
      
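   /* returns 0 on success, 1 on failure */   
   AC_SYS_APIEXPORT   
   int AC_CDECL   
   np_ac_i686_atomic_dwcas_fence   
   ( void*,          /* destination (64-bit target) */   
      void*,         /* comparand; receives the observed value on failure */   
      const void* ); /* exchange value */   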
      
      
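   np_ac_i686_atomic_dwcas_fence PROC   
      push esi   
      push ebx   
      mov esi, [esp + 16]      ; arg 2: pointer to comparand   
      mov eax, [esi]           ; edx:eax = expected value   
      mov edx, [esi + 4]   
      mov esi, [esp + 20]      ; arg 3: pointer to exchange value   
      mov ebx, [esi]           ; ecx:ebx = new value   
      mov ecx, [esi + 4]   
      mov esi, [esp + 12]      ; arg 1: pointer to destination   
      lock cmpxchg8b qword ptr [esi]   
      jne np_ac_i686_atomic_dwcas_fence_fail   
      xor eax, eax             ; success: return 0   
      pop ebx   
      pop esi   
      ret   
      
   np_ac_i686_atomic_dwcas_fence_fail:   
      mov esi, [esp + 16]      ; failure: reload pointer to comparand   
      mov [esi + 0], eax       ; store the observed value back into it   
      mov [esi + 4], edx   
      mov eax, 1               ; return 1   
      pop ebx   
      pop esi   
      ret   
   np_ac_i686_atomic_dwcas_fence ENDP   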
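
For comparison, a rough C11 counterpart of the routine above, as a sketch only. The name `dwcas_fence_c11` is hypothetical, and the 64-bit pair is assumed to be packed into a single `uint64_t`. It keeps the same convention: 0 on success, 1 on failure with the comparand refreshed to the value actually observed. On 32-bit x86 a compiler would typically be expected to lower this to lock cmpxchg8b, though that depends on the compiler and target flags.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical C11 equivalent of np_ac_i686_atomic_dwcas_fence.
   atomic_compare_exchange_strong is sequentially consistent by default
   and writes the observed value into *cmp when the comparison fails. */
static int dwcas_fence_c11(_Atomic uint64_t *dest,
                           uint64_t *cmp,
                           const uint64_t *xchg)
{
    return atomic_compare_exchange_strong(dest, cmp, *xchg) ? 0 : 1;
}
```

A caller would normally loop: read the current value into the comparand, compute the new value, and retry while the function returns 1.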
      
      
   > Even with a   
   > single core system you can have pre-emptive multi-threading, or at least   
   > interrupt routines that may need to cooperate with other tasks on data.   
   >   
   >>   
   >>> and I don't think that C with just volatile gives you such guarantees.   
   >>>   
   >>> - anton   
   >   
      