From: chris.m.thomasson.1@gmail.com   
      
   On 12/5/2025 11:10 AM, David Brown wrote:   
   > On 05/12/2025 18:57, MitchAlsup wrote:   
   >>   
   >> anton@mips.complang.tuwien.ac.at (Anton Ertl) posted:   
   >>   
   >>> David Brown writes:   
   >>>> "volatile" /does/ provide guarantees - it just doesn't provide enough   
   >>>> guarantees for multi-threaded coding on multi-core systems. Basically,   
   >>>> it only works at the C abstract machine level - it does nothing that   
   >>>> affects the hardware. So volatile writes are ordered at the C level,   
   >>>> but that says nothing about how they might progress through storage   
   >>>> queues, caches, inter-processor communication buses, or whatever.   
   >>>   
   >>> You describe in many words and not really to the point what can be   
   >>> explained concisely as: "volatile says nothing about memory ordering   
   >>> on hardware with weaker memory ordering than sequential consistency".   
   >>> If hardware guaranteed sequential consistency, volatile would provide   
   >>> guarantees that are as good on multi-core machines as on single-core   
   >>> machines.   
   >>>   
   >>> However, for concurrent manipulations of data structures, one wants   
   >>> atomic operations beyond load and store (even on single-core systems),   
   >>   
   >> Such as ????   
   >   
   > Atomic increment, compare-and-swap, locks, loads and stores of sizes   
   > bigger than the maximum load/store size of the processor.   
      
   It's strange: for a double-word compare-and-swap (DWCAS), where the two
   words are contiguous, I have seen compilers report that it's not lock-free
   even on x86. A 32-bit system has cmpxchg8b and a 64-bit system has
   cmpxchg16b, yet the compiler still reports it as not lock-free.
      
   using cmpxchg instead of xadd:   
   https://forum.pellesc.de/index.php?topic=7167.0   
      
   trying to tell me that a DWCAS is not lock-free:
   https://forum.pellesc.de/index.php?topic=7311.msg27764#msg27764   
      
   This should be lock-free on an x86, even x64:   
      
   struct ct_proxy_dwcas   
   {   
    struct ct_proxy_node* node;   
    intptr_t count;   
   };   
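
   A minimal sketch, assuming a C11 compiler with <stdatomic.h>, of how the
   "two contiguous words" property can be checked. The helper names
   ct_proxy_dwcas_is_double_word and ct_pointer_always_lock_free are
   hypothetical, made up for this illustration. Whether atomic_is_lock_free
   then reports true for the whole struct is up to the implementation, which
   is exactly the complaint here.

```c
#include <assert.h>     /* static_assert */
#include <stdatomic.h>  /* ATOMIC_POINTER_LOCK_FREE */
#include <stddef.h>     /* offsetof */
#include <stdint.h>     /* intptr_t */

struct ct_proxy_node;   /* opaque here; only a pointer to it is stored */

struct ct_proxy_dwcas
{
    struct ct_proxy_node* node;
    intptr_t count;
};

/* Assumption: intptr_t is pointer-sized, as on mainstream x86 and x64 ABIs. */
static_assert(sizeof(intptr_t) == sizeof(void*),
              "intptr_t is expected to be pointer-sized");

/* Hypothetical helper: nonzero when the struct is exactly two contiguous
   pointer-sized words with no padding, i.e. the shape that
   cmpxchg8b (32-bit) or cmpxchg16b (64-bit) operates on. */
static int ct_proxy_dwcas_is_double_word(void)
{
    return sizeof(struct ct_proxy_dwcas) == 2 * sizeof(void*)
        && offsetof(struct ct_proxy_dwcas, count) == sizeof(void*);
}

/* Hypothetical helper: nonzero when the implementation advertises
   single-word pointer atomics as always lock-free (the macro value 2
   means "always"). This says nothing about the double-word case. */
static int ct_pointer_always_lock_free(void)
{
    return ATOMIC_POINTER_LOCK_FREE == 2;
}
```

   The layout check only shows that the hardware instruction could apply. It
   does not force the compiler to use it or to report the type as lock-free.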
      
   some of my older code (32-bit x86, MASM syntax):
      
   AC_SYS_APIEXPORT   
   int AC_CDECL   
   np_ac_i686_atomic_dwcas_fence   
   ( void*,   
    void*,   
    const void* );   
      
      
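   np_ac_i686_atomic_dwcas_fence PROC   
    push esi   
    push ebx   
    mov esi, [esp + 16]             ; esi = comparand pointer (2nd argument)   
    mov eax, [esi]                  ; edx:eax = expected value   
    mov edx, [esi + 4]   
    mov esi, [esp + 20]             ; esi = exchange pointer (3rd argument)   
    mov ebx, [esi]                  ; ecx:ebx = new value   
    mov ecx, [esi + 4]   
    mov esi, [esp + 12]             ; esi = destination pointer (1st argument)   
    lock cmpxchg8b qword ptr [esi]  ; if [esi] == edx:eax then [esi] = ecx:ebx   
    jne np_ac_i686_atomic_dwcas_fence_fail   
    xor eax, eax                    ; success: return 0   
    pop ebx   
    pop esi   
    ret   
      
   np_ac_i686_atomic_dwcas_fence_fail:   
    mov esi, [esp + 16]             ; failure: write observed value to comparand   
    mov [esi + 0], eax   
    mov [esi + 4], edx   
    mov eax, 1                      ; return 1   
    pop ebx   
    pop esi   
    ret   
   np_ac_i686_atomic_dwcas_fence ENDP   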
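
   For comparison, a rough C11 sketch of the same contract as the routine
   above: return 0 on success, return 1 on failure and write the observed
   value back into the comparand. The names dw_pair, dw_pack, dw_unpack and
   dw_cas are hypothetical, and the pair is assumed to be two 32-bit words
   packed into one 64-bit atomic, which is the 8 bytes cmpxchg8b works on.
   This is not the original library's API, and which instruction the compiler
   emits for it depends on the target.

```c
#include <assert.h>     /* static_assert */
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical 32-bit analogue of ct_proxy_dwcas: two contiguous words. */
struct dw_pair
{
    uint32_t lo;   /* e.g. a 32-bit pointer or index */
    uint32_t hi;   /* e.g. a reference or version count */
};

static_assert(sizeof(struct dw_pair) == sizeof(uint64_t),
              "pair must be exactly 8 bytes");

/* Pack the two words into one 64-bit value (lo in the low half). */
static uint64_t dw_pack(struct dw_pair p)
{
    return ((uint64_t)p.hi << 32) | p.lo;
}

/* Inverse of dw_pack. */
static struct dw_pair dw_unpack(uint64_t v)
{
    struct dw_pair p = { (uint32_t)v, (uint32_t)(v >> 32) };
    return p;
}

/* Returns 0 if *dest equalled *cmp and was replaced by xchg.
   Returns 1 otherwise, after storing the observed value into *cmp.
   Uses the default sequentially consistent ordering. */
static int dw_cas(_Atomic uint64_t* dest,
                  struct dw_pair* cmp,
                  struct dw_pair xchg)
{
    uint64_t expected = dw_pack(*cmp);
    if (atomic_compare_exchange_strong(dest, &expected, dw_pack(xchg)))
        return 0;
    *cmp = dw_unpack(expected);   /* report what was actually there */
    return 1;
}
```

   A caller would typically loop: load the pair, build the new value from it,
   and retry dw_cas until it returns 0, reusing the refreshed comparand on
   each failure.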
      
      
   > Even with a   
   > single core system you can have pre-emptive multi-threading, or at least   
   > interrupt routines that may need to cooperate with other tasks on data.   
   >   
   >>   
   >>> and I don't think that C with just volatile gives you such guarantees.   
   >>>   
   >>> - anton   
   >   
      