comp.arch

BGB to Chris M. Thomasson

Re: Variable-length instructions

30 Dec 25 23:21:20

   From: cr88192@gmail.com   
      
   On 12/30/2025 4:58 PM, Chris M. Thomasson wrote:   
   > On 12/30/2025 11:10 AM, BGB wrote:   
   > [...]   
   >   
   >> But, then again, weak model is cheaper to implement and generally   
   >> faster, although explicit synchronization is annoying and such a model   
   >> is incompatible with "lock free data structures" (which tend to   
   >> implicitly assume that memory accesses occur in the same order as   
   >> written and that any memory stores are immediately visible across   
   >> threads).   
   >   
   > Fwiw, a weak memory model is totally compatible with lock-free data   
   > structures. A weak model tends to have the necessary memory barriers to   
   > make them work. Have you ever used a SPARC in RMO mode? Acquire membar   
   > ala std::memory_order_acquire is basically a MEMBAR #LoadStore |   
   > #LoadLoad. A release is MEMBAR #LoadStore | #StoreStore. Those can be   
   > used for the implementation of a mutex. Notice how acquire and release   
   > never need #StoreLoad ordering?   
   >   
   > The point is that once we have this flexibility, a lock/wait free algo   
   > can use the right membars for the job. Ideally, the weakest membars they   
   > can use to ensure they are correct in their logic.   
   >   
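As a rough illustration of the acquire/release pairing in the quoted text, here is a minimal spinlock sketch in C11. The names (demo_spinlock, demo_lock, and so on) are made up for the example, and it assumes a compiler that provides <stdatomic.h>. On a SPARC in RMO mode one would expect the two orderings to come out as roughly the MEMBAR combinations mentioned in the quote; neither side asks for #StoreLoad ordering.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal spinlock; names are illustrative only. */
typedef struct { atomic_flag held; } demo_spinlock;

#define DEMO_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* One attempt at taking the lock. Acquire ordering keeps the
   critical section's loads and stores from moving above this point. */
static bool demo_trylock(demo_spinlock *l)
{
    assert(l != NULL);
    return !atomic_flag_test_and_set_explicit(&l->held,
                                              memory_order_acquire);
}

/* Spin until the lock is taken. */
static void demo_lock(demo_spinlock *l)
{
    while (!demo_trylock(l))
        ;   /* busy-wait; a real lock would back off or yield */
}

/* Release ordering keeps the critical section's loads and stores
   from moving below this point. */
static void demo_unlock(demo_spinlock *l)
{
    assert(l != NULL);
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

Usage would be the usual demo_lock(&l); ... demo_unlock(&l); around the shared data, with the lock object initialized from DEMO_SPINLOCK_INIT.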
      
   Usually, IME, the people writing lock-free code don't use memory barriers
   or similar, though. A lot of the time it is people just using volatile
   or similar and trying to write things in a way that (hopefully) won't
   go terribly wrong if two threads hit the same data at the same time.
      
   Like, the sort of code that works on an x86 PC running Windows or similar
   (where the hardware memory ordering is fairly strong), but try to port it
   to Linux on an ARM machine, and it explodes.
      
      
   Where, say, using volatile isn't sufficient for multiple cores with a   
   weak model. One would need either to use barriers (though, in my case,   
   barriers will also be slow), non-cached memory accesses, or explicit   
   cache-line flushing.   
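
For contrast with the volatile approach, here is a minimal sketch of handing one value from one thread to another using C11 release/acquire atomics. It assumes a cache-coherent machine and a C11 compiler; the mailbox type and function names are hypothetical. It only illustrates the barrier option, not non-cached accesses or explicit line flushing, which are platform specific.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-slot mailbox; names are illustrative only. */
typedef struct {
    int         payload;  /* plain data, written before the flag */
    atomic_bool ready;    /* publication flag */
} mailbox;

/* Producer side: write the data, then set the flag with release
   ordering so the payload store cannot be reordered after it. */
static void mailbox_publish(mailbox *m, int value)
{
    assert(m != NULL);
    m->payload = value;
    atomic_store_explicit(&m->ready, true, memory_order_release);
}

/* Consumer side: read the flag with acquire ordering; only if it is
   set is the payload read, and that read cannot move above the load. */
static bool mailbox_try_consume(mailbox *m, int *out)
{
    assert(m != NULL && out != NULL);
    if (!atomic_load_explicit(&m->ready, memory_order_acquire))
        return false;
    *out = m->payload;
    return true;
}
```

With a plain volatile flag in place of the atomic, neither the compiler nor a weakly ordered CPU is obliged to keep the payload access ordered relative to the flag access, which is the porting failure described above.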
      
      
   In this case, it is often preferable to share data in bulk and mostly
   read-only, or to pass data along via buffers or messages
   (with some level of basic flow control).
      
   So, not so much "let's have two threads share a doubly-linked list and
   hope it doesn't all turn into a train wreck", and more "will copy
   messages onto the end of a circular buffer and advance the roving
   pointers; manually flushing the lines corresponding to the parts of the
   buffer that have been updated in the process".
      
   Say, for example:   
      void _flushbuffer(void *data, size_t sz)   
      {   
        char *ct, *cte;   
        ct=data; cte=ct+sz;   
        while(ct [...]   
   >   
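
A rough sketch of the circular-buffer scheme described above, for one producer and one consumer. This is not the code from the post: the ring type, ring_put, ring_get, and ring_flush are hypothetical names, and ring_flush is an empty placeholder marking where a platform-specific cache-line flush would go. The sketch assumes C11 atomics and uses release/acquire on the two roving indices in place of manual flushing.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 256u   /* power of two, so indices wrap with a mask */
static_assert((RING_SIZE & (RING_SIZE - 1)) == 0, "RING_SIZE must be 2^n");

typedef struct {
    unsigned char data[RING_SIZE];
    atomic_size_t head;   /* total bytes written; producer only */
    atomic_size_t tail;   /* total bytes read; consumer only */
} ring;

/* Placeholder (assumption): a no-op where an explicit cache-line
   flush of the updated bytes would go on hardware that needs one. */
static void ring_flush(const void *p, size_t n) { (void)p; (void)n; }

/* Copy n bytes onto the end of the buffer, then advance head.
   Returns false when there is no room (basic flow control). */
static bool ring_put(ring *r, const void *msg, size_t n)
{
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    const unsigned char *src = msg;
    if (n > RING_SIZE - (head - tail))
        return false;
    for (size_t i = 0; i < n; i++)
        r->data[(head + i) & (RING_SIZE - 1)] = src[i];
    ring_flush(r->data, sizeof r->data);
    /* Publish the new data only after it has been written. */
    atomic_store_explicit(&r->head, head + n, memory_order_release);
    return true;
}

/* Copy n bytes out of the buffer, then advance tail.
   Returns false when fewer than n bytes are available. */
static bool ring_get(ring *r, void *out, size_t n)
{
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    unsigned char *dst = out;
    if (head - tail < n)
        return false;
    for (size_t i = 0; i < n; i++)
        dst[i] = r->data[(tail + i) & (RING_SIZE - 1)];
    /* Hand the space back only after the bytes have been copied out. */
    atomic_store_explicit(&r->tail, tail + n, memory_order_release);
    return true;
}
```

Each index is written by only one side, so neither thread needs a lock; a failed ring_put or ring_get just means "try again later".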
      