Forums before death by AOL, social media and spammers... "We can't have nice things"
|    comp.lang.c++.moderated    |    Moderated discussion of C++ superhackery    |    33,346 messages    |
|    Message 31,562 of 33,346    |
|    Pete Becker to Marc    |
|    Re: atomic counter    |
|    12 Oct 11 17:19:10    |
From: pete@versatilecoding.com

On 2011-10-12 20:58:06 +0000, Marc said:

> Pete Becker wrote:
>
>> On 2011-10-09 12:27:42 +0000, Marc said:
>>
>>> I am trying to make our reference counting implementation thread-safe
>>> (for a very basic definition of thread-safe), which essentially means
>>> making the counter atomic. However, there are many options for atomic
>>> operations, in particular concerning the memory model, and I want to
>>> make sure I get it right.
>>>
>>> The operations I use are increment (no return), decrement (and check
>>> if the return value is 0), store (to initialize) and load (to check if
>>> the object is not shared and thus safe to write into).
>>>
>>> It looks to me like memory_order_relaxed should be good enough for
>>> this purpose, as I don't see what synchronization would be needed with
>>> the rest of memory, but I may be missing something fundamental there.
>>
>> Suppose there are two references to the object, in two different
>> threads. One thread decrements the reference count, then the other
>> does. If the decrement from the first thread isn't seen by the second
>> thread, the second thread won't see that the count has become zero, and
>> the object won't get destroyed. So memory_order_relaxed won't work: you
>> need to ensure that the result of a decrement is visible to another
>> thread that also needs to decrement the count.
>
> Uh? I am still calling an atomic decrement function. The standard says:
>
> "Note: Atomic operations specifying memory_order_relaxed are relaxed
> with respect to memory ordering. Implementations must still guarantee
> that any given atomic access to a particular atomic object be
> indivisible with respect to all other atomic accesses to that object."
>
> I thought the memory order was mostly concerned with what happened to
> the rest of the memory.
>
> Assuming your interpretation is correct, what is memory_order_relaxed
> good for?

Advanced threading. See Alexander Terekhov's message.

Atomic operations get completed without interruption. That ensures that
a different thread doesn't see a value that's not valid. For example,
suppose that storing a pointer takes two bus operations. If the pointer
starts out null, storing a value into it has two steps: store one half
of the pointer, then store the other half. If a context switch occurs
between those two steps, the thread that's switched to might see half
the pointer. Atomic operations ensure that that sort of tearing doesn't
occur.

The other aspect of threaded programming is visibility of changes.
Here's where you have to abandon the single-processor analogies; think
multiple processors. For example, suppose the system has two
processors, and each processor has its own data cache. Each processor
is noodling around with the same variable, so each cache has a copy of
the value of that variable. Writing a new value, even when done
atomically, only directly affects the value in the cache that belongs
to the processor that wrote the value. Unless the new value is copied
to the other processor's cache, the other processor will still see the
old value. memory_order_relaxed says "don't worry, be happy". It's okay
that the values are inconsistent.

>
>> It's easier to get the code right when it's sequentially consistent. In
>> general, unless you can demonstrate that synchronization is a
>> bottleneck, don't mess with it.
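[Archive note: the decrement-visibility point above maps onto a well-known pattern, not spelled out in the thread: the increment needs only atomicity, while the decrement that may destroy the object needs acquire-release ordering. A minimal sketch; the RefCounted name and layout are illustrative, not Marc's actual implementation.]

```cpp
#include <atomic>

// Illustrative reference-counted base showing which memory order
// each counter operation typically needs.
struct RefCounted {
    std::atomic<int> count{1};  // set before the object is shared

    void add_ref() {
        // Incrementing needs only atomicity: no thread acts on the
        // new value, so relaxed suffices.
        count.fetch_add(1, std::memory_order_relaxed);
    }

    bool release() {
        // The decrement that drops the count to zero must observe all
        // earlier decrements and all writes made through other
        // references, so it uses acquire-release ordering.
        return count.fetch_sub(1, std::memory_order_acq_rel) == 1;
        // true means the caller is the last owner and should destroy
        // the object.
    }
};
```

[An equivalent formulation decrements with memory_order_release and issues an acquire fence only on the path that destroys the object, saving the acquire on the common path.]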
>
> Well yes, of course. I did some experiments (using boost::shared_ptr
> or the libstdc++ std::atomic (just a place-holder implementation, I
> know)), and the slow-down was unacceptable, which led me to
> reimplement it. The performance hit is now acceptable but still
> noticeable enough that I am not sure about enabling it by default for
> MT programs (some programs have several threads that don't share any
> ref-counted objects and would pay the price for nothing).
>
> Our main target is x86/x86_64, where as far as I understand (please
> correct me if I am wrong) the memory barrier is unavoidable (implied
> by any atomic operation), but I am still interested in not penalizing
> our users on other platforms if I don't have to.

On the x86 architecture, pretty much everything is sequentially
consistent. So there's no difference in the generated code between
sequentially consistent visibility and any of the others. Which, in
turn, means that code that uses less than sequentially consistent
visibility and works just fine on x86 systems may fail miserably if it's
just ported to other systems.

>
> And it is intellectually satisfying to understand things ;-)
(c) 1994, bbs@darkrealms.ca