I found the doc page: https://learn.microsoft.com/en-us/cpp/b ... w=msvc-170
The thing is, "volatile" is the wrong thing for this.
So, acquire-read from volatile is the default on Intel (I guess mov from memory is always an acquire), but not necessarily elsewhere.
On any architecture you can get an interrupt after loading from pPendingOrRegaining and have producer code run a few times, so that when InterlockedExchangePointer loads from pPendingOrRegaining again, it's not the same value. It's because on some architectures it might be cached in an inconsistent state and have changed between the read and the branch.
To avoid thinking about that, let's assume for a moment that the producer always allocates a new buffer with malloc() or new, fills it with data, and writes the address into pPendingOrRegaining.
Code:
#include <atomic>
#include <span>

std::atomic<std::span<float>*> g_p_data = nullptr;

void consume_data(std::span<float> data);

void consumer() {
    std::span<float>* p_data = g_p_data.load();
    if (p_data) {
        consume_data(*p_data);
    }
}

void produce_data(std::span<float> data);

void producer() {
    size_t data_size = 256;
    float* data_buf = new float[data_size];
    std::span<float>* p_data = new std::span<float>(data_buf, data_size);
    produce_data(*p_data);
    g_p_data.store(p_data);
}
Now, without explicitly asking for acquire/release, gcc for ARMv7 surrounds ldr/str with dmb: https://godbolt.org/z/Mavhnhbcn , but I don't think it's needed for correctness in this case.
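For comparison, here is a rough sketch of the same publish pattern with the orderings spelled out instead of relying on the default (sequentially consistent) load()/store(). The Payload struct and the publish()/sum_if_ready() names are made up for illustration and are not from the code above. I'm not claiming anything about which instructions a given compiler emits for this, only that release on the store and acquire on the load is the pairing the standard requires for the consumer to see the filled buffer.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical payload type, for illustration only.
struct Payload {
    float* data;
    std::size_t size;
};

std::atomic<Payload*> g_payload{nullptr};

// Producer: fill the buffer first, then publish the pointer.
// The release store keeps the buffer writes from being reordered after it.
void publish(std::size_t n) {
    float* buf = new float[n];
    for (std::size_t i = 0; i < n; ++i) {
        buf[i] = static_cast<float>(i);
    }
    Payload* p = new Payload{buf, n};
    g_payload.store(p, std::memory_order_release);
}

// Consumer: the acquire load pairs with the release store, so if we see
// the pointer we also see the buffer contents written before the store.
// Returns -1 if nothing has been published yet.
float sum_if_ready() {
    Payload* p = g_payload.load(std::memory_order_acquire);
    if (!p) {
        return -1.0f;
    }
    float sum = 0.0f;
    for (std::size_t i = 0; i < p->size; ++i) {
        sum += p->data[i];
    }
    return sum;
}
```

Like the example above, this never frees anything, which is what lets us ignore buffer reuse for the moment.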
Another argument could be that when writing this stuff by hand without atomics, we'd also like to prevent the compiler from moving unrelated expensive computation into our "critical sections" (not literally in this case, but close enough), and using some kind of barrier intrinsic does that.
In any case, now that we can use atomics, there should be less room to screw things up. If in the original code we replace `DataSlot* volatile` with `std::atomic<DataSlot*>`, normal read with load(), InterlockedExchangePointer() with exchange(), and remove all MemoryBarrier(), I think it becomes a very nice portable implementation.
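A minimal sketch of what that replacement might look like, assuming a single slot pointer that the producer and consumer swap back and forth. The DataSlot body and the offer()/take() function names are placeholders I made up; only pPendingOrRegaining and the load()/exchange() mapping come from the discussion, and the real protocol in the original code may differ.

```cpp
#include <atomic>

// Hypothetical stand-in for the DataSlot type from the original code.
struct DataSlot {
    int value = 0;
};

// Was: DataSlot* volatile pPendingOrRegaining;
std::atomic<DataSlot*> pPendingOrRegaining{nullptr};

// Producer side (was InterlockedExchangePointer): publish a filled slot and
// get back whatever was there before, either nullptr or an unconsumed slot
// that can be reused.
DataSlot* offer(DataSlot* filled) {
    return pPendingOrRegaining.exchange(filled);
}

// Consumer side: a plain load() as a cheap check, then exchange() to take
// ownership. The slot returned by exchange() may be newer than the one the
// load() saw if the producer ran in between, which is fine here.
DataSlot* take() {
    if (pPendingOrRegaining.load() == nullptr) {
        return nullptr;
    }
    return pPendingOrRegaining.exchange(nullptr);
}
```

No MemoryBarrier() calls are left: the default memory order of load() and exchange() is sequentially consistent, which is stronger than the acquire/release this handoff needs.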
Statistics: Posted by Alien Brother — Sat Aug 17, 2024 5:12 pm