How do you get access to machine-level atomic increments?
On modern machines, I thought that an atomic increment with relaxed ordering could be made as fast as a normal increment. However,
AtomicUsize::fetch_add(1, Ordering::Relaxed) seems to be 5-10x slower than a non-atomic increment.
I benchmarked AtomicUsize on both Mac ARM-1 and AWS t3.large. I believe both machines have instructions for atomic increments.
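
For reference, the comparison I am describing looks roughly like the sketch below. It is a minimal reconstruction, not the exact code I ran: the function names count_plain and count_atomic and the iteration count N are made up for illustration, and std::hint::black_box is only there to discourage the optimizer from collapsing the loops into a single addition.

```rust
use std::hint::black_box;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::time::Instant;

/// Non-atomic baseline: increment a plain usize `n` times.
/// (Hypothetical helper, named for this sketch only.)
fn count_plain(n: usize) -> usize {
    let mut counter: usize = 0;
    for _ in 0..n {
        // black_box keeps the compiler from folding the loop into `counter = n`.
        counter = black_box(counter + 1);
    }
    counter
}

/// Atomic version: increment an AtomicUsize `n` times with relaxed ordering.
/// (Hypothetical helper, named for this sketch only.)
fn count_atomic(n: usize) -> usize {
    let counter = AtomicUsize::new(0);
    for _ in 0..n {
        // Passing the reference through black_box keeps the atomic observable
        // to the optimizer on every iteration.
        black_box(&counter).fetch_add(1, Ordering::Relaxed);
    }
    counter.load(Ordering::Relaxed)
}

fn main() {
    // Assumed iteration count; large enough that timer overhead is negligible.
    const N: usize = 100_000_000;

    let start = Instant::now();
    let plain = count_plain(N);
    let plain_time = start.elapsed();

    let start = Instant::now();
    let atomic = count_atomic(N);
    let atomic_time = start.elapsed();

    println!("plain:  {plain} increments in {plain_time:?}");
    println!("atomic: {atomic} increments in {atomic_time:?}");
}
```

This needs to be built with optimizations (cargo run --release); a debug build mostly measures unoptimized loop overhead. It is single-threaded, so there is no contention on the counter.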