Given the following constraints, is the optimization in this code correct?
- Before an object is transferred to another thread, its reference count is incremented and is_shared is set to true.
- Every time an object reaches a reference count of 1, is_shared is set to false.

Header_is_unique is 100% thread safe.
The idea behind the optimization is to try to reduce the number of atomic operations.
Is this a fool's errand? Is this kind of optimization impossible to do in a thread-safe manner without using more atomic values and/or mutexes, thus invalidating the idea of the optimization itself?
I only want to target the GCC and Clang compilers.
#include <stdint.h>

typedef struct Header {
    uint8_t is_shared;
    uint64_t reference_count;
} Header;

void
Header_acquire(
    Header * const self
) {
    if (!self->is_shared) {
        ++self->reference_count;
    } else {
        __atomic_fetch_add(&self->reference_count, 1, __ATOMIC_RELAXED);
    }
}

uint8_t
Header_release(
    Header * const self
) {
    if (!self->is_shared) {
        return !--self->reference_count;
    }
    uint64_t const reference_count =
        __atomic_fetch_sub(&self->reference_count, 1, __ATOMIC_RELEASE);
    if (reference_count == 2) {
        __atomic_thread_fence(__ATOMIC_ACQUIRE);
        self->is_shared = 0;
    }
    return reference_count == 1;
}

uint8_t
Header_is_unique(
    Header const * const self
) {
    return !self->is_shared && self->reference_count == 1;
}
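For context, the first constraint (increment the count and set is_shared before the hand-off) could look roughly like the sketch below. The name Header_share is hypothetical and not part of the code above. It also assumes the hand-off mechanism itself (thread creation, a mutex-protected queue, and so on) provides the synchronization that makes the plain writes visible to the receiving thread.

```c
#include <stdint.h>

typedef struct Header {
    uint8_t is_shared;
    uint64_t reference_count;
} Header;

/* Hypothetical helper (not in the original code): call on the owning
 * thread *before* publishing `self` to another thread. */
static void
Header_share(
    Header * const self
) {
    if (self->is_shared) {
        /* Other threads may already touch the count, so stay atomic. */
        __atomic_fetch_add(&self->reference_count, 1, __ATOMIC_RELAXED);
    } else {
        /* Still thread-local: plain writes are enough here, assuming the
         * hand-off that follows synchronizes with the receiving thread. */
        ++self->reference_count;
        self->is_shared = 1;
    }
}
```

The receiving thread would then own the extra reference and eventually give it back through Header_release.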
... is_shared ever becomes non-zero.
if (reference_count == 2) means that now only a single thread has a reference to self, but this is *another* thread, not the current one. See stackoverflow.com/a/79320186/6401656 for more.
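One way to read the comment above is to replay a possible interleaving by hand. The sketch below does that on a single thread, assuming threads A and B each hold one reference to a shared object. The helpers release_decrement and replay_interleaving are hypothetical names, and the destroyed flag merely stands in for freeing the object.

```c
#include <stdint.h>

typedef struct Header {
    uint8_t is_shared;
    uint64_t reference_count;
} Header;

/* First half of the shared path of Header_release: the atomic decrement.
 * Returns the count as it was *before* the decrement. */
static uint64_t
release_decrement(
    Header * const self
) {
    return __atomic_fetch_sub(&self->reference_count, 1, __ATOMIC_RELEASE);
}

/* Replays one interleaving sequentially. Returns 1 if thread A would
 * still write to the header after thread B was told to destroy it. */
static int
replay_interleaving(void) {
    Header h = { .is_shared = 1, .reference_count = 2 }; /* held by A and B */
    int destroyed = 0;

    /* Thread A: fetch_sub returns 2, so A plans to clear is_shared,
     * but it has already given up its own reference. */
    uint64_t const seen_by_a = release_decrement(&h);

    /* Thread B runs its whole Header_release before A continues:
     * is_shared is still 1, fetch_sub returns 1, B destroys the object. */
    uint64_t const seen_by_b = release_decrement(&h);
    if (seen_by_b == 1) {
        destroyed = 1; /* stands in for free() */
    }

    /* Thread A resumes and is about to execute `self->is_shared = 0`. */
    return seen_by_a == 2 && destroyed;
}
```

In this ordering, thread A's write to is_shared lands on an object that thread B has already released for destruction. Even when B does not release, A's plain write is unsynchronized with B's plain read of is_shared in Header_acquire or Header_release.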