C11 Implementation of a Spinlock Using the <stdatomic.h> Header
Introduction
A spinlock is a minimal lock that repeatedly checks a flag until the lock becomes available. In C11, the stdatomic.h header provides the atomic primitives needed to build one safely. This article shows a correct implementation, explains memory ordering, and highlights when spinlocks should not be used.
When a Spinlock Is Appropriate
Spinlocks can work well when critical sections are very short and contention is low. They avoid kernel context switches, which can reduce latency in tight loops.
They are usually a bad choice when lock hold times are long, when many threads contend for the same lock, or when the code runs on a single core, where a spinning thread keeps the lock holder from running. In those cases, mutexes generally perform better and waste less CPU.
Basic C11 Spinlock with atomic_flag
C11 provides atomic_flag, a lock-free primitive designed for test-and-set style synchronization.
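memory_order_acquire on the lock operation ensures that reads and writes inside the critical section are not reordered before the lock is acquired. memory_order_release on the unlock operation ensures that writes made inside the critical section are not reordered after the unlock, so they are visible to the next thread that acquires the lock.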
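A minimal sketch of such a lock follows. The type name spinlock, the SPINLOCK_INIT macro, and the function names are illustrative choices, not part of the C11 library; only atomic_flag, ATOMIC_FLAG_INIT, and the two atomic_flag_*_explicit calls come from stdatomic.h.

```c
#include <stdatomic.h>

/* Hypothetical wrapper type; C11 itself only provides atomic_flag. */
typedef struct {
    atomic_flag flag;
} spinlock;

/* ATOMIC_FLAG_INIT leaves the flag clear, which means "unlocked" here. */
#define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

static inline void spinlock_lock(spinlock *l) {
    /* test_and_set returns the previous value: true means another
       thread already holds the lock, so keep trying. */
    while (atomic_flag_test_and_set_explicit(&l->flag, memory_order_acquire)) {
        /* busy-wait */
    }
}

static inline void spinlock_unlock(spinlock *l) {
    /* Release ordering publishes the critical section's writes. */
    atomic_flag_clear_explicit(&l->flag, memory_order_release);
}
```

A lock declared as spinlock l = SPINLOCK_INIT; starts unlocked and is always passed by address.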
Add CPU-Friendly Backoff
A tight empty loop can saturate a CPU core while doing no useful work. Add backoff or architecture-specific pause hints in the wait loop.
thrd_yield lets the scheduler run another thread, possibly the current lock holder, which can improve overall throughput under contention.
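One possible shape for a yielding wait loop is sketched below. SPIN_LIMIT is a hypothetical tuning constant, not a recommended value, and the spinlock type and function names are illustrative. Architecture hints such as _mm_pause on x86 could be placed in the same loop, but they are not shown here to keep the sketch portable.

```c
#include <stdatomic.h>
#include <threads.h>

typedef struct {
    atomic_flag flag;
} spinlock;

#define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Hypothetical threshold: failed attempts before giving up the CPU. */
#define SPIN_LIMIT 64

static inline void spinlock_lock_backoff(spinlock *l) {
    unsigned spins = 0;
    while (atomic_flag_test_and_set_explicit(&l->flag, memory_order_acquire)) {
        if (++spins >= SPIN_LIMIT) {
            thrd_yield();   /* let another runnable thread use the core */
            spins = 0;
        }
    }
}

static inline void spinlock_unlock(spinlock *l) {
    atomic_flag_clear_explicit(&l->flag, memory_order_release);
}
```

Spinning a few times before yielding keeps the fast path cheap when the lock is released quickly, while the yield limits wasted cycles when it is not.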
Example Usage with Shared Counter
Use the lock to guard shared mutable state.
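As an illustration, the sketch below has several threads increment a plain long counter under a spinlock. The thread count, iteration count, and all names (counter_lock, worker, run_counter_demo) are assumptions made for the example. It relies on C11 threads.h being available in the C library.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <threads.h>

enum { NUM_THREADS = 4, INCREMENTS = 100000 };

typedef struct {
    atomic_flag flag;
} spinlock;

static spinlock counter_lock = { ATOMIC_FLAG_INIT };
static long counter = 0;   /* non-atomic state, guarded by counter_lock */

static void spinlock_lock(spinlock *l) {
    while (atomic_flag_test_and_set_explicit(&l->flag, memory_order_acquire))
        thrd_yield();      /* be polite if the holder was preempted */
}

static void spinlock_unlock(spinlock *l) {
    atomic_flag_clear_explicit(&l->flag, memory_order_release);
}

static int worker(void *arg) {
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        spinlock_lock(&counter_lock);
        counter++;         /* the entire critical section */
        spinlock_unlock(&counter_lock);
    }
    return 0;
}

/* Runs the workers and returns the final counter value,
   or -1 if a thread could not be created. */
static long run_counter_demo(void) {
    thrd_t threads[NUM_THREADS];
    int created = 0;
    counter = 0;
    while (created < NUM_THREADS &&
           thrd_create(&threads[created], worker, NULL) == thrd_success)
        created++;
    for (int i = 0; i < created; i++)
        thrd_join(threads[i], NULL);
    return created == NUM_THREADS ? counter : -1;
}
```

If the lock works, run_counter_demo() returns NUM_THREADS * INCREMENTS on every run; without the lock, the unsynchronized increments would be a data race.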
Compile with a C11-capable compiler and threading support, for example gcc -std=c11 -pthread on a system whose C library provides threads.h.
Correctness Notes
A spinlock must never be copied while in use. Keep one stable instance and share its address. Also ensure every lock acquisition has a matching unlock, including error paths.
Spinlocks are non-recursive by default. If a thread tries to lock the same spinlock twice, it deadlocks itself.
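The sketch below shows one way to keep lock and unlock paired when a function can fail partway through: every path, including the error path, funnels through a single unlock. The bounded_buf type and bounded_buf_push function are hypothetical and exist only to give the lock something to guard.

```c
#include <stdatomic.h>
#include <stddef.h>

typedef struct {
    atomic_flag flag;
} spinlock;

static void spinlock_lock(spinlock *l) {
    while (atomic_flag_test_and_set_explicit(&l->flag, memory_order_acquire)) {
        /* busy-wait */
    }
}

static void spinlock_unlock(spinlock *l) {
    atomic_flag_clear_explicit(&l->flag, memory_order_release);
}

/* Hypothetical bounded buffer that embeds its own lock. */
typedef struct {
    spinlock lock;
    int items[4];
    size_t len;
} bounded_buf;

/* Takes the buffer by pointer so the embedded lock is never copied.
   Returns 0 on success, -1 if the buffer is full. */
static int bounded_buf_push(bounded_buf *b, int value) {
    int rc = 0;
    spinlock_lock(&b->lock);
    if (b->len == sizeof b->items / sizeof b->items[0]) {
        rc = -1;
        goto out;          /* the error path still reaches the unlock */
    }
    b->items[b->len++] = value;
out:
    spinlock_unlock(&b->lock);
    return rc;
}
```

Because the lock is non-recursive, code that already holds b->lock must not call bounded_buf_push on the same buffer, or it would spin forever.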
Performance Guidance
Measure with your real workload. A microbenchmark with short synthetic loops can be misleading.
- Check contention level and lock hold time.
- Compare against mtx_t from C11 threads.
- Profile CPU utilization, not just wall-clock latency.
Often a standard mutex wins once critical sections include I/O, memory allocation, or cache-miss-heavy work.
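A rough starting point for such a comparison is sketched below: the same increment workload is timed once with an atomic_flag spinlock and once with mtx_t. All names and the thread and iteration counts are assumptions, and this is exactly the kind of synthetic loop the section warns about, so treat it as a harness to adapt to a real workload rather than as evidence. It measures wall-clock time only; CPU utilization would need an external profiler.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <threads.h>
#include <time.h>

enum { NUM_THREADS = 4, INCREMENTS = 200000 };

static atomic_flag spin = ATOMIC_FLAG_INIT;
static mtx_t mutex;
static long counter;

static int spin_worker(void *arg) {
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        while (atomic_flag_test_and_set_explicit(&spin, memory_order_acquire))
            thrd_yield();
        counter++;
        atomic_flag_clear_explicit(&spin, memory_order_release);
    }
    return 0;
}

static int mutex_worker(void *arg) {
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        mtx_lock(&mutex);
        counter++;
        mtx_unlock(&mutex);
    }
    return 0;
}

/* Runs NUM_THREADS copies of worker and stores elapsed wall-clock
   seconds in *seconds. Returns 0 on success, -1 on thread failure. */
static int time_workers(thrd_start_t worker, double *seconds) {
    thrd_t threads[NUM_THREADS];
    struct timespec start, end;
    int created = 0, rc = 0;

    counter = 0;
    timespec_get(&start, TIME_UTC);
    for (; created < NUM_THREADS; created++) {
        if (thrd_create(&threads[created], worker, NULL) != thrd_success) {
            rc = -1;
            break;
        }
    }
    for (int i = 0; i < created; i++)
        thrd_join(threads[i], NULL);
    timespec_get(&end, TIME_UTC);

    *seconds = (double)(end.tv_sec - start.tv_sec)
             + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
    return rc;
}

/* Times the spinlock run, then the mutex run. Returns 0 on success. */
static int compare_locks(double *spin_seconds, double *mutex_seconds) {
    if (mtx_init(&mutex, mtx_plain) != thrd_success)
        return -1;
    int rc = time_workers(spin_worker, spin_seconds);
    if (rc == 0)
        rc = time_workers(mutex_worker, mutex_seconds);
    mtx_destroy(&mutex);
    return rc;
}
```

Running compare_locks several times and varying NUM_THREADS and the amount of work inside the critical section gives a first impression of where each lock type starts to lose.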
Add Non-Blocking Try Lock
Sometimes callers should skip work if the lock is busy instead of spinning. A try-lock helper can improve responsiveness.
Callers can use this in polling loops or best-effort cache updates where blocking is unnecessary. Be careful not to create starvation patterns where one thread repeatedly wins and others never progress. If fairness matters, switch to a queue-based lock or OS mutex.
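A try-lock can be sketched as a single test-and-set attempt with no loop. The names spinlock_trylock and try_update_cache are hypothetical; the second function only illustrates the best-effort update pattern described above.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_flag flag;
} spinlock;

#define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Makes one attempt. Returns true if the lock was acquired,
   false if another thread already holds it. */
static inline bool spinlock_trylock(spinlock *l) {
    return !atomic_flag_test_and_set_explicit(&l->flag, memory_order_acquire);
}

static inline void spinlock_unlock(spinlock *l) {
    atomic_flag_clear_explicit(&l->flag, memory_order_release);
}

/* Hypothetical best-effort update: skip the write if the lock is busy. */
static bool try_update_cache(spinlock *l, int *cached, int fresh) {
    if (!spinlock_trylock(l))
        return false;      /* busy: the caller can retry later or move on */
    *cached = fresh;
    spinlock_unlock(l);
    return true;
}
```

A caller must unlock only after spinlock_trylock returned true; clearing the flag after a failed attempt would release a lock that another thread still holds.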
Common Pitfalls
- Using relaxed memory order for lock and unlock operations, which can break visibility guarantees.
- Spinning for long critical sections and wasting CPU cycles.
- Forgetting backoff or yielding under high contention.
- Using spinlocks in power-sensitive environments where busy wait is expensive.
- Assuming spinlocks are always faster than mutexes.
Summary
- C11 atomic_flag enables a compact and correct spinlock implementation.
- Acquire and release ordering are essential for memory safety.
- Add backoff or thrd_yield to reduce contention pressure.
- Use spinlocks only for very short critical sections with low contention.
- Benchmark against mutex-based alternatives before deciding.

