Atomic Operations & CAS: Lock-Free Magic
Before We Start โ What You Need to Know
So far we've learned two ways to handle concurrency safely:
- Immutability โ don't change anything
- Thread confinement โ don't share anything
But what if you *must* share data *and* mutate it? A counter that all threads increment. A flag that any thread can toggle. A reference that gets swapped atomically. For these, we need atomic operations โ operations that are guaranteed to complete as a single, indivisible step, even under concurrency.
You should know: what a thread is, why counter++ on a shared variable is dangerous, and what a race condition means.
What is it? (The Analogy)
Imagine you're at a busy deli counter. There's a take-a-number machine. You walk up, press the button, and get ticket #47. The next person gets #48. Even if 10 people press the button at nearly the same time, nobody gets the same number. The machine guarantees that each press is an *atomic* operation โ it reads the current number, increments it, and dispenses the ticket in one indivisible step.
Now imagine instead there's just a whiteboard with a number on it. To get your ticket, you read the number (47), erase it, and write 48. But what if two people read "47" at the same time? They both write "48", and you've lost a number! That's the problem with non-atomic operations.
Atomic operations are like the take-a-number machine โ they guarantee that read-modify-write happens in one unbreakable step, no matter how many threads are trying simultaneously.
The Problem It Solves
The classic broken counter:
1// BROKEN: Race condition on counter++
2class BrokenHitCounter {
3 private int count = 0;
4
5 public void recordHit() {
6 count++; // This is NOT atomic!
7 }
8
9 public int getCount() {
10 return count;
11 }
12}**Why is count++ broken?** It looks like one operation, but it's actually THREE:
- Read: Load the current value of
countfrom memory (say, 42) - Modify: Add 1 to get 43
- Write: Store 43 back to memory
If Thread A and Thread B both do count++ at the same time:
Thread A: READ count (42)
Thread B: READ count (42) <-- both read 42!
Thread A: WRITE count = 43
Thread B: WRITE count = 43 <-- overwrites A's write!
// Expected: 44, Got: 43 โ one increment was LOSTThe synchronized fix works but is heavy:
public synchronized void recordHit() {
count++; // Only one thread at a time โ works but slow
}Atomic classes give us a better way.
How it works โ Step by Step
- **Use
AtomicInteger(orAtomicLong,AtomicReference, etc.)** instead of a rawint. - **Call
incrementAndGet()orgetAndIncrement()** โ these methods are guaranteed to be atomic. - Under the hood, Java uses CAS (Compare-And-Swap) โ a special CPU instruction that does a read-compare-write in a single hardware operation.
- CAS works like this: "If the current value is still X (what I last read), change it to Y. If someone else changed it since I read it, *fail* and let me try again."
- The retry loop (spin loop): If CAS fails (because another thread snuck in), Java automatically retries โ read the new value, compute the update, try CAS again. This is called optimistic concurrency โ you optimistically assume no conflict, and only retry if there was one.
Aha! CAS is a *hardware instruction* (like CMPXCHG on x86 processors). It's not implemented with locks โ it's built into the CPU itself. That's why it's called "lock-free" โ you get atomicity without ever acquiring a lock or blocking a thread.
Let's Build It Together
Let's build a thread-safe scoreboard for a multiplayer game:
1import java.util.concurrent.atomic.AtomicInteger;
2import java.util.concurrent.atomic.AtomicReference;
3import java.util.concurrent.atomic.AtomicIntegerArray;
4
5public class GameScoreboard {
6
7 // Atomic counter โ each player's score
8 private final AtomicInteger redTeamScore = new AtomicInteger(0);
9 private final AtomicInteger blueTeamScore = new AtomicInteger(0);
10
11 // Atomic reference โ the current MVP (can change at any time)
12 private final AtomicReference<String> currentMVP =
13 new AtomicReference<>("Nobody yet");
14
15 // Atomic array โ per-round kill counts (10 rounds)
16 private final AtomicIntegerArray roundKills = new AtomicIntegerArray(10);
17
18 /**
19 * Thread-safe scoring. Multiple threads can call this simultaneously.
20 */
21 public int addRedScore(int points) {
22 // addAndGet is atomic: read + add + write in one step
23 return redTeamScore.addAndGet(points);
24 }
25
26 public int addBlueScore(int points) {
27 return blueTeamScore.addAndGet(points);
28 }
29
30 /**
31 * Atomically update the MVP only if the new player has more kills.
32 * This demonstrates CAS (Compare-And-Swap) explicitly.
33 */
34 public boolean tryUpdateMVP(String currentExpected, String newMVP) {
35 // compareAndSet: "If the current value is currentExpected,
36 // set it to newMVP and return true.
37 // Otherwise, return false (someone else changed it)."
38 return currentMVP.compareAndSet(currentExpected, newMVP);
39 }
40
41 /**
42 * Record a kill in a specific round โ array access, still atomic!
43 */
44 public void recordKill(int round) {
45 roundKills.incrementAndGet(round);
46 }
47
48 /**
49 * Demonstrate a CAS retry loop for a custom operation:
50 * "Double the red team's score" (atomically!)
51 */
52 public int doubleRedScore() {
53 while (true) { // CAS retry loop
54 int current = redTeamScore.get(); // Step 1: Read
55 int newValue = current * 2; // Step 2: Compute
56 if (redTeamScore.compareAndSet(current, newValue)) {
57 // Step 3: CAS succeeded โ nobody changed it while we computed
58 return newValue;
59 }
60 // CAS failed โ someone else modified the score between
61 // our read and our write. Loop back and try again!
62 // This is the "optimistic retry" pattern.
63 }
64 }
65
66 public void printScoreboard() {
67 System.out.println("=== SCOREBOARD ===");
68 System.out.println("Red Team: " + redTeamScore.get());
69 System.out.println("Blue Team: " + blueTeamScore.get());
70 System.out.println("MVP: " + currentMVP.get());
71 System.out.print("Kills per round: ");
72 for (int i = 0; i < roundKills.length(); i++) {
73 System.out.print("R" + i + "=" + roundKills.get(i) + " ");
74 }
75 System.out.println();
76 }
77
78 public static void main(String[] args) throws InterruptedException {
79 GameScoreboard board = new GameScoreboard();
80
81 // 20 threads, each adding random scores
82 Thread[] threads = new Thread[20];
83 for (int i = 0; i < 20; i++) {
84 final int threadId = i;
85 threads[i] = new Thread(() -> {
86 for (int j = 0; j < 1000; j++) {
87 if (threadId % 2 == 0) {
88 board.addRedScore(1);
89 } else {
90 board.addBlueScore(1);
91 }
92 board.recordKill(j % 10);
93 }
94 });
95 threads[i].start();
96 }
97
98 for (Thread t : threads) t.join();
99
100 board.printScoreboard();
101 // Red: 10,000 (10 threads x 1000)
102 // Blue: 10,000 (10 threads x 1000)
103 // Total kills: 20,000 spread across 10 rounds
104 // ALL numbers are EXACT โ no lost updates!
105 }
106}What Happens Under the Hood
Let's trace what happens when Thread A calls redTeamScore.incrementAndGet():
- Thread A reads the value: Loads
42from theAtomicInteger's internalvolatile int valuefield. - Thread A computes:
42 + 1 = 43. - Thread A executes CAS: Calls the native CPU instruction
CMPXCHG(compare-and-exchange):
- "Is the value still 42? If yes, set it to 43."
- The CPU locks the cache line (not a software lock โ a hardware bus lock or cache-line lock) for this single instruction. This is *much* faster than a software synchronized block.
- If CAS succeeds: The value is now 43. Thread A continues.
- If CAS fails (Thread B changed it to 43 first): Thread A loops back, reads the new value (43), computes 44, and tries CAS again. This usually succeeds on the second try.
**Why is this faster than synchronized?**
synchronizedrequires the OS kernel to manage a lock (heavyweight) โ context switches, thread parking/unparking.- CAS is a single CPU instruction with no kernel involvement. It completes in nanoseconds.
- CAS only "fails" when there's actual contention. Under low contention (the common case), it succeeds on the first try.
**The volatile underneath**: Inside AtomicInteger, the value is declared private volatile int value. The volatile keyword ensures visibility โ when one thread writes the value, all other threads immediately see the update (no stale CPU cache values).
Danger zone: The ABA Problem. CAS checks "is the value still X?" But what if it was X, changed to Y, then changed *back to X* before your CAS runs? CAS sees X and thinks nothing changed โ but something did! This is the ABA problem. For simple counters it doesn't matter, but for lock-free data structures (linked lists, stacks), it can cause bugs. Java provides AtomicStampedReference (which tracks a version stamp alongside the value) to solve this.
When to Use vs When NOT to Use
| Use Atomics When | Don't Use Atomics When |
|---|---|
| Simple counters, flags, and references | You need to update multiple related variables atomically (use locks) |
| Low-contention scenarios (most of the time) | Very high contention (many threads hitting the same atomic โ the spin loop wastes CPU) |
| You want lock-free, non-blocking code | The update logic is complex (CAS retry loops become hard to reason about) |
| Performance-critical hot paths | You need fairness guarantees (CAS can "starve" unlucky threads) |
| Statistics, metrics, hit counters | You need to wait/block for a condition (use Condition or Semaphore) |
Common Mistakes & Gotchas
- **Thinking
count++is atomic on anAtomicInteger.** You must callcount.incrementAndGet(), notcount.get() + 1followed bycount.set(). The separate get/set has the same race condition as a raw int! - **Using
AtomicReferencethinking it deep-compares objects.** CAS onAtomicReferencecompares by identity (==), not byequals(). If you create a new object with the same content, CAS will not consider it equal to the old one. - Forgetting the ABA problem. For counters it's fine. For pointer-swapping in lock-free data structures, use
AtomicStampedReference. - Over-using atomics. If you have 5 fields that must be updated together consistently, you need a lock (or an immutable snapshot object swapped via
AtomicReference). Atomics protect *one* value at a time. - Spin loops under high contention. If 100 threads compete on the same
AtomicInteger, most CAS attempts fail, and threads waste CPU spinning. For high-contention counters, useLongAdderinstead โ it spreads the contention across multiple cells.
Interview Tip
When an interviewer asks about lock-free programming or CAS, explain the three-step dance: (1) read current value, (2) compute new value, (3) CAS โ and if CAS fails, loop back to step 1. Show you understand *why* it's faster than locks (no kernel involvement, no context switches), and *when* it breaks down (high contention, multi-variable updates). Bonus points for mentioning LongAdder for high-contention counters and AtomicStampedReference for the ABA problem.
Quick Quiz
- Why can't you just use
volatile int countand docount++? Doesn'tvolatilemake it thread-safe?
- What happens if 1000 threads all try to
incrementAndGet()on the sameAtomicIntegersimultaneously? How many retries would you expect?
- You need to atomically update both a player's name and score together. Can you do this with two separate
AtomicInteger/AtomicReference? What's a better approach?
Summary โ Key Takeaways
- Atomic classes (
AtomicInteger,AtomicLong,AtomicReference) provide lock-free, thread-safe operations for single variables. - CAS (Compare-And-Swap) is the secret sauce โ a CPU-level instruction that checks and updates a value in one indivisible step.
- Use the CAS retry loop pattern for custom atomic operations: read, compute, CAS, retry on failure.
- For high-contention counters, prefer
LongAdderoverAtomicLongโ it distributes the load across multiple internal cells.