The MESI protocol is a cache-coherence protocol that keeps the private caches of multiple cores agreeing on the value of every shared memory location. Each cache line carries one of four states — Modified, Exclusive, Shared, Invalid (the M-E-S-I that name it) — and transitions between them in response to local reads/writes and snooped bus traffic.
The reason it has to exist: in a multicore CPU each core has its own L1 cache (and often its own L2). If two cores both cache the same memory address and one writes to it, the other’s copy is now stale. Without coherence, threads on different cores could read different values for the same address, so no shared variable could be trusted to hold the value most recently written to it.
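To make the stale-copy problem concrete, here is a minimal Python sketch of two private write-back caches with no coherence at all. It is a toy model, not a description of real hardware; the names `stale_read_demo`, `load`, and `store` are assumptions made up for this illustration.

```python
def stale_read_demo():
    """Two private caches with NO coherence: core B keeps reading a stale value."""
    memory = {0x100: 1}        # backing memory: address -> value
    cache_a, cache_b = {}, {}  # private per-core caches (hypothetical model)

    def load(cache, addr):
        # On a miss, fill from memory; on a hit, return the cached copy.
        if addr not in cache:
            cache[addr] = memory[addr]
        return cache[addr]

    def store(cache, addr, value):
        # Write-back behaviour: only the local copy changes, nobody is told.
        cache[addr] = value

    load(cache_a, 0x100)       # core A caches the line
    load(cache_b, 0x100)       # core B caches the same line
    store(cache_a, 0x100, 2)   # core A writes
    return load(cache_a, 0x100), load(cache_b, 0x100)


print(stale_read_demo())  # (2, 1): core B still sees the old value
```

A coherence protocol exists to rule out exactly this outcome: core A’s write must invalidate or update core B’s copy before B can read the line again.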
The four states
Each cache line in each core’s cache is in exactly one of these states:
| State | Other caches | Memory consistent? | Can read locally? | Can write locally? |
|---|---|---|---|---|
| Modified (M) | Invalid | No (cache has the only fresh copy) | Yes | Yes |
| Exclusive (E) | Invalid | Yes | Yes | Yes (becomes M) |
| Shared (S) | Possibly Shared | Yes | Yes | No (must invalidate others first) |
| Invalid (I) | Any state | N/A | No (must fetch) | No |
- Modified: this cache is the only one with the line, and the line has been written since it was loaded. Memory is stale.
- Exclusive: this cache is the only one with the line, but it hasn’t been written; memory still matches. The line is “clean” and exclusive.
- Shared: this line exists clean in this cache and possibly others. Memory is up to date. Reads are local; writes need to invalidate other copies first.
- Invalid: not in this cache (or was evicted/invalidated). Any access misses.
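The table and the four definitions above can be restated as a few predicates. The following is a sketch under the assumption that a line’s state in each cache is modeled as a plain enum; `State`, `can_read_locally`, `can_write_silently`, and `is_coherent` are hypothetical names, not part of any hardware or library interface.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def can_read_locally(state):
    """Any valid state (M, E, S) can satisfy a read without bus traffic."""
    return state is not State.INVALID


def can_write_silently(state):
    """Only M and E may be written with no bus transaction."""
    return state in (State.MODIFIED, State.EXCLUSIVE)


def is_coherent(states):
    """Check the per-line invariant across all caches.

    If any cache holds the line in M or E, every other cache must be Invalid.
    """
    owners = sum(1 for s in states if s in (State.MODIFIED, State.EXCLUSIVE))
    valid = sum(1 for s in states if s is not State.INVALID)
    return owners == 0 or (owners == 1 and valid == 1)
```

For example, `is_coherent([State.SHARED, State.SHARED, State.INVALID])` holds, while `is_coherent([State.MODIFIED, State.SHARED])` does not, which matches the "Other caches" column of the table.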
State transitions
The state changes both on local actions (the CPU reads or writes the line) and on snooped actions (another core’s bus traffic involving the same line).
For each starting state and each event, the transition is fixed (a read miss also depends on whether any other cache holds the line):
Local read.
- I → S or E. Issue a read on the bus; if the snoop response shows that no other cache holds the line, go to E, otherwise go to S.
- S, E, M → unchanged. Read locally; no bus traffic.
Local write.
- I → M. Issue a “read-for-ownership” (RFO) on the bus, which invalidates other caches’ copies, then write.
- S → M. Issue an “invalidate” on the bus, then write locally.
- E → M. Just write; no bus traffic needed (we already had exclusive ownership).
- M → M. Just write; we already own it dirty.
Snoop: another cache reads this line.
- I, S → unchanged.
- E → S. Send the data to the requesting core, downgrade self to Shared.
- M → S. Send the data, write back to memory (so memory is fresh), downgrade to Shared.
Snoop: another cache writes this line (RFO).
- I → unchanged.
- S, E → I. The other cache is now the owner; we invalidate.
- M → I. Send the data first (so it’s not lost), then invalidate.
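The transition rules above can be sketched as a small simulator that tracks a single line across several caches. This is a simplified model, assuming an atomic bus and ignoring data values, evictions, and timing; the class name `LineSim` and the transaction labels (`BusRd`, `RFO`, `Invalidate`, `WriteBack`, `Flush`) are naming choices made for this sketch.

```python
class LineSim:
    """Track one memory line across several private caches under MESI snooping."""

    def __init__(self, n_caches):
        self.states = ["I"] * n_caches  # one state letter per cache
        self.bus_log = []               # bus transactions, in order

    def read(self, core):
        if self.states[core] != "I":
            return                      # S, E, M: local hit, no bus traffic
        self.bus_log.append("BusRd")
        shared = False
        for peer, state in enumerate(self.states):
            if peer == core or state == "I":
                continue
            shared = True
            if state == "M":
                self.bus_log.append("WriteBack")  # dirty data goes to memory
            self.states[peer] = "S"     # snooped read: E or M downgrade to S
        self.states[core] = "S" if shared else "E"

    def write(self, core):
        state = self.states[core]
        if state == "I":
            self.bus_log.append("RFO")          # read-for-ownership
        elif state == "S":
            self.bus_log.append("Invalidate")
        if state in ("I", "S"):
            for peer in range(len(self.states)):
                if peer == core:
                    continue
                if self.states[peer] == "M":
                    self.bus_log.append("Flush")  # dirty owner sends data first
                self.states[peer] = "I"           # snooped write: peers invalidate
        self.states[core] = "M"                   # E and M write silently


line = LineSim(2)
line.read(0)    # core 0: I -> E (no other cache has the line)
line.write(0)   # core 0: E -> M, no bus traffic
line.read(1)    # core 0: M -> S with write-back; core 1: I -> S
line.write(1)   # core 1: S -> M; core 0: S -> I
print(line.states)   # ['I', 'M']
print(line.bus_log)  # ['BusRd', 'BusRd', 'WriteBack', 'Invalidate']
```

Note that the E → M write adds nothing to the bus log, which is the saving the Exclusive state exists to provide.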
Snooping vs directory
The transitions above describe a snooping protocol: every cache watches all bus traffic, and each cache decides on its own which transitions to make based on what it sees and what state its lines are in. Snooping works on a shared bus where every transaction is visible to every cache — typical for a few cores on one chip.
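For larger systems (dozens of cores, or multiple chips), a broadcast bus saturates. The alternative is a directory-based protocol: a directory (central, or distributed across memory controllers or cache slices) tracks which caches hold which lines and sends invalidations or updates only to those caches. This is more complex but scales further. Multi-socket systems connect sockets with point-to-point links such as AMD’s HyperTransport and Intel’s QPI; these links carry the coherence traffic, and some platforms add directory-like structures on top (for example AMD’s HT Assist probe filter, or the home-snoop and directory modes on some Intel parts) to cut down on broadcast snoops, even when cores within a socket snoop one another.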
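The difference in message count can be sketched as follows. This is a deliberately minimal model that tracks only sharer sets, not full line states; `Directory`, `record_read`, `record_write`, and `snoop_invalidation_targets` are hypothetical names for this illustration.

```python
class Directory:
    """Toy directory: remembers which cores hold each line."""

    def __init__(self):
        self.sharers = {}  # line address -> set of core ids holding the line

    def record_read(self, addr, core):
        self.sharers.setdefault(addr, set()).add(core)

    def record_write(self, addr, core):
        """Return the cores that must be invalidated, then make `core` sole holder."""
        targets = self.sharers.get(addr, set()) - {core}
        self.sharers[addr] = {core}
        return targets


def snoop_invalidation_targets(n_cores, writer):
    """With broadcast snooping, every other core sees the invalidation."""
    return set(range(n_cores)) - {writer}


directory = Directory()
directory.record_read(0x40, 1)
directory.record_read(0x40, 5)
print(directory.record_write(0x40, 1))          # {5}: one targeted message
print(len(snoop_invalidation_targets(64, 1)))   # 63: every other core is disturbed
```

The directory pays for this with extra storage and an extra lookup on each miss, which is the complexity cost mentioned above.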
MESI vs related protocols
- MSI — three-state predecessor (no Exclusive). Every read of a line not present forces it to Shared, even if no other cache has it. The first write then takes a bus transaction to invalidate (no other cache has the line, but the protocol doesn’t know that). MESI’s Exclusive state lets a read-then-write sequence avoid the redundant invalidation.
- MOESI: adds Owned. The line is dirty (memory is stale), as in Modified, but other caches may hold Shared copies of it. The Owned cache is responsible for supplying the data to readers and for the eventual write-back, so a Modified line can be shared on a read without immediately writing back to memory. AMD’s x86 processors use MOESI.
- MESIF — adds Forward: in a Shared situation, exactly one cache is designated F (forwarder) and supplies the data on a snoop, avoiding multiple caches racing to respond. Intel x86 uses MESIF.
The state machines are larger and have more transitions, but the goal is the same: keep all caches agreeing on the value of every line, with as little bus traffic as possible.
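The MSI versus MESI difference is easiest to see by counting bus transactions for one core that reads and then writes a line no other cache holds. The function below is a sketch of only that one scenario; `read_then_write_traffic` and the transaction labels are names assumed for this example.

```python
def read_then_write_traffic(protocol):
    """Bus transactions for a private read followed by a write to the same line.

    Assumes the line starts Invalid here and is held by no other cache.
    """
    if protocol not in ("MSI", "MESI"):
        raise ValueError("protocol must be 'MSI' or 'MESI'")

    transactions = ["BusRd"]            # the read miss always goes to the bus
    # MESI learns from the snoop response that nobody else has the line.
    state = "E" if protocol == "MESI" else "S"

    if state == "S":
        transactions.append("Invalidate")  # MSI must announce the write
    # From E the write is silent; either way the line ends up Modified.
    return transactions


print(read_then_write_traffic("MSI"))   # ['BusRd', 'Invalidate']
print(read_then_write_traffic("MESI"))  # ['BusRd']
```

Since most data in typical programs is private to one thread, this one saved transaction per line is the common case rather than a corner case.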
Why this is hard
Coherence interacts with the rest of the memory system:
- False sharing. Two cores write to different variables that happen to share a cache line. Each write invalidates the other core’s copy. The variables aren’t logically shared, but the cache-line granularity makes them behave as if they were. The fix is padding to put them on different lines.
- Coherence misses. A “fourth C” beyond the classic compulsory, capacity, and conflict misses: the line was here, but another core’s write invalidated it. Adds latency that doesn’t show up in single-threaded miss-rate analysis.
- Bus traffic. Every write to a Shared line costs an invalidate broadcast. Heavy writes to shared data can saturate the interconnect even at low miss rates.
- Consistency models. Coherence guarantees that all cores agree on the order of writes to each individual line, so they eventually see the same value for it. It says nothing about the order in which a thread’s writes to different addresses become visible to other threads; that is the job of the memory consistency model (sequential consistency, total store order, release-acquire, etc.). x86 is TSO; ARM is weaker. Memory barriers explicitly enforce ordering.
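The false-sharing item in the list above can be illustrated by counting invalidations in a toy model where two cores alternately write to two addresses. This sketch assumes a 64-byte line and tracks only which core currently owns each line; `LINE_SIZE`, `line_of`, and `count_invalidations` are names invented for the example, and real line sizes vary by processor.

```python
LINE_SIZE = 64  # bytes per cache line (assumed; check the target CPU)


def line_of(addr):
    """Map a byte address to the index of the cache line that contains it."""
    return addr // LINE_SIZE


def count_invalidations(addr_a, addr_b, rounds):
    """Core 0 writes addr_a and core 1 writes addr_b, alternating.

    Each write to a line last written by the other core costs one invalidation.
    """
    owner = {}          # line index -> core that last wrote it
    invalidations = 0
    for _ in range(rounds):
        for core, addr in ((0, addr_a), (1, addr_b)):
            line = line_of(addr)
            previous = owner.get(line)
            if previous is not None and previous != core:
                invalidations += 1
            owner[line] = core
    return invalidations


print(count_invalidations(0, 8, 1000))   # 1999: both variables share line 0
print(count_invalidations(0, 64, 1000))  # 0: padding moved addr_b to line 1
```

In the first call the two variables are 8 bytes apart and never logically shared, yet every write after the first steals the line from the other core. Moving the second variable by one line size removes all of that traffic.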
Where it sits
MESI runs in hardware on the cache controllers, invisible to software except through performance. From the program’s view, memory just works as if there were one consistent copy. The cost surfaces as coherence misses, false-sharing slowdowns, and contention on hot shared lines, which is often why a multithreaded program scales worse across cores than its single-threaded profile would suggest.
For the underlying cache structure, see Cache memory and Cache address mapping.