Lesson 2 — Reading add: factoring out the macro
Question
The real add is built on popn_top!, and its sibling macro gas! follows the same shape. Read both; understand the load-bearing details (cold_path / unwrap_unchecked / variable arity).
Principle (minimum model)
- popn_top!([op1], op2, context.interpreter) macro. Pre-checks stack length; pops N values; binds the names; uses unwrap_unchecked after the length check (the unsafe is sound).
- unwrap_unchecked justification. The length check just proved stack.len() >= N; unwrap_unchecked removes the redundant runtime check. Pre-conditioned by the macro.
- cold_path() hint. Marks an unlikely branch (e.g. underflow) so the compiler optimizes for the other branch and keeps the rare code out of the way. Free perf.
- gas!(interp, opcode_cost) macro. Pre-checks gas; emits OOG if exhausted. Three lines collapsed into one macro.
- gas! for fixed-cost opcodes vs gas!_dynamic! for variable-cost (e.g. mstore, which depends on memory expansion). Same shape; different cost model.
- Why macros, not functions. Macros expand at compile time → no function-call overhead in the hot path. EVM opcode dispatch is hot; every nanosecond counts.
- Why these specific details. Each one removes a runtime check that a competent reader can prove redundant. The macros encode the proofs.
Worked example + steps
Reading add: factoring out the macro
Open crates/interpreter/src/instructions/arithmetic.rs and you'll see add, mul, sub, div, mod, lt, gt, eq, and, or, xor — 30+ binary opcodes. Every one of them starts with the same two lines of stack-popping boilerplate. That's a refactor begging to happen, and revm did it: one macro, popn_top!, replaces those two lines everywhere.
Last lesson, you built up to the hand-written version:
pub fn add<IT: ITy, H: ?Sized>(context: Ictx<'_, H, IT>) -> Result {
    let op1 = context.interpreter.stack.pop().ok_or(StackUnderflow)?;
    let op2 = context.interpreter.stack.last_mut().ok_or(StackUnderflow)?;
    *op2 = op1.wrapping_add(*op2);
    Ok(())
}
The real source replaces those middle two lines with one macro call:
pub fn add<IT: ITy, H: ?Sized>(context: Ictx<'_, H, IT>) -> Result {
    popn_top!([op1], op2, context.interpreter);
    *op2 = op1.wrapping_add(*op2);
    Ok(())
}
This lesson is just that refactor. Why a macro, what's inside it, and why three small details inside earn their keep.
Step 1 — Why a macro at all
Every binary opcode begins with the same two lines:
let op1 = ctx.interpreter.stack.pop().ok_or(StackUnderflow)?;
let op2 = ctx.interpreter.stack.last_mut().ok_or(StackUnderflow)?;
Repeated 30+ times across the codebase. The question isn't whether to factor that — it's how.
A macro beats a function here for two reasons:
- Variable arity. Some opcodes pop 1, some pop 2, some pop 3. A macro matches [op1], [op1, op2], and [op1, op2, op3] with the same arm; a function would need popn_top1, popn_top2, popn_top3, or const-generic gymnastics.
- Direct early return. A function returning Result would force ? boilerplate at every call site. The macro emits a return Err(StackUnderflow); that returns from the opcode function directly: no ?, no Result plumbing.
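The "direct early return" point can be seen in a small sketch. Everything here is a toy stand-in rather than revm code: `Halt`, `pop_fn`, `pop_or_return!`, and the `u64` stack are hypothetical names chosen for illustration.

```rust
#[derive(Debug, PartialEq)]
enum Halt {
    StackUnderflow,
}

// Function-style helper: it can only report failure to its caller.
fn pop_fn(stack: &mut Vec<u64>) -> Result<u64, Halt> {
    stack.pop().ok_or(Halt::StackUnderflow)
}

// Macro-style helper: the `return` is expanded inside the calling
// function, so it exits the opcode function itself.
macro_rules! pop_or_return {
    ($stack:expr) => {
        match $stack.pop() {
            Some(v) => v,
            None => return Err(Halt::StackUnderflow),
        }
    };
}

fn negate_with_fn(stack: &mut Vec<u64>) -> Result<(), Halt> {
    let a = pop_fn(stack)?; // the call site must propagate with `?`
    stack.push(a.wrapping_neg());
    Ok(())
}

fn negate_with_macro(stack: &mut Vec<u64>) -> Result<(), Halt> {
    let a = pop_or_return!(stack); // no `?` at the call site
    stack.push(a.wrapping_neg());
    Ok(())
}

fn main() {
    let mut empty: Vec<u64> = Vec::new();
    assert_eq!(negate_with_fn(&mut empty), Err(Halt::StackUnderflow));
    assert_eq!(negate_with_macro(&mut empty), Err(Halt::StackUnderflow));

    let mut stack = vec![1u64];
    assert_eq!(negate_with_macro(&mut stack), Ok(()));
    assert_eq!(stack, vec![u64::MAX]);
}
```

Both versions behave the same; the difference is only where the early exit is written.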
Step 2 — A naive version of the macro
If you were writing it without thinking about the optimizer, you'd write:
macro_rules! popn_top_naive {
    ([ $($x:ident),* ], $top:ident, $interpreter:expr) => {
        $(
            let $x = $interpreter.stack.pop().ok_or(StackUnderflow)?;
        )*
        let $top = $interpreter.stack.last_mut().ok_or(StackUnderflow)?;
    };
}
Read the syntax slowly:
- $($x:ident),* matches a comma-separated list of identifiers (zero or more). With [op1], the list has one element. With [op1, op2], it has two.
- $( ... )* repeats whatever's inside per element of the list. Here it pops once per identifier.
That works. It's also slower than the real version, in two ways revm cares about.
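To try the naive macro outside revm, here is a self-contained sketch. It assumes a toy `Interp` with a `Vec<u64>` stack and a local `Halt` enum; both are hypothetical substitutes for revm's interpreter and 256-bit words. The same arm serves a two-operand and a three-operand opcode.

```rust
#[derive(Debug, PartialEq)]
enum Halt {
    StackUnderflow,
}

// Toy interpreter: only a stack of u64 words (assumption for illustration).
struct Interp {
    stack: Vec<u64>,
}

macro_rules! popn_top_naive {
    ([ $($x:ident),* ], $top:ident, $interpreter:expr) => {
        $(
            let $x = $interpreter.stack.pop().ok_or(Halt::StackUnderflow)?;
        )*
        let $top = $interpreter.stack.last_mut().ok_or(Halt::StackUnderflow)?;
    };
}

// Pops one value, rewrites the new top in place.
fn add(interp: &mut Interp) -> Result<(), Halt> {
    popn_top_naive!([op1], op2, interp);
    *op2 = op1.wrapping_add(*op2);
    Ok(())
}

// Pops two values, rewrites the new top in place (ADDMOD-like).
fn addmod(interp: &mut Interp) -> Result<(), Halt> {
    popn_top_naive!([a, b], n, interp);
    *n = if *n == 0 {
        0
    } else {
        ((a as u128 + b as u128) % (*n as u128)) as u64
    };
    Ok(())
}

fn main() {
    let mut interp = Interp { stack: vec![2, 3] };
    assert_eq!(add(&mut interp), Ok(()));
    assert_eq!(interp.stack, vec![5]);

    // The top of the stack is the last element: a = 3, b = 4, n = 5.
    let mut interp = Interp { stack: vec![5, 4, 3] };
    assert_eq!(addmod(&mut interp), Ok(()));
    assert_eq!(interp.stack, vec![2]);

    // One item is not enough for `add`; the first pop has already happened.
    let mut interp = Interp { stack: vec![1] };
    assert_eq!(add(&mut interp), Err(Halt::StackUnderflow));
    assert!(interp.stack.is_empty());
}
```

Note the last case: the naive version discovers the underflow only after it has already popped an item.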
Step 3 — Pre-check the underflow once
Calling .pop() N times means N internal bounds checks. Better: check once, up-front.
if $interpreter.stack.len() < (1 + $crate::_count!($($x)*)) {
    return Err(StackUnderflow);
}
// ... now do the pops without re-checking
_count! is a helper macro that counts the identifiers in the repetition. For [op1], the guard becomes stack.len() < 2 (one popped + one mutable-borrowed). Once that guard passes, the subsequent pops are provably safe — we just verified there are enough items.
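One common way to write such a counting helper is a recursive macro_rules definition. The sketch below is an assumption about how a counter like this could look, not revm's actual `_count!`; the names `count!` and `required_len!` are hypothetical.

```rust
// Counts space-separated identifiers by peeling one off per recursion step.
macro_rules! count {
    () => { 0usize };
    ($head:ident $($tail:ident)*) => { 1usize + count!($($tail)*) };
}

// Mirrors the guard's right-hand side: one slot for the mutable top,
// plus one per popped identifier.
macro_rules! required_len {
    ([ $($x:ident),* ]) => { 1 + count!($($x)*) };
}

fn main() {
    // The identifiers are only tokens here; they need not name variables.
    assert_eq!(count!(), 0);
    assert_eq!(count!(op1), 1);
    assert_eq!(count!(op1 op2 op3), 3);

    assert_eq!(required_len!([op1]), 2);
    assert_eq!(required_len!([op1, op2]), 3);
}
```

Because the count is built from literals at expansion time, the compiler can fold the guard's threshold into a constant.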
Step 4 — cold_path(): tell LLVM the failure branch is rare
Stack underflow is a bug, not a normal path. You don't want the rare-failure code in the hot instruction cache (the CPU's icache, where the bytes of the currently-executing function live). Cold instructions there evict the hot ones.
if $interpreter.stack.len() < (1 + $crate::_count!($($x)*)) {
    $crate::primitives::hints_util::cold_path();
    return Err(StackUnderflow);
}
The cold_path() call compiles to nothing at runtime. It's a compile-time hint to LLVM: "the code reachable through this branch is statistically rare." The optimizer responds by laying out the rare-branch code far from the hot path's machine instructions, keeping the hot path one straight line of cache-warm assembly.
Zero-cost optimization hint. That's the whole pattern.
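On stable Rust, a widely used way to express this kind of hint is an empty function marked `#[cold]`, called only from the unlikely branch. The sketch below uses that idiom; it is an assumption for illustration and may differ from how revm's `hints_util::cold_path` is actually defined. `Halt` and `add_top_two` are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
enum Halt {
    StackUnderflow,
}

// Empty function marked cold: calling it tells the optimizer that the
// surrounding branch is unlikely to be taken.
#[inline]
#[cold]
fn cold_path() {}

// Adds the two topmost words without modifying the stack.
fn add_top_two(stack: &[u64]) -> Result<u64, Halt> {
    if stack.len() < 2 {
        cold_path(); // mark the failure branch as rare
        return Err(Halt::StackUnderflow);
    }
    let n = stack.len();
    Ok(stack[n - 1].wrapping_add(stack[n - 2]))
}

fn main() {
    assert_eq!(add_top_two(&[2, 3]), Ok(5));
    assert_eq!(add_top_two(&[1]), Err(Halt::StackUnderflow));
}
```

The hint changes nothing about what the function returns; any effect is limited to how the compiler arranges the generated code.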
Step 5 — unwrap_unchecked(): cash in the guard
Now we've manually verified stack.len() >= N. But Rust's pop() returns Option<T> — so naive code would write .unwrap() (panics on None) or .ok_or(...)? (re-checks). Both repeat the work the guard already did.
The real macro instead does:
let ([$( $x ),*], $top) = unsafe {
    $crate::interpreter_types::StackTr::popn_top(&mut $interpreter.stack)
        .unwrap_unchecked()
};
unwrap_unchecked() skips the runtime Some check. It's only safe when you can prove the value is Some, and the guard we wrote in Step 3 just proved exactly that. The unsafe block is the contract: "I checked, so don't double-check." Delete the guard and the first underflowing stack hits undefined behavior instead of a clean StackUnderflow error.
The compiler can't prove the relationship between stack.len() >= N and popn_top returning Some — that's a domain invariant (we know what popn_top does), not a type invariant the type system can see. unwrap_unchecked is the seam between domain knowledge and type-system limits — how you tell the compiler "trust me, I checked."
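The guard-then-unchecked pattern is easy to reproduce on a plain `Vec`. In this minimal sketch, `pop_two` is a hypothetical helper; `Option::unwrap_unchecked` is the standard-library method the lesson refers to.

```rust
// Pops the two topmost items, or returns None and leaves the stack untouched.
fn pop_two(stack: &mut Vec<u64>) -> Option<(u64, u64)> {
    // Guard: one length check covers both pops.
    if stack.len() < 2 {
        return None;
    }
    // SAFETY: the guard above proved len >= 2, so both pops return Some.
    let a = unsafe { stack.pop().unwrap_unchecked() };
    let b = unsafe { stack.pop().unwrap_unchecked() };
    Some((a, b))
}

fn main() {
    let mut stack = vec![1, 2, 3];
    assert_eq!(pop_two(&mut stack), Some((3, 2)));
    assert_eq!(stack, vec![1]);

    // Too short: the guard rejects it before any unchecked unwrap runs.
    assert_eq!(pop_two(&mut stack), None);
    assert_eq!(stack, vec![1]);
}
```

The SAFETY comment carries the proof obligation: it names the check that makes each unchecked unwrap sound.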
Step 6 — The full popn_top!
Putting it all together:
macro_rules! popn_top {
    ([ $($x:ident),* ], $top:ident, $interpreter:expr) => {
        if $interpreter.stack.len() < (1 + $crate::_count!($($x)*)) {
            $crate::primitives::hints_util::cold_path();
            return Err($crate::InstructionResult::StackUnderflow);
        }
        let ([$( $x ),*], $top) = unsafe {
            $crate::interpreter_types::StackTr::popn_top(&mut $interpreter.stack)
                .unwrap_unchecked()
        };
    };
}
Three details, each earning its keep:
- cold_path(): keeps the rare-failure code out of the hot icache (zero-cost hint)
- unwrap_unchecked: skips the runtime check the guard already did
- The arity-N matcher: one macro for any opcode that pops N
🔍 Find in repo. Open crates/interpreter/src/instructions/macros.rs. Find popn_top!. Confirm what we just walked is what's in the file (modulo formatting).
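For experimenting without the revm crate, the whole macro can be rebuilt over toy types. Everything below is a sketch under stated assumptions: `Stack`, `Interp`, `Halt`, `count!`, and this `cold_path` are hypothetical stand-ins, and `Stack::popn_top` only imitates the shape the macro expects (an array of popped values plus a mutable reference to the new top).

```rust
#[derive(Debug, PartialEq)]
enum Halt {
    StackUnderflow,
}

struct Stack {
    data: Vec<u64>,
}

impl Stack {
    fn len(&self) -> usize {
        self.data.len()
    }

    // Pops N values and borrows the new top, or returns None untouched.
    fn popn_top<const N: usize>(&mut self) -> Option<([u64; N], &mut u64)> {
        if self.data.len() < N + 1 {
            return None;
        }
        let mut popped = [0u64; N];
        for slot in popped.iter_mut() {
            *slot = self.data.pop()?;
        }
        let top = self.data.last_mut()?;
        Some((popped, top))
    }
}

struct Interp {
    stack: Stack,
}

#[inline]
#[cold]
fn cold_path() {}

macro_rules! count {
    () => { 0usize };
    ($head:ident $($tail:ident)*) => { 1usize + count!($($tail)*) };
}

macro_rules! popn_top {
    ([ $($x:ident),* ], $top:ident, $interpreter:expr) => {
        if $interpreter.stack.len() < (1 + count!($($x)*)) {
            cold_path();
            return Err(Halt::StackUnderflow);
        }
        // SAFETY: the guard proved the stack holds enough items.
        // N is inferred from the length of the array pattern.
        let ([$( $x ),*], $top) = unsafe {
            $interpreter.stack.popn_top().unwrap_unchecked()
        };
    };
}

fn add(interp: &mut Interp) -> Result<(), Halt> {
    popn_top!([op1], op2, interp);
    *op2 = op1.wrapping_add(*op2);
    Ok(())
}

fn main() {
    let mut interp = Interp { stack: Stack { data: vec![2, 3] } };
    assert_eq!(add(&mut interp), Ok(()));
    assert_eq!(interp.stack.data, vec![5]);

    // Underflow is caught by the guard before anything is popped.
    let mut interp = Interp { stack: Stack { data: vec![1] } };
    assert_eq!(add(&mut interp), Err(Halt::StackUnderflow));
    assert_eq!(interp.stack.data, vec![1]);
}
```

Unlike a pop-then-check approach, the guarded version leaves the stack untouched when it reports underflow.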
Step 7 — gas!: the same pattern, applied elsewhere
macro_rules! gas {
    ($interpreter:expr, $gas:expr) => {
        if !$interpreter.gas.record_regular_cost($gas) {
            $crate::primitives::hints_util::cold_path();
            return Err($crate::InstructionResult::OutOfGas);
        }
    };
}
Same shape: check, cold-hint on failure, return early. Charge gas; fall off the cliff if you can't afford it. Once you've internalized popn_top!, gas! is the same pattern in five lines.
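The same shape can be exercised with a toy gas counter. This sketch assumes a hypothetical `Gas` type whose `record_cost` returns `false` and leaves the balance unchanged when the charge cannot be paid; revm's real `record_regular_cost` may behave differently in its details.

```rust
#[derive(Debug, PartialEq)]
enum Halt {
    OutOfGas,
}

struct Gas {
    remaining: u64,
}

impl Gas {
    // Deducts `cost` and returns true, or returns false without deducting.
    fn record_cost(&mut self, cost: u64) -> bool {
        match self.remaining.checked_sub(cost) {
            Some(left) => {
                self.remaining = left;
                true
            }
            None => false,
        }
    }
}

struct Interp {
    gas: Gas,
}

#[inline]
#[cold]
fn cold_path() {}

macro_rules! gas {
    ($interpreter:expr, $gas:expr) => {
        if !$interpreter.gas.record_cost($gas) {
            cold_path();
            return Err(Halt::OutOfGas);
        }
    };
}

// Charges the same cost twice; the second charge may run out of gas.
fn charge_twice(interp: &mut Interp, cost: u64) -> Result<(), Halt> {
    gas!(interp, cost);
    gas!(interp, cost);
    Ok(())
}

fn main() {
    let mut interp = Interp { gas: Gas { remaining: 10 } };
    assert_eq!(charge_twice(&mut interp, 4), Ok(()));
    assert_eq!(interp.gas.remaining, 2);

    let mut interp = Interp { gas: Gas { remaining: 5 } };
    assert_eq!(charge_twice(&mut interp, 4), Err(Halt::OutOfGas));
    assert_eq!(interp.gas.remaining, 1);
}
```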
🔍 Find in repo. Why isn't gas! called inside the body of add? Look at arithmetic.rs. Form a hypothesis. Then open interpreter.rs and find where constant-gas opcodes are charged.
Hint: add has a fixed gas cost (3 in current Ethereum). Fixed costs get paid up-front by the dispatch loop, before each opcode function runs. Only opcodes with operand-dependent costs (exp, sha3, the memory-touching ops) charge inside their bodies — you'll meet one such case in the drill.
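The idea of charging a fixed cost before the opcode body runs can be sketched as follows. This is a hypothetical layout, not revm's dispatch loop: the `(cost, handler)` pair, `step`, and the toy `Interp` are all assumptions made for illustration.

```rust
#[derive(Debug, PartialEq)]
enum Halt {
    OutOfGas,
    StackUnderflow,
}

struct Interp {
    stack: Vec<u64>,
    gas: u64,
}

type Handler = fn(&mut Interp) -> Result<(), Halt>;

// The opcode body never mentions gas.
fn add(interp: &mut Interp) -> Result<(), Halt> {
    let op1 = interp.stack.pop().ok_or(Halt::StackUnderflow)?;
    let op2 = interp.stack.last_mut().ok_or(Halt::StackUnderflow)?;
    *op2 = op1.wrapping_add(*op2);
    Ok(())
}

// Hypothetical table entry: fixed cost paired with the handler.
const ADD: (u64, Handler) = (3, add as Handler);

// One dispatch step: pay the fixed cost first, then run the handler.
fn step(interp: &mut Interp, (static_cost, handler): (u64, Handler)) -> Result<(), Halt> {
    interp.gas = interp.gas.checked_sub(static_cost).ok_or(Halt::OutOfGas)?;
    handler(interp)
}

fn main() {
    let mut interp = Interp { stack: vec![2, 3], gas: 10 };
    assert_eq!(step(&mut interp, ADD), Ok(()));
    assert_eq!(interp.gas, 7);
    assert_eq!(interp.stack, vec![5]);

    // Not enough gas: the handler never runs, so the stack is untouched.
    let mut interp = Interp { stack: vec![2, 3], gas: 2 };
    assert_eq!(step(&mut interp, ADD), Err(Halt::OutOfGas));
    assert_eq!(interp.stack, vec![2, 3]);
}
```

Keeping the fixed charge in one place means the opcode bodies carry only the logic that differs between them.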
Recall before the quiz
Without scrolling:
- Why is popn_top! a macro instead of a function? (Name one mechanical reason.)
- What does cold_path() compile to at runtime?
- Why is unwrap_unchecked not UB inside popn_top!?
- What's the structural relationship between popn_top! and gas!?
The next lesson is a quiz that gates progression. You can't nod through a quiz — engage with these recalls now if any answer is shaky.
📺 Further watching
Nh19f_2fWLc | Dragan Rakita — EVM Technical walkthrough
Summary (3 lines)
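- Real add = popn_top!([op1], op2, context.interpreter) + the wrapping addition itself. Its fixed gas cost is charged outside the body; gas! applies the same check, cold-hint, early-return shape to opcodes that charge inside their bodies.
- Load-bearing: unwrap_unchecked (post-length-check) + cold_path() (branch hint) + variable arity (single macro for many opcodes).
- Macros over functions because one matcher handles any arity and the expanded return exits the opcode directly, with no function-call overhead in the hot path. Next: quiz.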