Lesson 0 — Test gate — every app in this tier ships with passing tests

Question

Every lab in this course must ship with passing tests. No exceptions. The test gate is what turns "I read the code" into "I built it" — and the same harness pattern (forked anvil + Alloy provider + invariants) is reused across every lab.

Principle (minimum model)

The test-gate contract. Each lab has a tests/ directory with at least one passing test. CI fails if any test fails. The gate is a single line: cargo test --workspace --all-features.
Forked anvil is the default harness. Anvil::new().fork(MAINNET_RPC).fork_block_number(BLOCK).spawn() boots a deterministic anvil node forked at a pinned block. Tests run against real mainnet state without flakiness.
Pinned block numbers are non-negotiable. Without fork_block_number, the test depends on whatever mainnet looks like today, which makes it flaky. With it, the test is deterministic and reviewers can re-run it identically.
Invariants over snapshots. Don't assert specific byte values; assert that conservation laws hold (amount + unfilled == shortfall, balance >= 0). Robust against future mainnet drift.
The forked_provider_at(rpc, block) helper. Every lab imports a tiny wrapper that boots forked anvil + returns a Provider. Centralised harness; per-lab boilerplate is ~3 lines.
Why tests, not videos. Building a working thing leaves an executable artifact. The test gate proves it works on someone else's machine, not just yours.
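
The "invariants over snapshots" point can be illustrated without any chain access. The sketch below is a minimal example under stated assumptions, not the course harness: `Fill`, `cover_shortfall`, and `conserves` are hypothetical names, and the only thing taken from the text is the conservation law amount + unfilled == shortfall.

```rust
/// Hypothetical result of covering a shortfall: what was filled, what was left over.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Fill {
    pub amount: u128,
    pub unfilled: u128,
}

/// Hypothetical system under test: fill as much of `shortfall` as `available` allows.
pub fn cover_shortfall(shortfall: u128, available: u128) -> Fill {
    let amount = shortfall.min(available);
    Fill { amount, unfilled: shortfall - amount }
}

/// The invariant: filled + unfilled must equal the shortfall, whatever the state.
/// `checked_add` reports an overflow as a violated invariant instead of panicking.
pub fn conserves(shortfall: u128, fill: Fill) -> bool {
    fill.amount.checked_add(fill.unfilled) == Some(shortfall)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn conservation_holds_for_any_liquidity() {
        // No exact amounts are asserted, so the check survives a change of pinned block.
        for available in [0u128, 1, 500, 1_000, 1_000_000] {
            let fill = cover_shortfall(1_000, available);
            assert!(conserves(1_000, fill), "violated at available={}", available);
        }
    }
}
```

A snapshot-style test would instead assert `fill.amount == 400` for one specific state, and would break as soon as the underlying liquidity changed.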

Worked example + steps


You spent four tiers reading source. From here on you build. The temptation, after months of reading, is to write code, read it back, satisfy yourself it looks right, and move on. That is the failure mode this tier is engineered to prevent.

The rule for the rest of this tier: a lesson is not complete until its test suite is green. Not "I read the lesson, I built the thing, I think it works." Green tests, or you didn't ship it.

The reason: in Foundations and Intermediate you were reading code that someone else already proved correct (the Reth/Revm/Alloy maintainers run their own test suites). Reading exposes you to design choices but does not require you to defend any. The moment you write your own MEV searcher, indexer, or wallet backend, you become the proof of correctness for that code. Without tests you have no proof — only opinion.

This lesson sets the gate. The next 10 lessons each make you cross it.

What "tested" looks like, by app type

Each app in this tier has a different shape, so the tests look different. Here is the minimum bar per category:

| App | Minimum test gate |
| --- | --- |
| MEV searcher | Forked-state test: replay a real historical opportunity and assert P&L is positive. Reorg test: your bundle survives or unwinds correctly across a 1-block reorg. |
| ExEx indexer (`tidx` walk-through) | Fixture chain replay: feed a known sequence of `Notification::ChainCommitted` / `ChainReverted` and assert your derived state matches a golden reference. |
| Custom RPC endpoint | Integration test: start the node in-process, hit your new method over HTTP, assert the JSON response. Error paths covered (bad params, missing block). |
| Wallet backend | Roundtrip test: signed tx decodes back to the original. Nonce invariant: sequential calls produce sequential nonces, never gaps or duplicates. |
| EIP-7702 sponsor | Replay-protection test: same auth tuple cannot be sponsored twice. Gas accounting test: sponsor pays the right amount, user pays zero. |
| Custom cheatcode | Differential test: your Rust precompile and a reference Solidity implementation produce the same output for 1000 fuzz inputs. |
| Swap aggregator | Forked-state test: quote against a real Uniswap V3 pool at a pinned block, assert output is within ε of a known-good quote. |
| Capstone (order router) | End-to-end fork test: submit an order, watch the router split / route / land / report fills. |
| Revm validation | Differential test: for every block in a small mainnet range, your Revm trace matches a non-Revm provider's `debug_traceTransaction` output. |
| Machine payments (HTTP 402) | Integration test: a request with no payment returns 402; with a valid micropayment, returns the resource. Replay-protection test: same payment cannot satisfy two requests. |

Each row is the minimum. Real production systems layer fuzz, invariant, and chaos tests on top.
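
As one illustration of how a row turns into assertions, here is a minimal in-memory sketch of the machine-payments gate. Everything in it is an assumption made for the example: `PaidEndpoint`, `Payment`, and `Response` are hypothetical types, a payment is "valid" if it covers a fixed price, and a replayed payment is answered with 402 again. A real lab would exercise an actual HTTP server.

```rust
use std::collections::HashSet;

/// Hypothetical response model: only the two outcomes the gate cares about.
#[derive(Debug, PartialEq, Eq)]
pub enum Response {
    PaymentRequired,  // HTTP 402
    Ok(&'static str), // HTTP 200 with the resource body
}

impl Response {
    pub fn status(&self) -> u16 {
        match self {
            Response::PaymentRequired => 402,
            Response::Ok(_) => 200,
        }
    }
}

/// Hypothetical micropayment: an id plus the amount paid.
pub struct Payment {
    pub id: u64,
    pub amount: u64,
}

/// Hypothetical endpoint that remembers which payment ids were already redeemed.
pub struct PaidEndpoint {
    price: u64,
    spent: HashSet<u64>,
}

impl PaidEndpoint {
    pub fn new(price: u64) -> Self {
        PaidEndpoint { price, spent: HashSet::new() }
    }

    pub fn handle(&mut self, payment: Option<&Payment>) -> Response {
        let p = match payment {
            Some(p) => p,
            None => return Response::PaymentRequired, // no payment attached
        };
        if p.amount < self.price {
            return Response::PaymentRequired; // underpaid
        }
        // `insert` returns false when the id was already present: a replay.
        if !self.spent.insert(p.id) {
            return Response::PaymentRequired;
        }
        Response::Ok("resource")
    }
}
```

The two tests from the table map directly onto this: one call with `None`, one with a fresh payment, and a second call with the same payment that must not return the resource.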

The scaffold

Every app in this tier follows the same scaffold:

my-app/
├── Cargo.toml          # workspace
├── src/
│   └── lib.rs          # the app code
├── tests/
│   ├── integration.rs  # crosses async / RPC / DB boundaries
│   └── fixtures/       # golden test inputs (tx hashes, block numbers, expected outputs)
├── foundry.toml        # if you have a Solidity surface
└── test/
    └── *.t.sol         # forge tests for the Solidity side

For pure-Rust apps (MEV searcher, indexer, wallet backend, sponsor): just Cargo.toml + src + tests.

For apps with a Solidity surface (custom cheatcode, swap aggregator, capstone): both Rust and Foundry test suites. The Solidity side uses everything you learned in Writing Tests with Foundry (Fundamentals tier) — vm.expectRevert, vm.expectEmit, fork tests, fuzz.
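
The `fixtures/` directory in the scaffold holds golden inputs and expected outputs. A golden replay for an indexer-style app might look like the sketch below. It is illustrative only: the `Notification` enum here is a simplified, hypothetical stand-in (not the real ExEx type), and `Index` and `replay` are invented names.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for commit/revert notifications; each carries the
/// per-account balance deltas of one block.
#[derive(Debug, Clone)]
pub enum Notification {
    ChainCommitted { block: u64, transfers: Vec<(&'static str, i64)> },
    ChainReverted { block: u64, transfers: Vec<(&'static str, i64)> },
}

/// Hypothetical derived state: net balance per account, zero entries dropped
/// so that two equal states always compare equal.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct Index {
    pub balances: BTreeMap<&'static str, i64>,
}

impl Index {
    pub fn apply(&mut self, notification: &Notification) {
        // A revert undoes a commit, so it applies the same deltas with the sign flipped.
        let (transfers, sign) = match notification {
            Notification::ChainCommitted { transfers, .. } => (transfers, 1),
            Notification::ChainReverted { transfers, .. } => (transfers, -1),
        };
        for &(account, delta) in transfers {
            let updated = self.balances.get(account).copied().unwrap_or(0) + sign * delta;
            if updated == 0 {
                self.balances.remove(account);
            } else {
                self.balances.insert(account, updated);
            }
        }
    }
}

/// Feed a fixture sequence through a fresh index and return the derived state.
pub fn replay(notifications: &[Notification]) -> Index {
    let mut index = Index::default();
    for n in notifications {
        index.apply(n);
    }
    index
}
```

A test then loads a fixture sequence, calls `replay`, and compares the result with the golden state. Committing a block and reverting it should leave the index exactly where it was before the commit.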

Two patterns that recur across the tier

Pattern 1 — pinned mainnet fork

Almost every app in this tier needs to test against real chain state. The pattern:

# Cargo.toml
[dev-dependencies]
alloy = { version = "...", features = ["providers"] }
revm = "..."
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }  # required by #[tokio::test]

// tests/integration.rs
use alloy::providers::ProviderBuilder;

const PINNED_BLOCK: u64 = 18_500_000;
const FORK_RPC: &str = "https://eth.merkle.io";

#[tokio::test]
async fn searcher_finds_known_opportunity() {
    let provider = ProviderBuilder::new()
        .connect_http(FORK_RPC.parse().unwrap());

    // Build an AlloyDB-backed Revm at PINNED_BLOCK
    // Run your searcher against it
    // Assert the expected opportunity is found
    // Assert P&L matches the historical record
}

The pin is the discipline: PINNED_BLOCK is a constant in your repo. When you change it, every test reproduces against the new block. Without the pin, your tests are non-deterministic and CI is meaningless.

Pattern 2 — differential testing

When you build a Revm-based simulator (cheatcode, swap aggregator, validation app), correctness is not "my output looks right"; it is "my output matches a trusted reference for the same input." That reference is a non-Revm implementation: Geth, Erigon, or a hosted endpoint such as Alchemy that serves debug_traceTransaction.

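// Placeholders to define per lab: `our_simulator`, `alchemy_provider`,
// `HISTORICAL_TX_HASHES`, `assert_traces_equivalent`.
use alloy::providers::ext::DebugApi; // adds debug_trace_transaction to providers
use alloy::rpc::types::trace::geth::GethDebugTracingOptions;

#[tokio::test]
async fn simulator_matches_geth_debug_trace() {
    for tx_hash in HISTORICAL_TX_HASHES {
        let our_trace = our_simulator.trace(tx_hash).await;
        let geth_trace = alchemy_provider
            .debug_trace_transaction(tx_hash, GethDebugTracingOptions::default())
            .await
            .expect("reference node must serve debug_traceTransaction");
        assert_traces_equivalent(&our_trace, &geth_trace);
    }
}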

Differential testing is the gold standard for any code that re-implements consensus-defined behavior. It is the only honest answer to "are you sure?"
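
The mechanics of a differential test can be shown without a node. The sketch below compares a slow, easy-to-audit reference against a faster implementation over 1000 pseudo-random inputs. Modular exponentiation is only a stand-in for "consensus-defined behavior", and all names (`modexp_reference`, `modexp_fast`, `XorShift64`, `first_mismatch`) are hypothetical. A fixed seed keeps the run reproducible, in the same spirit as a pinned block.

```rust
/// Reference: repeated multiplication, slow but easy to audit.
/// Precondition for both implementations: `modulus >= 1`.
pub fn modexp_reference(base: u64, exp: u32, modulus: u64) -> u64 {
    let m = modulus as u128;
    let mut acc = 1 % m;
    for _ in 0..exp {
        acc = acc * (base as u128 % m) % m;
    }
    acc as u64
}

/// Implementation under test: square-and-multiply.
pub fn modexp_fast(base: u64, mut exp: u32, modulus: u64) -> u64 {
    let m = modulus as u128;
    let mut result = 1 % m;
    let mut b = base as u128 % m;
    while exp > 0 {
        if exp & 1 == 1 {
            result = result * b % m;
        }
        b = b * b % m;
        exp >>= 1;
    }
    result as u64
}

/// Tiny deterministic generator so the fuzz inputs are identical on every run.
pub struct XorShift64(u64);

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        XorShift64(seed.max(1)) // the state must never be zero
    }

    pub fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
}

/// Run `cases` random inputs through both implementations and return the
/// first input on which they disagree, if any.
pub fn first_mismatch(seed: u64, cases: usize) -> Option<(u64, u32, u64)> {
    let mut rng = XorShift64::new(seed);
    for _ in 0..cases {
        let base = rng.next_u64();
        let exp = (rng.next_u64() % 256) as u32; // keep the slow reference cheap
        let modulus = rng.next_u64().max(1);
        if modexp_fast(base, exp, modulus) != modexp_reference(base, exp, modulus) {
            return Some((base, exp, modulus));
        }
    }
    None
}
```

Returning the failing input rather than a bare `false` is a deliberate choice: the mismatch becomes a fixture you can add to `tests/fixtures/` and replay until it is fixed.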

What you ship at the end of each lesson

For every Building lesson, completion means a public-facing artifact with:

A Git repository (hosted or local, your choice)
README describing what the app does
Cargo.toml (and foundry.toml if applicable) pinning all dependencies, with Cargo.lock committed so the resolved versions are reproducible
src/ with the implementation
tests/ with the gate suite from the table above
A passing cargo test (and forge test if applicable) — locally and reproducibly
The pinned mainnet fork block (or fixture chain) recorded in the test file

If any of these is missing, the lesson is not complete. You ship the artifact and the proof together, or you do not ship.

A note on "I'll write tests later"

The most common reader objection at this point: "I'll prototype first, then add tests once the design stabilizes." This sounds reasonable. It is not.

In production EVM engineering, the test is not a verification of the code; it is the executable specification of what the code is supposed to do. Writing tests after prototyping forces you to derive the spec from the code, which means the spec is whatever the code happens to do, including its bugs. Writing tests first (or alongside) forces you to articulate the spec independently, then bend the code to it. Bugs then surface wherever the code disagrees with that independently written spec, which a spec derived from the code can never reveal.

The Reth, Revm, and Foundry maintainers all work test-first or test-alongside. There is no version of "production-quality EVM code" that comes from writing the code first and tests later. This tier holds you to the same standard.

Ready

Open the next lesson — Build a Minimal MEV Searcher in Rust — and read it through once. Then, before writing any searcher code, write the test from row 1 of the table above. Make it fail with no implementation. Then build until it passes.

That order — test first, code second — is the gate.
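
The "make it fail with no implementation" step can be sketched as follows. This is a toy model of the row-1 gate, not the searcher from the next lesson: `Outcome`, `Searcher`, `NotYetImplemented`, `FixedProfit`, and `crosses_gate` are all hypothetical names introduced for illustration.

```rust
/// Hypothetical outcome of replaying one historical opportunity.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Outcome {
    pub profit_wei: i128,
}

/// Hypothetical interface the test is written against, before any implementation exists.
pub trait Searcher {
    fn search(&self, pinned_block: u64) -> Option<Outcome>;
}

/// Step 1: the stub. It compiles, finds nothing, and so the gate stays red.
pub struct NotYetImplemented;

impl Searcher for NotYetImplemented {
    fn search(&self, _pinned_block: u64) -> Option<Outcome> {
        None
    }
}

/// Step 2 (toy stand-in for the real searcher): always reports a fixed profit.
pub struct FixedProfit(pub i128);

impl Searcher for FixedProfit {
    fn search(&self, _pinned_block: u64) -> Option<Outcome> {
        Some(Outcome { profit_wei: self.0 })
    }
}

/// The gate from row 1: an opportunity is found and its P&L is positive.
pub fn crosses_gate<S: Searcher>(searcher: &S, pinned_block: u64) -> bool {
    matches!(searcher.search(pinned_block), Some(o) if o.profit_wei > 0)
}
```

Writing `crosses_gate` and the trait first fixes what "done" means. The stub proves the test can fail, and only then does the implementation replace it.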

🧭 Where you are now in the stack: QA discipline is now the tier's gate. The 'prove with tests before shipping' standard that TigerBeetle, Cloudflare, and PostgreSQL all enforce is applied uniformly to every one of this tier's 10 apps. The next lesson starts building: MEV searcher first, with the test gate in front of the implementation.

Summary (3 lines)

Every Building lab ships with passing tests; CI gates on cargo test --workspace --all-features. The test is the proof of completion.
Forked anvil (Anvil::new().fork(...).fork_block_number(...)) + invariants is the canonical harness pattern. Pinned blocks make tests deterministic; invariants are robust over time.
forked_provider_at(rpc, block) is the shared helper; per-lab boilerplate is ~3 lines. Next lesson: Lab 1, MEV searcher.