Lesson 4 — Drill: read SenderRecoveryStage end-to-end

Question

Read SenderRecoveryStage line by line: roughly 150 lines of Rust. What you will see: Rayon par_iter, batched DB writes, and atomic checkpoint updates.

Principle (minimum model)

Structure. pub struct SenderRecoveryStage { commit_threshold: u64 }. Stateless except for the batch-size threshold.
execute method. Read the transactions that have no recovered sender yet, located via the block_meta table; batch-process them via recover_signers(...); write the results to the tx_senders table.
Rayon parallelism. transactions.par_iter().map(|tx| recover_signer(tx)): each recovery is independent, so the work is embarrassingly parallel and can use all cores.
Batched DB writes. Group all writes per batch; one transaction per batch. Reduces lock contention.
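Checkpoint. Commit the new checkpoint (the last processed block) in the same database transaction as the batch write, so the senders on disk and the recorded progress can never disagree.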
Tests use TestStageDB. In-memory DB + injected blocks; assert stage produces correct senders. ~50 lines per test.
Why this drill. SenderRecovery is the simplest non-trivial stage. Read it; see the pattern; apply it to every other stage.

Worked example + steps

Drill: read `SenderRecoveryStage` end-to-end

Reading is rehearsal. Doing is memory. This drill takes you from "I've read about staged sync" to "I have read SenderRecoveryStage line by line and answered three architectural questions about it from the source."

Setup

git clone https://github.com/paradigmxyz/reth
cd reth

You don't need to build it for Drills 1 to 4: this is a reading drill, not a compile drill. Only the optional Drill 5 compiles and runs the node.

The target file

crates/stages/stages/src/stages/sender_recovery.rs

Open it. We'll work through it in order.

Drill 1 — Find the `Stage` impl

Open the file. Find impl<Provider> Stage<Provider> for SenderRecoveryStage. The execute method is your target.

Skim the method body. Identify three sections:

Read — pull tx envelopes for blocks in the input range from MDBX
Compute — ECDSA-recover senders for each tx (this is where Rayon enters)
Write — write recovered senders back to MDBX, update checkpoint

If your sentences from the predict prompt missed the read/compute/write split, scroll back and re-read the build-up lesson's Step 1 — that shape is the entire pipeline pattern, not unique to this stage.
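
To make the read/compute/write split concrete, here is a minimal std-only sketch. It is not Reth's code: Db, Tx, the map-backed tables, and the XOR-based recover_signer are all hypothetical stand-ins (real recovery is ECDSA over secp256k1, and real storage is MDBX). Only the three-section shape is the point.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-ins for this sketch; none of these are Reth types.
type TxId = u64;
type Address = [u8; 20];

#[derive(Clone)]
struct Tx {
    id: TxId,
    payload: Vec<u8>,
}

/// Placeholder for ECDSA recovery: folds the payload into a fake 20-byte address.
fn recover_signer(tx: &Tx) -> Address {
    let mut addr = [0u8; 20];
    for (i, byte) in tx.payload.iter().enumerate() {
        addr[i % 20] ^= *byte;
    }
    addr
}

/// Toy database: a transactions table, a senders table, and a checkpoint.
#[derive(Default)]
struct Db {
    transactions: BTreeMap<TxId, Tx>,
    senders: BTreeMap<TxId, Address>,
    checkpoint: TxId,
}

/// One pass over the half-open transaction-id range `from..to`.
fn execute(db: &mut Db, from: TxId, to: TxId) {
    // Read: pull the transactions in range out of the "database".
    let txs: Vec<Tx> = db
        .transactions
        .range(from..to)
        .map(|(_, tx)| tx.clone())
        .collect();

    // Compute: a pure function per transaction, with no database access.
    let recovered: Vec<(TxId, Address)> = txs
        .iter()
        .map(|tx| (tx.id, recover_signer(tx)))
        .collect();

    // Write: store the results and advance the checkpoint together.
    db.senders.extend(recovered);
    db.checkpoint = to;
}
```

Because the compute section touches no shared state, it is the only section that can be parallelized without further thought; keep that in mind for Drill 4.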

Drill 2 — Find the batch loop

The stage doesn't process every block up to ExecInput.target at once. It batches.

🔍 Find the batch loop. Search for commit_threshold or chunk or batch in the file.

Why batch at all? Two reasons:

Memory. Holding 10M signatures' worth of envelope buffers in RAM is expensive. Batches keep the working set bounded.
Backpressure. After each batch, the stage can return done: false and let the orchestrator decide whether to commit and move on, or call again. Without batching, the stage commits everything or nothing.

The commit_threshold field on the stage struct controls the batch size. Find its default value — that's a tunable that matters in production.
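
The memory argument can be checked with a small std-only sketch. The function name process_in_batches, the u64 items, and the multiply standing in for signature recovery are all assumptions made for illustration; the real stage batches differently. What the sketch shows is that the number of results alive at once never exceeds the threshold, however long the input is.

```rust
/// Processes `items` in batches of at most `commit_threshold` and returns
/// (number of batches, largest number of results held in memory at once).
fn process_in_batches(items: &[u64], commit_threshold: usize) -> (usize, usize) {
    // `slice::chunks` panics on a chunk size of zero, so reject it up front.
    assert!(commit_threshold > 0, "commit_threshold must be non-zero");

    let mut batches = 0;
    let mut peak = 0;
    for chunk in items.chunks(commit_threshold) {
        // Only this chunk's results exist here; the multiply is a stand-in
        // for the per-transaction signature recovery.
        let results: Vec<u64> = chunk.iter().map(|x| x.wrapping_mul(31)).collect();
        peak = peak.max(results.len());
        batches += 1;
        // A real stage would write and commit `results` at this point.
        // `results` is dropped at the end of each iteration.
    }
    (batches, peak)
}
```

Ten items with a threshold of four yield three batches with a peak of four results in memory.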

Drill 3 — Find where `done: false` is returned

Search for done: false or ExecOutput { done in the method body.

🔍 Question: When does the stage return done: true instead?

Only when it has processed all blocks up to ExecInput.target, so no work remains in this range. Until then, done: false tells the orchestrator "call me again on the next batch." Once done is true, the orchestrator advances to the next stage.
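
The done protocol is easiest to see from the orchestrator's side. The sketch below is a toy under stated assumptions: ToyStage, run_to_target, and this two-field ExecOutput are hypothetical and only mirror the shape described in this lesson; Reth's real types carry more information and its pipeline does more between calls.

```rust
/// Hypothetical mirror of the lesson's output shape; not Reth's actual type.
#[derive(Debug, PartialEq)]
struct ExecOutput {
    checkpoint: u64,
    done: bool,
}

struct ToyStage {
    commit_threshold: u64,
}

impl ToyStage {
    /// Processes at most `commit_threshold` items past `checkpoint`,
    /// never going beyond `target`.
    fn execute(&self, checkpoint: u64, target: u64) -> ExecOutput {
        let end = target.min(checkpoint.saturating_add(self.commit_threshold));
        // The read, compute, and write work for `checkpoint..end` would go here.
        ExecOutput { checkpoint: end, done: end >= target }
    }
}

/// Orchestrator side: call the stage until it reports done, "committing"
/// after every call. Returns the checkpoint recorded at each commit.
fn run_to_target(stage: &ToyStage, start: u64, target: u64) -> Vec<u64> {
    // A zero threshold would never make progress, so the loop would not end.
    assert!(stage.commit_threshold > 0, "commit_threshold must be non-zero");

    let mut commits = Vec::new();
    let mut checkpoint = start;
    loop {
        let out = stage.execute(checkpoint, target);
        checkpoint = out.checkpoint;
        commits.push(checkpoint); // stand-in for committing batch + checkpoint
        if out.done {
            return commits;
        }
    }
}
```

With a threshold of 4 and a target of 10, the stage is called three times and commits at checkpoints 4, 8, and 10. A crash after the second commit loses at most one batch of work.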

Drill 4 — Find the Rayon parallelism

Search for par_iter or rayon:: in the file.

🔍 Question: Where does Rayon enter? On what data?

It's on the inner ECDSA recovery loop — usually shaped like:

chunk.par_iter()
    .map(|tx| recover_signer(tx))
    .collect::<Vec<_>>()

Each transaction's sender recovery is independent → safe to fan across cores → Rayon does the work.
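
If you want to run the fan-out shape without adding the rayon crate, the sketch below uses std::thread::scope as a stand-in for Rayon. That is a swapped-in technique, not what the stage does: Rayon uses a work-stealing thread pool, while this splits the input into fixed contiguous chunks. par_recover and the integer-mixing recover_signer are hypothetical; the point is only that a pure per-item function gives the same results in parallel as sequentially.

```rust
use std::thread;

/// Placeholder for ECDSA recovery (hypothetical): any pure per-item function works.
fn recover_signer(tx: &u64) -> u64 {
    tx.wrapping_mul(0x9E37_79B9_7F4A_7C15).rotate_left(17)
}

/// Splits `txs` into at most `workers` contiguous chunks, maps each chunk on
/// its own scoped thread, and concatenates the results in input order.
fn par_recover(txs: &[u64], workers: usize) -> Vec<u64> {
    if txs.is_empty() {
        return Vec::new();
    }
    let workers = workers.max(1);
    let chunk_len = (txs.len() + workers - 1) / workers; // ceiling division

    thread::scope(|scope| {
        // Spawn every worker first; collecting forces the lazy iterator so
        // all threads are running before the first join.
        let handles: Vec<_> = txs
            .chunks(chunk_len)
            .map(|chunk| {
                scope.spawn(move || chunk.iter().map(recover_signer).collect::<Vec<u64>>())
            })
            .collect();

        // Join in spawn order, which preserves the input order.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    })
}
```

Scoped threads can borrow txs directly because the scope guarantees every worker finishes before par_recover returns. No locks are needed: the workers share only read-only input.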

🔍 Question: How does recovery time scale as the number of signatures grows?

Slightly sub-linearly. With more transactions per block, each Rayon batch grows while the core count stays the same, so wall-clock time grows roughly in proportion to total signatures. The fixed per-batch overhead (chunking, channel coordination) is amortized over more work, though, so throughput improves slightly. Net: roughly 15–18× the time for 20× the signatures, depending on cache behavior, which works out to about 1.1–1.3× the throughput.

End-of-lesson recall

Without scrolling, in your own words:

What's the read/compute/write structure of SenderRecoveryStage::execute?
What does commit_threshold control, and why does it exist?
Why is Rayon's parallelism applied to ECDSA recovery and not (say) MDBX writes?
Why does done: false exist as a return state at all? What would break if every execute had to finish the whole range?

If any answer is shaky, the lesson isn't done with you. Re-read the relevant build-up step or re-open the file.

Drill 5 — Watch the stage execute via `tracing` (optional)

Everything above was reading. This last step is running and watching. Reth instruments every stage with tracing spans and events. Run with debug-level logging and you can see, line by line, what the stages do in a real node:

# From the reth repo root:
RUST_LOG=reth_stages=debug,reth_stages_api=debug \
  cargo run --bin reth --release -- node --dev --dev.block-time 5s

--dev boots a single-node devnet; --dev.block-time 5s mines a block every 5 s. Stage transition logs start streaming — headers, bodies, sender_recovery, execution, hashing, merkle, tx_lookup spans run in the order you read them in the build-up, for each block.

This is the real thing. The "return done: false to backpressure the orchestrator" model from the build-up shows up directly in the tracing output. The match between mental model and log output is what promotes the lesson from "I understood it" to "I understood it because I saw it work."

After this drill, you've read the same code Paradigm uses to keep Reth in sync.

Summary (3 lines)

SenderRecoveryStage = rayon par_iter + batched DB writes + atomic checkpoint. ~150 lines.
Embarrassingly parallel (per-tx signature recovery). Tests use TestStageDB.
Pattern transfers to every other stage. Next: Rust lifetimes / Arc / dyn deep dive.