Lesson 10 — Companion traits, optimizations, and real impls

Question

Database has 3 companion traits + 2 optimisations + 3 real implementations. Read each.

Principle (minimum model)

DatabaseRef (read-only). Same methods as Database but &self (immutable). Lets multiple readers share via Arc.
DatabaseCommit (writes). commit(state_changes). Production DBs may persist; tests don't.
DatabaseAsync (async). For non-blocking I/O. AlloyDB (mainnet fork) uses this.
Optimisation 1: caching. Wrap a slow DB (mainnet RPC) in a CacheDB; reads cached after first fetch.
Optimisation 2: with_block_hashes. Pre-fetch the recent block hashes (last 256) at start; avoids per-opcode round trips.
3 real impls. (1) EmptyDB (tests: returns empty for everything). (2) CacheDB<T> (production: wraps any Database). (3) AlloyDB (mainnet fork: uses Alloy Provider).
StateProviderDatabase in Reth. Bridges Reth's MDBX state to revm's Database. Production hot path.

Worked example + steps


You finished the last lesson holding a four-method Database trait that takes &mut self — and an awkward dangling problem: Arc<MyDb> (Rust's atomic reference-counted pointer, the standard way to share data across threads) only hands out &T, never &mut T. So parallel readers can't share a Database at all. Production needs that, so revm solves it with three more pieces: a read-only companion trait, a separate write-back trait, and one perf escape hatch that lives in the trait API itself. Plus three reference impls that show the same shape stretching from 50 lines to thousands.

Step 1 — `DatabaseRef`: read-only access

#[auto_impl(&, &mut, Box, Rc, Arc)]
pub trait DatabaseRef {
    type Error: DBErrorMarker;
    fn basic_ref(&self, address: Address) -> Result<Option<AccountInfo>, Self::Error>;
    fn code_by_hash_ref(&self, code_hash: B256) -> Result<Bytecode, Self::Error>;
    fn storage_ref(&self, address: Address, index: StorageKey)
        -> Result<StorageValue, Self::Error>;
    fn block_hash_ref(&self, number: u64) -> Result<B256, Self::Error>;
}

Same four methods as Database. Two differences:

&self instead of &mut self. A method cannot mutate the database unless the impl opts into interior mutability (RwLock, OnceLock, etc.).
auto_impl list is longer — &, &mut, Box, Rc, Arc (five wrappers vs. Database's two).

Why the longer list? Because a &self method asks less of its caller than a &mut self method: a shared reference is enough. Arc<T> and Rc<T> give you cheap, shareable &T but never &mut T. So DatabaseRef works through them; Database doesn't. The longer list is mechanical, not a design choice.

The pattern: need shared concurrent access? Implement DatabaseRef. Need to mutate on read, for example to fill a cache? Implement Database. Need both? Implement DatabaseRef and let a revm helper such as WrapDatabaseRef lift it into a Database.
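A minimal sketch of why the Arc entry matters, using stand-in types rather than revm's own. DbRef, Db, MapDb, and WrapRef are hypothetical names invented for this example, each trait is cut down to a single method, and addresses are plain u64 values. The blanket impl for Arc<T> is written by hand here to show roughly what an auto_impl entry amounts to.

```rust
use std::collections::HashMap;
use std::sync::Arc;
use std::thread;

// Hypothetical one-method stand-ins for DatabaseRef and Database.
trait DbRef {
    fn storage_ref(&self, address: u64, key: u64) -> Result<u64, String>;
}

trait Db {
    fn storage(&mut self, address: u64, key: u64) -> Result<u64, String>;
}

// Hand-written forwarding impl: an Arc only hands out &T, and &T is all DbRef needs.
impl<T: DbRef + ?Sized> DbRef for Arc<T> {
    fn storage_ref(&self, address: u64, key: u64) -> Result<u64, String> {
        (**self).storage_ref(address, key)
    }
}

struct MapDb {
    slots: HashMap<(u64, u64), u64>,
}

impl DbRef for MapDb {
    fn storage_ref(&self, address: u64, key: u64) -> Result<u64, String> {
        // Missing slots read as zero.
        Ok(self.slots.get(&(address, key)).copied().unwrap_or(0))
    }
}

// Adapter in the spirit of WrapDatabaseRef: any DbRef can serve as a Db.
struct WrapRef<T: DbRef>(T);

impl<T: DbRef> Db for WrapRef<T> {
    fn storage(&mut self, address: u64, key: u64) -> Result<u64, String> {
        self.0.storage_ref(address, key)
    }
}

// Each thread gets its own wrapper around a clone of the same Arc.
fn read_in_parallel(db: Arc<MapDb>, readers: usize) -> Vec<u64> {
    let handles: Vec<_> = (0..readers)
        .map(|_| {
            let mut local = WrapRef(Arc::clone(&db));
            thread::spawn(move || local.storage(1, 7).unwrap())
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn sample_db() -> Arc<MapDb> {
    let mut slots = HashMap::new();
    slots.insert((1, 7), 42);
    Arc::new(MapDb { slots })
}

fn main() {
    let values = read_in_parallel(sample_db(), 4);
    println!("{values:?}");
}
```

Every thread owns a private WrapRef, so the &mut self that Db demands never crosses a thread boundary; only the Arc underneath is shared.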

Step 2 — `DatabaseCommit`: separate write-back trait

#[auto_impl(&mut, Box)]
pub trait DatabaseCommit {
    fn commit(&mut self, changes: AddressMap<Account>);
}

A separate trait for write-back. Why?

Two reasons:

Read-only databases exist. A forked-mainnet impl reads from RPC but has no business committing — there's no real backing store to write to. Forcing it to implement commit would require a panicking stub or pollute the type with a bogus method.
Different lifecycle. Reading is per-call; committing is end-of-transaction. Splitting the trait makes that lifecycle explicit and lets the type system enforce it.

Same pattern as Rust's Read and Write in std::io (the standard library's two-trait split for streams) — mixing them into one trait would force every reader to think about writing.
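The split can be sketched with two small stand-in traits. Everything below is hypothetical and simplified for illustration: Db, Commit, ForkDb, MemDb, simulate, and apply are not revm names, and each address holds a single u64 slot. The point is only that a function's trait bound states whether it needs write-back.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins: one storage slot per address.
trait Db {
    fn storage(&mut self, address: u64) -> u64;
}

trait Commit {
    fn commit(&mut self, changes: HashMap<u64, u64>);
}

// Read-only backend standing in for a remote fork. It implements Db only.
struct ForkDb;

impl Db for ForkDb {
    fn storage(&mut self, address: u64) -> u64 {
        address * 10 // placeholder for a value fetched over RPC
    }
}

// In-memory backend that can also accept write-back.
#[derive(Default)]
struct MemDb {
    slots: HashMap<u64, u64>,
}

impl Db for MemDb {
    fn storage(&mut self, address: u64) -> u64 {
        self.slots.get(&address).copied().unwrap_or(0)
    }
}

impl Commit for MemDb {
    fn commit(&mut self, changes: HashMap<u64, u64>) {
        self.slots.extend(changes);
    }
}

// Simulation only reads, so any Db is accepted.
fn simulate<D: Db>(db: &mut D, address: u64) -> u64 {
    db.storage(address) + 1
}

// Applying a result needs write-back, and the bound says so.
fn apply<D: Db + Commit>(db: &mut D, address: u64) -> u64 {
    let next = simulate(db, address);
    db.commit(HashMap::from([(address, next)]));
    next
}

fn main() {
    let mut fork = ForkDb;
    println!("simulated on fork: {}", simulate(&mut fork, 3));
    // apply(&mut fork, 3); // rejected at compile time: ForkDb has no Commit impl

    let mut mem = MemDb::default();
    apply(&mut mem, 3);
    apply(&mut mem, 3);
    println!("committed value: {}", mem.storage(3));
}
```

ForkDb never needs a panicking commit stub, because nothing that requires Commit will accept it in the first place.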

Step 3 — `storage_by_account_id` (the optimization)

Database has one more method we didn't show last lesson:

#[inline]
fn storage_by_account_id(
    &mut self,
    address: Address,
    account_id: AccountId,
    storage_key: StorageKey,
) -> Result<StorageValue, Self::Error> {
    let _ = account_id;
    self.storage(address, storage_key)
}

Note: it has a default implementation that ignores account_id and forwards to storage. That default is the key feature.

Who overrides it? Impls with internal account indexing, e.g. MDBX-backed Reth, where the account has already been resolved to an internal numeric ID earlier in the call frame. Passing account_id skips a redundant address-to-account-ID lookup on each storage hit. The default forwards safely; impls that can go faster override.

Performance lives in the trait API, not just the implementation. A naive impl (in-memory) takes the default and runs fine. A production impl (MDBX) overrides and gets paid back for the work.
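A toy version of the same idea, assuming a two-level layout that is invented for this example: IndexedDb, SimpleDb, and the address_lookups counter are hypothetical, and account IDs are plain usize indexes into a Vec. It shows a naive impl inheriting the default while an indexed impl overrides it to skip the address lookup.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for Database, reduced to the two storage methods.
trait Db {
    fn storage(&mut self, address: u64, key: u64) -> u64;

    // Default: ignore the id and fall back to the address-based path.
    fn storage_by_account_id(&mut self, address: u64, account_id: usize, key: u64) -> u64 {
        let _ = account_id;
        self.storage(address, key)
    }
}

// Naive impl: keeps the default and never looks at account_id.
struct SimpleDb {
    slots: HashMap<(u64, u64), u64>,
}

impl Db for SimpleDb {
    fn storage(&mut self, address: u64, key: u64) -> u64 {
        self.slots.get(&(address, key)).copied().unwrap_or(0)
    }
}

// Indexed impl: addresses resolve to an internal id, storage is keyed by that id.
struct IndexedDb {
    ids: HashMap<u64, usize>,
    accounts: Vec<HashMap<u64, u64>>,
    address_lookups: usize, // counts address-to-id resolutions
}

impl Db for IndexedDb {
    fn storage(&mut self, address: u64, key: u64) -> u64 {
        self.address_lookups += 1;
        match self.ids.get(&address) {
            Some(&id) => self.accounts[id].get(&key).copied().unwrap_or(0),
            None => 0,
        }
    }

    // Override: the caller already holds the id, so go straight to the slot.
    fn storage_by_account_id(&mut self, _address: u64, account_id: usize, key: u64) -> u64 {
        self.accounts
            .get(account_id)
            .and_then(|slots| slots.get(&key))
            .copied()
            .unwrap_or(0)
    }
}

fn indexed_sample() -> IndexedDb {
    IndexedDb {
        ids: HashMap::from([(0xAA, 0)]),
        accounts: vec![HashMap::from([(1, 99)])],
        address_lookups: 0,
    }
}

fn main() {
    let mut db = indexed_sample();
    let by_address = db.storage(0xAA, 1);
    let by_id = db.storage_by_account_id(0xAA, 0, 1);
    println!("{by_address} {by_id} lookups={}", db.address_lookups);
}
```

Both calls return the same value, but only the address-based one pays for the resolution step.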

Step 4 — Three real implementations to skim

Same trait, three radically different backends:

Impl	Where	Backing	Lines
`InMemoryDB`	`crates/database/src/in_memory_db.rs`	`HashMap`s	~50
`AlloyDB`	`crates/database/src/alloydb.rs`	JSON-RPC over the network	~150
`StateProviderDatabase`	reth: `crates/storage/storage-api/src/database_provider.rs`	MDBX, sparse Merkle	thousands

🔍 Read all three openings. Just the type definitions and the first method (basic). Compare:

InMemoryDB::basic — direct HashMap::get, infallible

AlloyDB::basic — async RPC call wrapped in a sync façade, fallible

StateProviderDatabase::basic — MDBX cursor lookup, fallible

Three different worlds, one trait shape.

AlloyDB. It fetches state lazily over RPC, so there is no need to sync a full archive node. The first time your tx hits a slot or account, AlloyDB queries the upstream node; wrap it in a CacheDB and subsequent reads come from memory instead of the network. The fork-mainnet pattern is roughly 150 lines of glue around Database.
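The lazy-fetch-then-cache behaviour can be sketched without any network at all. RemoteDb and Cached below are hypothetical stand-ins, not AlloyDB or CacheDB: the "remote" computes a fake value and counts how often it is asked, and the wrapper remembers every slot it has already seen.

```rust
use std::collections::HashMap;

// Hypothetical one-method stand-in for Database.
trait Db {
    fn storage(&mut self, address: u64, key: u64) -> Result<u64, String>;
}

// Pretend remote backend: each call counts as one round trip.
struct RemoteDb {
    fetches: usize,
}

impl Db for RemoteDb {
    fn storage(&mut self, address: u64, key: u64) -> Result<u64, String> {
        self.fetches += 1;
        // Placeholder for a value returned by an upstream node.
        Ok(address.wrapping_mul(31).wrapping_add(key))
    }
}

// Caching wrapper: the first read of a slot goes to `inner`, repeats are served locally.
struct Cached<D: Db> {
    inner: D,
    slots: HashMap<(u64, u64), u64>,
}

impl<D: Db> Cached<D> {
    fn new(inner: D) -> Self {
        Cached { inner, slots: HashMap::new() }
    }
}

impl<D: Db> Db for Cached<D> {
    fn storage(&mut self, address: u64, key: u64) -> Result<u64, String> {
        if let Some(&value) = self.slots.get(&(address, key)) {
            return Ok(value); // cache hit: no round trip
        }
        let value = self.inner.storage(address, key)?; // errors are not cached
        self.slots.insert((address, key), value);
        Ok(value)
    }
}

fn main() {
    let mut db = Cached::new(RemoteDb { fetches: 0 });
    let first = db.storage(2, 5).unwrap();
    let second = db.storage(2, 5).unwrap();
    println!("{first} {second} fetches={}", db.inner.fetches);
}
```

The wrapper needs &mut self to fill its map, which is one reason a cache fits the Database side of the split rather than the DatabaseRef side.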

Recall before the quiz

Without scrolling:

Why does DatabaseRef's auto_impl list include Rc and Arc while Database's doesn't?
Why is commit on a separate trait from Database?
What does overriding storage_by_account_id actually save in the MDBX impl?
Among InMemoryDB, AlloyDB, StateProviderDatabase — which would you pick to fork mainnet?

The next lesson is a quiz. Engage with these recalls now if any answer is shaky.

Summary (3 lines)

3 companions: DatabaseRef (read-only) + DatabaseCommit (writes) + DatabaseAsync (non-blocking I/O).
2 optimisations: CacheDB wrapper + with_block_hashes pre-fetch.
3 real impls: EmptyDB / CacheDB / AlloyDB. Reth uses StateProviderDatabase. Next: quiz.