Lesson 10 — Companion traits, optimizations, and real impls
Question
Database has 3 companion traits + 2 optimisations + 3 real implementations. Read each.
Principle (minimum model)
- DatabaseRef (read-only). Same methods as Database but &self (immutable). Lets multiple readers share via Arc.
- DatabaseCommit (writes). commit(state_changes). Production DBs may persist; tests don't.
- DatabaseAsync (async). For non-blocking I/O. AlloyDB (mainnet fork) uses this.
- Optimisation 1: caching. Wrap a slow DB (mainnet RPC) in a CacheDB; reads are cached after the first fetch.
- Optimisation 2: with_block_hashes. Pre-fetch the recent block hashes (last 256) at start; avoids per-opcode round trips.
- 3 real impls. (1) EmptyDB (tests: returns empty for everything). (2) CacheDB<T> (production: wraps any Database). (3) AlloyDB (mainnet fork: uses Alloy Provider).
- StateProviderDatabase in Reth. Bridges Reth's MDBX state to revm's Database. Production hot path.
Worked example + steps
You finished the last lesson holding a four-method Database trait that takes &mut self — and an awkward dangling problem: Arc<MyDb> (Rust's atomic reference-counted pointer, the standard way to share data across threads) only hands out &T, never &mut T. So parallel readers can't share a Database at all. Production needs that, so revm solves it with three more pieces: a read-only companion trait, a separate write-back trait, and one perf escape hatch that lives in the trait API itself. Plus three reference impls that show the same shape stretching from 50 lines to thousands.
Step 1 — DatabaseRef: read-only access
#[auto_impl(&, &mut, Box, Rc, Arc)]
pub trait DatabaseRef {
    type Error: DBErrorMarker;

    fn basic_ref(&self, address: Address) -> Result<Option<AccountInfo>, Self::Error>;
    fn code_by_hash_ref(&self, code_hash: B256) -> Result<Bytecode, Self::Error>;
    fn storage_ref(&self, address: Address, index: StorageKey)
        -> Result<StorageValue, Self::Error>;
    fn block_hash_ref(&self, number: u64) -> Result<B256, Self::Error>;
}
Same four methods as Database. Two differences:
- &self instead of &mut self. No mutation through the reference, unless the impl opts into interior mutability (RwLock, OnceLock, etc.).
- The auto_impl list is longer: &, &mut, Box, Rc, Arc (five wrappers vs. Database's two).
The list is longer because &self access is strictly less restrictive than &mut self. Arc<T> and Rc<T> give you cheap, shareable &T but never &mut T, so DatabaseRef works through them and Database doesn't. The longer list is mechanical, not a design choice.
The pattern: need shared concurrent access? Implement DatabaseRef. Need caching? Implement Database. Need both? Implement both — revm has helpers like WrapDatabaseRef to lift one to the other.
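A minimal sketch of that pattern, assuming heavily simplified stand-ins: `Address`, `Balance`, `MapDb`, `WrapRef`, and `parallel_reads` are hypothetical names, the traits are cut down to one infallible method, and the `Arc` forwarding impl is written by hand where revm relies on `auto_impl`. It only illustrates why a `&self` trait can be shared across threads and then lifted back into a `&mut self` trait per thread.

```rust
use std::collections::HashMap;
use std::sync::Arc;
use std::thread;

// Hypothetical stand-ins for revm's types, kept tiny so the sketch compiles alone.
type Address = u64;
type Balance = u128;

/// Read-only view: every method takes `&self`.
trait DatabaseRef {
    fn basic_ref(&self, address: Address) -> Option<Balance>;
}

/// Mutable view: methods take `&mut self`, so an impl may cache as it reads.
trait Database {
    fn basic(&mut self, address: Address) -> Option<Balance>;
}

struct MapDb {
    balances: HashMap<Address, Balance>,
}

impl DatabaseRef for MapDb {
    fn basic_ref(&self, address: Address) -> Option<Balance> {
        self.balances.get(&address).copied()
    }
}

/// Forwarding impl through `Arc`, written by hand for this sketch.
impl<T: DatabaseRef> DatabaseRef for Arc<T> {
    fn basic_ref(&self, address: Address) -> Option<Balance> {
        (**self).basic_ref(address)
    }
}

/// Lifts any `DatabaseRef` into a `Database` (the idea behind `WrapDatabaseRef`).
struct WrapRef<T>(T);

impl<T: DatabaseRef> Database for WrapRef<T> {
    fn basic(&mut self, address: Address) -> Option<Balance> {
        self.0.basic_ref(address)
    }
}

/// Spawns `readers` threads that all read address 1 through one shared `Arc`.
fn parallel_reads(readers: usize) -> Vec<Option<Balance>> {
    let db = Arc::new(MapDb { balances: HashMap::from([(1, 100)]) });
    let handles: Vec<_> = (0..readers)
        .map(|_| {
            let db = Arc::clone(&db);
            thread::spawn(move || {
                // Each thread owns its own `Database` wrapper around the shared handle.
                let mut local = WrapRef(db);
                local.basic(1)
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

Calling `parallel_reads(4)` from a `main` of your own runs four readers against one `MapDb`. The wrapper is per-thread and cheap; the data behind the `Arc` is never borrowed mutably.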
Step 2 — DatabaseCommit: separate write-back trait
#[auto_impl(&mut, Box)]
pub trait DatabaseCommit {
    fn commit(&mut self, changes: AddressMap<Account>);
}
A separate trait for write-back. Why?
Two reasons:
- Read-only databases exist. A forked-mainnet impl reads from RPC but has no business committing: there's no real backing store to write to. Forcing it to implement commit would require a panicking stub or pollute the type with a bogus method.
- Different lifecycle. Reading is per-call; committing is end-of-transaction. Splitting the trait makes that lifecycle explicit and lets the type system enforce it.
Same pattern as Rust's Read and Write in std::io (the standard library's two-trait split for streams) — mixing them into one trait would force every reader to think about writing.
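The split can be sketched with toy types. Everything below is an assumption made for illustration: `MemDb`, `ForkDb`, `simulate`, and `execute_and_commit` are hypothetical names, and `commit` takes a plain `HashMap` of balances rather than revm's account map. The point is only that trait bounds decide which databases may be written to.

```rust
use std::collections::HashMap;

// Hypothetical simplified types for this sketch.
type Address = u64;
type Balance = u128;

trait Database {
    fn basic(&mut self, address: Address) -> Option<Balance>;
}

trait DatabaseCommit {
    fn commit(&mut self, changes: HashMap<Address, Balance>);
}

/// Writable in-memory store: implements both traits.
#[derive(Default)]
struct MemDb {
    balances: HashMap<Address, Balance>,
}

impl Database for MemDb {
    fn basic(&mut self, address: Address) -> Option<Balance> {
        self.balances.get(&address).copied()
    }
}

impl DatabaseCommit for MemDb {
    fn commit(&mut self, changes: HashMap<Address, Balance>) {
        self.balances.extend(changes);
    }
}

/// Read-only fork stand-in: implements `Database` only, so there is no `commit` to stub out.
struct ForkDb;

impl Database for ForkDb {
    fn basic(&mut self, _address: Address) -> Option<Balance> {
        Some(1) // pretend this value came from an RPC call
    }
}

/// Simulation needs reads only, so any `Database` works.
fn simulate<DB: Database>(db: &mut DB, who: Address) -> Balance {
    db.basic(who).unwrap_or(0) + 10
}

/// Execution that persists its result demands both bounds.
fn execute_and_commit<DB: Database + DatabaseCommit>(db: &mut DB, who: Address) -> Balance {
    let new_balance = simulate(db, who);
    db.commit(HashMap::from([(who, new_balance)]));
    new_balance
}
```

With these definitions, `simulate(&mut ForkDb, 7)` compiles, while `execute_and_commit(&mut ForkDb, 7)` is rejected at compile time because `ForkDb` does not implement `DatabaseCommit`. No panicking stub is needed.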
Step 3 — storage_by_account_id (the optimization)
Database has one more method we didn't show last lesson:
#[inline]
fn storage_by_account_id(
    &mut self,
    address: Address,
    account_id: AccountId,
    storage_key: StorageKey,
) -> Result<StorageValue, Self::Error> {
    let _ = account_id;
    self.storage(address, storage_key)
}
Note: it has a default implementation that ignores account_id and forwards to storage. That default is the key feature.
The override exists for impls with internal account indexing, e.g. MDBX-backed Reth, where the account has already been resolved to an internal numeric ID earlier in the call frame. Passing account_id skips a redundant address-to-account-ID lookup on each storage hit. The default forwards safely; impls that can go faster override it.
Performance lives in the trait API, not just the implementation. A naive impl (in-memory) takes the default and runs fine. A production impl (MDBX) overrides and gets paid back for the work.
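The default-plus-override shape can be sketched as follows, under loud assumptions: the trait is reduced to infallible methods, and `NaiveDb`, `IndexedDb`, `insert`, and the `address_lookups` counter are hypothetical names invented for this example. `IndexedDb` is a toy model of "accounts resolved to numeric IDs", not how Reth stores state.

```rust
use std::collections::HashMap;

// Hypothetical simplified types for this sketch.
type Address = u64;
type AccountId = usize;
type StorageKey = u64;
type StorageValue = u64;

trait Database {
    fn storage(&mut self, address: Address, key: StorageKey) -> StorageValue;

    /// Default: ignore the hint and take the address-keyed path.
    fn storage_by_account_id(
        &mut self,
        address: Address,
        account_id: AccountId,
        key: StorageKey,
    ) -> StorageValue {
        let _ = account_id;
        self.storage(address, key)
    }
}

/// Naive impl keyed by (address, slot); it keeps the default method.
#[derive(Default)]
struct NaiveDb {
    slots: HashMap<(Address, StorageKey), StorageValue>,
}

impl Database for NaiveDb {
    fn storage(&mut self, address: Address, key: StorageKey) -> StorageValue {
        self.slots.get(&(address, key)).copied().unwrap_or(0)
    }
}

/// Indexed impl: accounts live in a Vec, and an index maps address to position.
#[derive(Default)]
struct IndexedDb {
    index: HashMap<Address, AccountId>,
    accounts: Vec<HashMap<StorageKey, StorageValue>>,
    address_lookups: usize, // counts how often the address index is consulted
}

impl IndexedDb {
    /// Stores a slot and returns the account's numeric ID.
    fn insert(&mut self, address: Address, key: StorageKey, value: StorageValue) -> AccountId {
        let next = self.accounts.len();
        let id = *self.index.entry(address).or_insert(next);
        if id == next {
            self.accounts.push(HashMap::new()); // first time this address is seen
        }
        self.accounts[id].insert(key, value);
        id
    }
}

impl Database for IndexedDb {
    fn storage(&mut self, address: Address, key: StorageKey) -> StorageValue {
        self.address_lookups += 1;
        match self.index.get(&address) {
            Some(&id) => self.accounts[id].get(&key).copied().unwrap_or(0),
            None => 0,
        }
    }

    /// Override: the caller already knows the ID, so skip the address index.
    fn storage_by_account_id(
        &mut self,
        _address: Address,
        account_id: AccountId,
        key: StorageKey,
    ) -> StorageValue {
        self.accounts
            .get(account_id)
            .and_then(|slots| slots.get(&key))
            .copied()
            .unwrap_or(0)
    }
}
```

Callers use the same method on both types. `NaiveDb` silently falls back to `storage`, while `IndexedDb` answers without touching its address index, which the `address_lookups` counter makes visible.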
Step 4 — Three real implementations to skim
Same trait, three radically different backends:
| Impl | Where | Backing | Lines |
|---|---|---|---|
| InMemoryDB | crates/database/src/in_memory_db.rs | HashMaps | ~50 |
| AlloyDB | crates/database/src/alloydb.rs | JSON-RPC over the network | ~150 |
| StateProviderDatabase | reth: crates/storage/storage-api/src/database_provider.rs | MDBX, sparse Merkle | thousands |
🔍 Read all three openings. Just the type definitions and the first method (basic). Compare:
- InMemoryDB::basic: direct HashMap::get, infallible
- AlloyDB::basic: async RPC call wrapped in a sync façade, fallible
- StateProviderDatabase::basic: MDBX cursor lookup, fallible
Three different worlds, one trait shape.
To fork mainnet, pick AlloyDB. It fetches state lazily over RPC, so there is no need to download a full archive node. The first time your tx hits a slot or account, AlloyDB queries the upstream node; wrapped in a CacheDB, subsequent reads come from the in-memory cache. The fork-mainnet pattern is roughly 150 lines of glue around Database.
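The fetch-once-then-cache behaviour can be sketched with a toy wrapper. This is not revm's CacheDB or AlloyDB: `SlowUpstream`, `CachingDb`, and the `calls` counter are hypothetical, the traits are reduced to one infallible method, and the "RPC" is a local function that fabricates a balance from the address.

```rust
use std::cell::Cell;
use std::collections::HashMap;

// Hypothetical simplified types for this sketch.
type Address = u64;
type Balance = u128;

trait DatabaseRef {
    fn basic_ref(&self, address: Address) -> Option<Balance>;
}

trait Database {
    fn basic(&mut self, address: Address) -> Option<Balance>;
}

/// Stand-in for a remote node; counts how many "RPC calls" it has served.
struct SlowUpstream {
    calls: Cell<usize>,
}

impl DatabaseRef for SlowUpstream {
    fn basic_ref(&self, address: Address) -> Option<Balance> {
        self.calls.set(self.calls.get() + 1);
        Some(address as Balance * 1_000) // made-up balance, derived from the address
    }
}

/// Caching layer: remembers every answer, including "account does not exist".
struct CachingDb<U> {
    upstream: U,
    accounts: HashMap<Address, Option<Balance>>,
}

impl<U: DatabaseRef> CachingDb<U> {
    fn new(upstream: U) -> Self {
        Self { upstream, accounts: HashMap::new() }
    }
}

impl<U: DatabaseRef> Database for CachingDb<U> {
    fn basic(&mut self, address: Address) -> Option<Balance> {
        // Cache hit: answer locally, no upstream traffic.
        if let Some(hit) = self.accounts.get(&address) {
            return *hit;
        }
        // Cache miss: ask upstream once and remember the result.
        let fetched = self.upstream.basic_ref(address);
        self.accounts.insert(address, fetched);
        fetched
    }
}
```

Note how the two traits divide the work: the upstream only needs `&self`, while the cache needs `&mut self` because every miss writes to its map. That is the "need caching? implement Database" rule from Step 1 in miniature.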
Recall before the quiz
Without scrolling:
- Why does DatabaseRef's auto_impl list include Rc and Arc while Database's doesn't?
- Why is commit on a separate trait from Database?
- What does overriding storage_by_account_id actually save in the MDBX impl?
- Among InMemoryDB, AlloyDB, StateProviderDatabase, which would you pick to fork mainnet?
The next lesson is a quiz. Engage with these recalls now if any answer is shaky.
Summary (3 lines)
- 3 companions: DatabaseRef (read-only) + DatabaseCommit (writes) + DatabaseAsync (non-blocking I/O).
- 2 optimisations: CacheDB wrapper + with_block_hashes pre-fetch.
- 3 real impls: EmptyDB / CacheDB / AlloyDB. Reth uses StateProviderDatabase. Next: quiz.