Lesson 9 — OpenHlNode and the first start_engine call
Question
OpenHlNode is the top-level type that wires everything together: ConsensusBridge + Context + SigningProvider + Codec. start_engine is the first call that actually drives Malachite + the bridge end-to-end.
Principle (minimum model)
- OpenHlNode struct. Holds bridge (Arc), signing provider, codec, context. Cloned per-task.
- start_engine. Spawns Malachite's event loop; passes references to all the components. Returns a handle.
- Handle type. Lets the caller .await on Malachite's lifecycle; gracefully shut down; query state.
- Channels. Tokio mpsc + oneshot. Malachite needs an inbox (for received messages) + an outbox (for messages to send).
- Validator key. Loaded at startup; private key stays in process memory; public key registered with ValidatorSet.
- End-to-end test. Boot OpenHlNode with a single validator; assert it produces blocks; assert the bridge's state advances.
- Single-validator BFT. Specially handled: with one validator, no votes needed; proposals immediately commit. Lets us test the pipeline before adding multi-validator complexity.
Worked example + steps
Goal
Concepts you'll grasp in this lesson:
- Node as handshake interface, not runtime: OpenHlNode holds long-lived configuration (key, validator set, home dir, moniker) and constructs the engine. The actual running actor system lives in OpenHlNodeHandle, returned from start(). Construction and execution are different lifecycle stages, in different types.
- The actor-system spawn surface: what start_engine actually does (spawns ractor cells, binds libp2p, allocates a Channels<OpenHlContext>), why it returns an EngineHandle, and how OpenHlNodeHandle wraps it to satisfy the NodeHandle<OpenHlContext> trait.
- Mutex<Option<Channels>> take-once semantics: why the channel handle is takeable exactly once. The app loop (Lesson 10) consumes the channels; subsequent calls return None, a clean signal that ownership has transferred.
- Centralized address derivation: SHA-256(pubkey)[12..32] lives in one place (get_address), and a test asserts it matches the helper used in Lesson 6's runner. Centralization + a verification test prevents silent drift across files.
- Type-safe placeholders over todo!(): run() returns Err("not yet implemented (Lesson 10)") instead of panicking. Code that calls it fails gracefully with a pointer to the next lesson, surviving across PRs and stale tabs.
- Why the smoke test is necessary: Lesson 8's compile-time assert_impl_all! proved the codec satisfies the trait. The smoke test proves the runtime path (spawn, channel allocation, libp2p binding, kill propagation) actually works end-to-end. Types are necessary but not sufficient.
Verification:
cargo test -p openhl-consensus
…passes 20 tests (16 from Lesson 8 + 4 new ones for the Node impl). The capstone test:
test node::tests::start_engine_smoke_spawns_and_kills ... ok
…spawns the full Malachite actor system against your code, asserts the channel handle is available exactly once, and tears the actor system down cleanly — in about 0.02 seconds. After this lesson, the engine boots; the only thing missing is the application loop that consumes from Channels<OpenHlContext> and drives the bridge.
Specific changes:
- Dependency changes in crates/consensus/Cargo.toml, including the new informalsystems-malachitebft-app-channel (full list in Step 1).
- crates/consensus/src/node.rs: new file (~310 lines) with OpenHlNode, OpenHlConfig, OpenHlGenesis, OpenHlPrivateKeyFile, OpenHlNodeHandle, the impl Node for OpenHlNode (6 associated types, 11 methods), and 4 unit tests (private-key round-trip, config defaults, address derivation, start_engine smoke).
- crates/consensus/src/lib.rs: wires pub mod node;.
Recap
After Lesson 8 your openhl-consensus crate has:
crates/consensus/src/lib.rs — pub mod bridge, codec, context, signing, signing_provider, types
crates/consensus/src/codec.rs — OpenHlCodec (1 real + 7 stub Codec impls, 2 tests)
crates/consensus/src/signing_provider.rs — SigningProvider<OpenHlContext>
crates/consensus/src/context.rs — Context<OpenHlContext>
crates/consensus/src/types/ — 7 type files
cargo test -p openhl-consensus passes 16 tests. You've satisfied every trait bound start_engine requires at the type level, but you can't actually call it yet — there's no Node impl, no config, no genesis, no private key file, no node handle.
Plan
Six things:
- Update crates/consensus/Cargo.toml: add informalsystems-malachitebft-app-channel and informalsystems-malachitebft-config, enable the serde feature on signing-ed25519, add serde and tokio as runtime deps (not just dev), and add tempfile as a dev-dep.
- Create crates/consensus/src/node.rs with: OpenHlConfig (impl NodeConfig), OpenHlGenesis (unit struct), OpenHlPrivateKeyFile (wire wrapper), OpenHlNodeHandle (returned from start()), OpenHlNode (the main struct), and impl Node for OpenHlNode with 6 associated types and 11 methods.
- Wire pub mod node; into lib.rs.
- Add 4 unit tests to node.rs.
- Run cargo test -p openhl-consensus: 20 tests pass.
- Stare at start_engine_smoke_spawns_and_kills passing in 0.02 seconds. This is the moment your code becomes a running BFT engine.
This lesson teaches the bridge pattern between your code and Malachite. The engine, written by someone else and generic over your Context and Codec, needs six things to spawn: an instance of your context, an instance of your node (to get config, signing, address derivation), a config value, a codec value (passed twice, once for the WAL and once for the network), an initial height, and a validator set. The Node trait is the handshake interface that lets Malachite ask your code for those things uniformly. Once you implement it, start_engine works for any chain that follows the same handshake.
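The handshake idea can be sketched with a toy trait. This is not Malachite's Node trait: MiniNode, boot_engine, ToyChain, and ToyConfig are hypothetical names invented for illustration, and the sketch uses only the standard library. It shows a generic engine function pulling what it needs through trait methods rather than through struct fields.

```rust
/// Hypothetical stand-in for a handshake trait (not Malachite's `Node`).
trait MiniNode {
    type Config;
    fn load_config(&self) -> Result<Self::Config, String>;
    fn moniker(&self) -> String;
}

/// Hypothetical engine entry point, generic over any `MiniNode`.
/// It only talks to the node through the trait, never through its fields.
fn boot_engine<N: MiniNode>(node: &N, initial_height: u64) -> Result<String, String>
where
    N::Config: std::fmt::Debug,
{
    let cfg = node.load_config()?;
    Ok(format!(
        "{} booted at height {} with {:?}",
        node.moniker(),
        initial_height,
        cfg
    ))
}

#[derive(Debug)]
struct ToyConfig {
    listen_port: u16,
}

struct ToyChain {
    name: String,
}

impl MiniNode for ToyChain {
    type Config = ToyConfig;

    fn load_config(&self) -> Result<ToyConfig, String> {
        // Port 0 mirrors the "ephemeral port" idea used later in the lesson.
        Ok(ToyConfig { listen_port: 0 })
    }

    fn moniker(&self) -> String {
        self.name.clone()
    }
}
```

Any other chain type that implements MiniNode could be passed to boot_engine unchanged, which is the property the real Node trait gives start_engine.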
Walk-through
Step 1: Update crates/consensus/Cargo.toml
Open crates/consensus/Cargo.toml. The current [dependencies] section (after Lesson 8) looks like:
[dependencies]
openhl-types = { workspace = true }
async-trait = { workspace = true }
thiserror = { workspace = true }
eyre = { workspace = true }
informalsystems-malachitebft-core-types = { workspace = true }
informalsystems-malachitebft-core-driver = { workspace = true }
informalsystems-malachitebft-core-consensus = { workspace = true }
informalsystems-malachitebft-app = { workspace = true }
informalsystems-malachitebft-signing-ed25519 = { workspace = true, features = ["rand"] }
bytes = "1"
rand = "0.8"
sha2 = "0.10"
[dev-dependencies]
tokio = { workspace = true }
Replace it with:
[dependencies]
openhl-types = { workspace = true }
async-trait = { workspace = true }
thiserror = { workspace = true }
eyre = { workspace = true }
informalsystems-malachitebft-core-types = { workspace = true }
informalsystems-malachitebft-core-driver = { workspace = true }
informalsystems-malachitebft-core-consensus = { workspace = true }
informalsystems-malachitebft-app = { workspace = true }
informalsystems-malachitebft-app-channel = { workspace = true }
informalsystems-malachitebft-config = { workspace = true }
informalsystems-malachitebft-signing-ed25519 = { workspace = true, features = ["rand", "serde"] }
bytes = "1"
rand = "0.8"
sha2 = "0.10"
serde = { workspace = true }
tokio = { workspace = true }
[dev-dependencies]
tokio = { workspace = true }
tempfile = "3"
[lints]
workspace = true
What each new dep is for:
- informalsystems-malachitebft-app-channel: provides start_engine(), the function we're about to call, plus the Channels<Ctx> type returned to communicate with the engine.
- informalsystems-malachitebft-config: the ConsensusConfig, ValueSyncConfig, and ValuePayload types we'll embed in OpenHlConfig.
- serde feature on signing-ed25519: lets us derive Serialize/Deserialize on OpenHlPrivateKeyFile, which needs the PrivateKey newtype to be serializable.
- serde (runtime dep): used by OpenHlConfig, OpenHlGenesis, OpenHlPrivateKeyFile for #[derive(Serialize, Deserialize)].
- tokio moved from dev-dep to dep: OpenHlNodeHandle holds a tokio::sync::Mutex.
- tempfile dev-dep: the smoke test creates a temp directory for the node's home dir.
This is your second heavy compile. First time pulling in app-channel + config will take ~20 more seconds.
Step 2: Create crates/consensus/src/node.rs — imports and OpenHlConfig
Start with imports:
//! `Node` trait implementation — describes our chain to Malachite's engine
//! and provides the [`OpenHlNode::start`] entry point that calls
//! `malachitebft_app_channel::start_engine` to spawn the actor system.
use std::path::PathBuf;
use async_trait::async_trait;
use eyre::eyre;
use informalsystems_malachitebft_app::node::{EngineHandle, Node, NodeConfig, NodeHandle};
use informalsystems_malachitebft_app::types::Keypair;
use informalsystems_malachitebft_app_channel::Channels;
use informalsystems_malachitebft_config::{ConsensusConfig, ValueSyncConfig, ValuePayload};
use informalsystems_malachitebft_core_types::Height as _;
use informalsystems_malachitebft_signing_ed25519::{PrivateKey, PublicKey};
use serde::{Deserialize, Serialize};
use sha2::{Digest, Sha256};
use tokio::sync::Mutex;
use crate::codec::OpenHlCodec;
use crate::context::OpenHlContext;
use crate::signing_provider::OpenHlSigningProvider;
use crate::types::{OpenHlAddress, OpenHlHeight, OpenHlValidatorSet};
That's the full surface this file needs. Worth scanning once: Node, NodeConfig, NodeHandle are the three Malachite traits we'll implement. EngineHandle + Channels are what start_engine returns. ConsensusConfig + ValueSyncConfig + ValuePayload are the config types embedded in our OpenHlConfig. Keypair is libp2p's keypair type. PrivateKey/PublicKey are the Ed25519 types we've used since Lesson 7. Sha256 is for address derivation.
Now write OpenHlConfig:
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct OpenHlConfig {
pub moniker: String,
#[serde(flatten)]
pub consensus: ConsensusConfig,
pub value_sync: ValueSyncConfig,
}
impl OpenHlConfig {
#[must_use]
pub fn new(moniker: impl Into<String>) -> Self {
// OpenHL runs ProposalOnly (no streaming proposal parts) — must match
// our `Context::ProposalPart` shape.
let consensus = ConsensusConfig {
value_payload: ValuePayload::ProposalOnly,
..ConsensusConfig::default()
};
Self {
moniker: moniker.into(),
consensus,
value_sync: ValueSyncConfig::default(),
}
}
}
impl NodeConfig for OpenHlConfig {
fn moniker(&self) -> &str {
&self.moniker
}
fn consensus(&self) -> &ConsensusConfig {
&self.consensus
}
fn value_sync(&self) -> &ValueSyncConfig {
&self.value_sync
}
}
Three pieces:
- The struct wraps ConsensusConfig + ValueSyncConfig and adds a moniker (the validator's nickname for logs). #[serde(flatten)] on consensus means the consensus fields are inlined into the parent: when serialized to disk, the consensus keys appear at the top level of the file rather than nested under a consensus section.
- new() enforces one critical choice: value_payload: ValuePayload::ProposalOnly. This must match our Context::ProposalPart = OpenHlProposalPart (the unit struct). If we accidentally set this to ValuePayload::PartsOnly, the engine would expect streamed proposal parts, which our unit-struct ProposalPart cannot carry. This is the kind of invariant that's easier to enforce at construction than to debug later.
- The NodeConfig impl is three trivial accessors. The trait exists so Malachite can pull out the sub-configs without knowing the parent's layout.
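The "enforce the invariant at construction" idea can be shown in isolation. The sketch below uses hypothetical stand-in types (Payload, EngineConfig, ChainConfig) rather than the Malachite config types, and the stand-in's default of PartsOnly is an assumption made only so the override is visible.

```rust
/// Hypothetical stand-in for a payload-mode enum. The default variant here
/// is an assumption for illustration, not a statement about Malachite.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
enum Payload {
    #[default]
    PartsOnly,
    ProposalOnly,
}

/// Hypothetical stand-in for an engine config with several defaulted fields.
#[derive(Clone, Debug, Default, PartialEq, Eq)]
struct EngineConfig {
    payload: Payload,
    timeout_ms: u64,
}

#[derive(Clone, Debug)]
struct ChainConfig {
    moniker: String,
    engine: EngineConfig,
}

impl ChainConfig {
    /// The only constructor: pins `payload` and takes every other field
    /// from `Default` via struct-update syntax.
    fn new(moniker: impl Into<String>) -> Self {
        let engine = EngineConfig {
            payload: Payload::ProposalOnly,
            ..EngineConfig::default()
        };
        Self {
            moniker: moniker.into(),
            engine,
        }
    }
}
```

Because callers go through new(), they cannot forget the override; a test like the lesson's load_config test then guards against someone editing the constructor later.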
Step 3: OpenHlGenesis and OpenHlPrivateKeyFile
Next:
/// Genesis is a unit struct at v0 — the validator set is passed directly to
/// `start_engine` rather than read from disk. When `OpenHL` grows a real
/// on-disk genesis format this becomes the `load_genesis()` return.
#[derive(Clone, Debug, Default, Serialize, Deserialize)]
pub struct OpenHlGenesis;
/// Wire-friendly wrapper around the raw 32-byte Ed25519 private key.
#[derive(Clone, Serialize, Deserialize)]
pub struct OpenHlPrivateKeyFile {
pub bytes: [u8; 32],
}
impl OpenHlPrivateKeyFile {
#[must_use]
pub fn from_private_key(sk: &PrivateKey) -> Self {
Self {
bytes: sk.inner().to_bytes(),
}
}
#[must_use]
pub fn into_private_key(self) -> PrivateKey {
PrivateKey::from(self.bytes)
}
}
impl std::fmt::Debug for OpenHlPrivateKeyFile {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
f.debug_struct("OpenHlPrivateKeyFile")
.field("bytes", &"[redacted]")
.finish()
}
}
Two types:
- OpenHlGenesis: a unit struct. At v0 we have no genesis content (no allocations, no precompiles registered at boot; those come later in Module 6). The validator set is passed directly to start_engine rather than via genesis. When OpenHL adds a real genesis format, this becomes the type that load_genesis() deserializes.
- OpenHlPrivateKeyFile: a wire-friendly wrapper around the 32-byte private key. PrivateKey itself (from malachitebft_signing_ed25519) doesn't implement Serialize/Deserialize by default; the wrapper does, and the conversions from_private_key / into_private_key are explicit. The manual Debug impl redacts the bytes, because printing the actual private key in a log via {:?} would be a serious security bug. The [redacted] token is the convention.
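To see what the redacting Debug impl buys, here is a small std-only sketch. SecretKeyFile and log_line are hypothetical names that mirror the shape of OpenHlPrivateKeyFile; the sketch shows that formatting with {:?} never exposes the key bytes.

```rust
use std::fmt;

/// Hypothetical secret wrapper with the same shape as `OpenHlPrivateKeyFile`.
#[derive(Clone)]
struct SecretKeyFile {
    bytes: [u8; 32],
}

impl fmt::Debug for SecretKeyFile {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Print a fixed token instead of the real field value.
        f.debug_struct("SecretKeyFile")
            .field("bytes", &"[redacted]")
            .finish()
    }
}

/// Renders the secret the way a log statement would.
fn log_line(key: &SecretKeyFile) -> String {
    format!("loaded key: {key:?}")
}
```

The token appears in quotes in the output because it is formatted as a string through Debug. Deriving Debug instead would print all 32 byte values.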
The diagram below shows how OpenHlNode and OpenHlNodeHandle relate. It captures the central design choice of this lesson: construction (static config) and execution (dynamic actor system) live in two distinct types.
┌─────────────────────────────────────────────────────────────────────────┐
│ ◆ Lifecycle 1: static config / construction (Node) │
│ │
│ OpenHlNode { │
│ private_key, validator_set, │
│ home_dir, moniker, … │
│ } │
│ │
│ • Created once at process start, long-lived │
│ • Engine **is not running yet** (just config in hand) │
└────────────────────────────┬────────────────────────────────────────────┘
│
│ .start().await ◄── handshake (Node trait)
│ executes (Step 5)
▼
┌─────────────────────────────────────────────────────────────────────────┐
│ ◆ Lifecycle 2: dynamic execution / actor system (Handle) │
│ │
│ OpenHlNodeHandle { │
│ engine : EngineHandle ──► ractor cell + libp2p running │
│ channels : Mutex<Option<Channels<OpenHlContext>>> │
│ ──► Lesson 10's app loop pulls it out │
│ exactly once via `take()` │
│ } │
│ │
│ • Returned by `start()`; lives until `.kill().await` │
│ • Ownership flows Node → Handle → app loop in one direction │
└─────────────────────────────────────────────────────────────────────────┘
Three things this picture pins down: (a) OpenHlNode only holds config. It doesn't own an actor system; calling start() is what spawns anything at all. (b) OpenHlNodeHandle owns both the running actor system and the communication channels, so the engine and libp2p lifetimes are bound to this handle. (c) Mutex<Option<Channels<...>>> is a one-way ownership gate: once take() hands the channels to Lesson 10's app loop, they can never be reclaimed, and "already consumed" is represented as None. Lesson 9's run() method returns an "unimplemented" error precisely because the consumer side of (c), Lesson 10's app loop, hasn't been written yet.
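The two-lifecycle split can be imitated with plain threads. The sketch below is an analogy only: ToyNode and ToyHandle are hypothetical, a std thread stands in for the actor system, and kill consumes the handle (the lesson's kill takes &self), which is a simplification.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Lifecycle 1: plain configuration. Nothing is running yet.
#[derive(Clone, Debug)]
struct ToyNode {
    moniker: String,
}

/// Lifecycle 2: owns the running worker and the means to stop it.
struct ToyHandle {
    worker: JoinHandle<String>,
    stop: Sender<()>,
}

impl ToyNode {
    /// Spawns the worker; only from this point does a thread exist.
    fn start(&self) -> ToyHandle {
        let name = self.moniker.clone();
        let (stop, stop_rx) = mpsc::channel::<()>();
        let worker = thread::spawn(move || {
            // Block until a stop signal arrives (or the sender is dropped).
            let _ = stop_rx.recv();
            format!("{name} stopped")
        });
        ToyHandle { worker, stop }
    }
}

impl ToyHandle {
    /// Signals the worker and waits for it to exit.
    fn kill(self) -> Result<String, String> {
        self.stop.send(()).map_err(|e| e.to_string())?;
        self.worker
            .join()
            .map_err(|_| "worker panicked".to_string())
    }
}
```

The node value stays usable after start(), since it is only configuration; everything with a lifetime tied to the running system sits in the handle.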
Step 4: OpenHlNodeHandle — what start() returns
/// Handle returned by [`OpenHlNode::start`]. Owns the engine actor system
/// and the channel handles for the (yet-to-be-implemented) app loop.
pub struct OpenHlNodeHandle {
engine: EngineHandle,
channels: Mutex<Option<Channels<OpenHlContext>>>,
}
impl std::fmt::Debug for OpenHlNodeHandle {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
f.debug_struct("OpenHlNodeHandle")
.field("engine", &"<EngineHandle>")
.field("channels", &"<Channels>")
.finish()
}
}
impl OpenHlNodeHandle {
/// Take ownership of the engine→app message channels. Returns None on
/// the second call. Lesson 10 will consume from this to drive the bridge.
pub async fn take_channels(&self) -> Option<Channels<OpenHlContext>> {
self.channels.lock().await.take()
}
}
#[async_trait]
impl NodeHandle<OpenHlContext> for OpenHlNodeHandle {
fn subscribe(&self) -> informalsystems_malachitebft_app::events::RxEvent<OpenHlContext> {
// No event subscription in Stage 6c — caller can't yet observe engine
// events. Lesson 10 wires the TxEvent from the engine to here.
informalsystems_malachitebft_app::events::TxEvent::new().subscribe()
}
async fn kill(&self, _reason: Option<String>) -> eyre::Result<()> {
self.engine.actor.kill_and_wait(None).await?;
self.engine.handle.abort();
Ok(())
}
}
The handle owns two things:
- engine: EngineHandle is Malachite's handle to the spawned actor system. It has an actor (the ractor ActorCell) and a handle (the tokio task handle). kill() cleanly tears both down.
- channels: Mutex<Option<Channels<OpenHlContext>>> holds the application-side endpoints. The engine sends AppMsg<OpenHlContext> to us; we send AppReply<OpenHlContext> back. The Mutex<Option<...>> wrapper lets take_channels() hand them to the app loop exactly once; a second call returns None, signaling "you've already consumed these."
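Why tokio::sync::Mutex rather than std::sync::Mutex? Because take_channels() is async, and tokio's lock() is itself awaited: a task waiting for the lock yields to the executor instead of blocking the thread. In this particular method the guard is dropped at the end of the statement, before any further .await, so std::sync::Mutex would also work here; the tokio mutex keeps the code safe if the guard ever needs to be held across an .await later.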
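The take-once behavior comes from Option::take, not from the mutex flavor. A minimal sketch, using std::sync::Mutex so it runs without an async runtime; Gate and Endpoints are hypothetical stand-ins for OpenHlNodeHandle and Channels<OpenHlContext>:

```rust
use std::sync::Mutex;

/// Hypothetical payload standing in for `Channels<OpenHlContext>`.
#[derive(Debug, PartialEq, Eq)]
struct Endpoints {
    name: &'static str,
}

/// Holder that gives its payload away exactly once through `&self`.
struct Gate {
    slot: Mutex<Option<Endpoints>>,
}

impl Gate {
    fn new(endpoints: Endpoints) -> Self {
        Self {
            slot: Mutex::new(Some(endpoints)),
        }
    }

    /// `Option::take` leaves `None` in the slot and returns the old value,
    /// so every call after the first returns `None`.
    fn take_endpoints(&self) -> Option<Endpoints> {
        self.slot.lock().expect("gate mutex poisoned").take()
    }
}
```

The mutex is what allows mutation through a shared reference; the Option is what records that ownership has already moved out.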
The NodeHandle impl is mostly placeholder at this stage:
- subscribe() creates a fresh TxEvent and returns its subscription: an empty event stream with no producer attached. Lesson 10 will wire up the real one.
- kill() is real: it kills the actor cell and aborts the tokio task. This is what start_engine_smoke_spawns_and_kills exercises.
Step 5: OpenHlNode struct + Node impl
#[derive(Clone, Debug)]
pub struct OpenHlNode {
pub private_key: PrivateKey,
pub validator_set: OpenHlValidatorSet,
pub home_dir: PathBuf,
pub moniker: String,
}
impl OpenHlNode {
#[must_use]
pub fn new(
private_key: PrivateKey,
validator_set: OpenHlValidatorSet,
home_dir: PathBuf,
moniker: impl Into<String>,
) -> Self {
Self {
private_key,
validator_set,
home_dir,
moniker: moniker.into(),
}
}
}
#[async_trait]
impl Node for OpenHlNode {
type Context = OpenHlContext;
type Config = OpenHlConfig;
type Genesis = OpenHlGenesis;
type PrivateKeyFile = OpenHlPrivateKeyFile;
type SigningProvider = OpenHlSigningProvider;
type NodeHandle = OpenHlNodeHandle;
fn get_home_dir(&self) -> PathBuf {
self.home_dir.clone()
}
fn load_config(&self) -> eyre::Result<Self::Config> {
let mut cfg = OpenHlConfig::new(&self.moniker);
// Bind to an ephemeral port on localhost so tests and devnets don't
// step on each other. Real deployments override this in their config.
cfg.consensus.p2p.listen_addr = "/ip4/127.0.0.1/tcp/0"
.parse()
.map_err(|e| eyre!("invalid listen_addr: {e}"))?;
Ok(cfg)
}
fn get_address(&self, pk: &PublicKey) -> OpenHlAddress {
let digest = Sha256::digest(pk.as_bytes());
let mut addr = [0u8; 20];
addr.copy_from_slice(&digest[12..32]);
OpenHlAddress(addr)
}
fn get_public_key(&self, pk: &PrivateKey) -> PublicKey {
pk.public_key()
}
fn get_keypair(&self, pk: PrivateKey) -> Keypair {
Keypair::ed25519_from_bytes(pk.inner().to_bytes())
.expect("ed25519 private key is always 32 bytes")
}
fn load_private_key(&self, file: Self::PrivateKeyFile) -> PrivateKey {
file.into_private_key()
}
fn load_private_key_file(&self) -> eyre::Result<Self::PrivateKeyFile> {
Ok(OpenHlPrivateKeyFile::from_private_key(&self.private_key))
}
fn load_genesis(&self) -> eyre::Result<Self::Genesis> {
// Validator set is passed directly to start_engine; genesis carries
// nothing else at v0.
Ok(OpenHlGenesis)
}
fn get_signing_provider(&self, private_key: PrivateKey) -> Self::SigningProvider {
OpenHlSigningProvider::new(private_key)
}
async fn start(&self) -> eyre::Result<Self::NodeHandle> {
let cfg = self.load_config()?;
let validator_set = self.validator_set.clone();
let (channels, engine) = informalsystems_malachitebft_app_channel::start_engine(
OpenHlContext,
self.clone(),
cfg,
OpenHlCodec, // WAL
OpenHlCodec, // Network
Some(OpenHlHeight::INITIAL),
validator_set,
)
.await?;
Ok(OpenHlNodeHandle {
engine,
channels: Mutex::new(Some(channels)),
})
}
async fn run(self) -> eyre::Result<()> {
// Lesson 10 will consume from channels here and run the app loop.
Err(eyre!("OpenHlNode::run is not yet implemented (Lesson 10)"))
}
}
This is the load-bearing block. Walk through:
The struct carries four things: private key, validator set, home dir, moniker. These are the long-lived bits that don't change per-config-reload.
The 6 associated types declare the concrete types for each handshake slot:
- Context = OpenHlContext: what Malachite uses to typecheck everything else
- Config = OpenHlConfig: what load_config() returns
- Genesis = OpenHlGenesis: what load_genesis() returns
- PrivateKeyFile = OpenHlPrivateKeyFile: what load_private_key_file() returns
- SigningProvider = OpenHlSigningProvider: what get_signing_provider() returns
- NodeHandle = OpenHlNodeHandle: what start() returns
The 11 methods:
| Method | Purpose | Body |
|---|---|---|
| get_home_dir | Where the node stores its data | Returns the path passed at construction |
| load_config | Build the config (re-callable) | Constructs OpenHlConfig, then overrides the listen address to ephemeral local |
| get_address | SHA-256 hash → 20-byte address | Last 20 bytes of the 32-byte digest |
| get_public_key | Public key from private key | pk.public_key() |
| get_keypair | libp2p Keypair from Ed25519 | Convert via ed25519_from_bytes |
| load_private_key | Unwrap the file format | file.into_private_key() |
| load_private_key_file | Wrap the private key in the file format | OpenHlPrivateKeyFile::from_private_key(...) |
| load_genesis | Read the genesis | Returns OpenHlGenesis (unit struct, nothing to read) |
| get_signing_provider | Construct the SigningProvider | OpenHlSigningProvider::new(private_key) |
| start | Spawn the engine | Calls start_engine with 7 args, wraps return in OpenHlNodeHandle |
| run | Run the app loop | Unimplemented at Lesson 9; returns error pointing to Lesson 10 |
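The slicing step in get_address can be checked without a hashing crate. In the sketch below the 32-byte digest is an input rather than a real SHA-256 output, and address_from_digest and counting_digest are hypothetical helper names; the point is only that [12..32] selects the last 20 bytes.

```rust
/// Keeps the last 20 bytes of a 32-byte digest (indices 12 through 31).
/// In the lesson the digest is SHA-256(pubkey); here it is passed in so
/// the sketch stays std-only.
fn address_from_digest(digest: &[u8; 32]) -> [u8; 20] {
    let mut addr = [0u8; 20];
    addr.copy_from_slice(&digest[12..32]);
    addr
}

/// Builds a recognizable fake digest in which byte i holds the value i.
fn counting_digest() -> [u8; 32] {
    let mut digest = [0u8; 32];
    for (i, byte) in digest.iter_mut().enumerate() {
        *byte = i as u8;
    }
    digest
}
```

With the counting digest, the resulting address starts at value 12 and ends at value 31, which makes an off-by-one in the slice bounds easy to spot. copy_from_slice would also panic if the slice length were not exactly 20.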
The start() method is the highlight. It calls start_engine with:
- the context (OpenHlContext, a unit struct)
- the node itself (self.clone())
- the config (cfg)
- two codec values (one for WAL, one for Network; both OpenHlCodec)
- the initial height (Some(OpenHlHeight::INITIAL))
- the validator set (validator_set)
What start_engine returns: (Channels<OpenHlContext>, EngineHandle). We wrap these into OpenHlNodeHandle and return.
Why is run() unimplemented? Because Malachite's Node::run is meant to combine start() with the app loop into one async future. Since the app loop doesn't exist until Lesson 10, we return an error pointing to Lesson 10. Once Lesson 10 is done, run() will look like: call start(), take the channels, drive the app loop, await termination.
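The difference between an Err placeholder and a panicking one shows up at the call site. A minimal sketch with String errors instead of eyre; run_app_loop and try_run are hypothetical names:

```rust
/// Hypothetical stand-in for a not-yet-written entry point.
/// Returning `Err` leaves the caller in control; `todo!()` would panic.
fn run_app_loop() -> Result<(), String> {
    Err("run is not yet implemented (Lesson 10)".to_string())
}

/// A caller can report the failure and continue with its own cleanup.
fn try_run() -> String {
    match run_app_loop() {
        Ok(()) => "app loop finished".to_string(),
        Err(msg) => format!("startup aborted: {msg}"),
    }
}
```

With todo!() in place of the Err, try_run would never reach its match arm; the process (or test) would unwind at the call instead.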
Step 6: Wire node.rs into lib.rs
//! Consensus layer — Malachite BFT.
pub mod bridge;
pub mod codec;
pub mod context;
pub mod node;
pub mod signing;
pub mod signing_provider;
pub mod types;
pub use context::OpenHlContext;
Step 7: Add 4 unit tests
At the bottom of node.rs:
#[cfg(test)]
mod tests {
use super::*;
use crate::types::OpenHlValidator;
use rand::rngs::OsRng;
fn single_validator_node(home_dir: PathBuf) -> OpenHlNode {
let sk = PrivateKey::generate(OsRng);
let pk = sk.public_key();
let digest = Sha256::digest(pk.as_bytes());
let mut addr_bytes = [0u8; 20];
addr_bytes.copy_from_slice(&digest[12..32]);
let address = OpenHlAddress(addr_bytes);
let validator_set = OpenHlValidatorSet::new(vec![OpenHlValidator::new(address, pk, 1)]);
OpenHlNode::new(sk, validator_set, home_dir, "openhl-test")
}
#[test]
fn private_key_file_round_trips() {
let sk = PrivateKey::generate(OsRng);
let file = OpenHlPrivateKeyFile::from_private_key(&sk);
let restored = file.into_private_key();
assert_eq!(restored.inner().to_bytes(), sk.inner().to_bytes());
}
#[test]
fn load_config_sets_proposal_only_payload_and_ephemeral_listen_addr() {
let tmp = tempfile::tempdir().unwrap();
let node = single_validator_node(tmp.path().to_path_buf());
let cfg = node.load_config().unwrap();
assert_eq!(cfg.consensus.value_payload, ValuePayload::ProposalOnly);
// listen_addr should be /ip4/127.0.0.1/tcp/0 (ephemeral)
let listen_str = cfg.consensus.p2p.listen_addr.to_string();
assert!(
listen_str.starts_with("/ip4/127.0.0.1/tcp/0"),
"unexpected listen_addr: {listen_str}"
);
}
#[test]
fn get_address_matches_runner_derivation() {
let tmp = tempfile::tempdir().unwrap();
let node = single_validator_node(tmp.path().to_path_buf());
let pk = node.private_key.public_key();
let addr1 = node.get_address(&pk);
// Same derivation as runner.rs (last 20 bytes of SHA-256(pubkey)).
let digest = Sha256::digest(pk.as_bytes());
let mut expected = [0u8; 20];
expected.copy_from_slice(&digest[12..32]);
assert_eq!(addr1, OpenHlAddress(expected));
}
/// Smoke test: spin up the actor system, get a handle back, kill cleanly.
/// Does NOT drive consensus — that's Lesson 10.
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn start_engine_smoke_spawns_and_kills() {
let tmp = tempfile::tempdir().unwrap();
let node = single_validator_node(tmp.path().to_path_buf());
let handle = match node.start().await {
Ok(h) => h,
Err(e) => panic!("start_engine failed: {e:?}"),
};
// Sanity-poke the channels handle is available exactly once.
assert!(handle.take_channels().await.is_some());
assert!(handle.take_channels().await.is_none());
handle.kill(None).await.unwrap();
}
}
Four tests:
- private_key_file_round_trips: generate a key, wrap in OpenHlPrivateKeyFile, unwrap, assert byte-equality. Proves the wire format is lossless.
- load_config_sets_proposal_only_payload_and_ephemeral_listen_addr: construct a node, call load_config(), verify two things: value_payload == ProposalOnly (the invariant we enforce at construction) and listen_addr is the ephemeral local socket. Catches accidental config drift.
- get_address_matches_runner_derivation: derive the same address two ways (once via the trait method, once by inlining the SHA-256 logic). Asserts they match. Catches accidental drift if someone changes one without the other.
- start_engine_smoke_spawns_and_kills: the capstone. Uses #[tokio::test(flavor = "multi_thread", worker_threads = 2)] because the engine needs the multi-threaded runtime (it spawns multiple actors). Steps: construct a single-validator node, call node.start().await, poke the channels handle (once Some, second time None), call kill(). If this passes, your code is now a running BFT engine.
The smoke test is roughly 0.02 seconds wall-clock. The bulk is libp2p setting up the local listener — even on a tcp/0 ephemeral port, libp2p's negotiation has a fixed cost.
Test
cargo test -p openhl-consensus
After ~20 seconds (first compile after the dep changes):
running 20 tests
test codec::tests::openhl_codec_satisfies_all_three_super_traits ... ok
test codec::tests::proposal_part_round_trips ... ok
test context::tests::height_increment_and_decrement ... ok
test context::tests::new_prevote_and_precommit_have_distinct_types ... ok
test context::tests::new_proposal_round_trips_fields ... ok
test context::tests::select_proposer_round_robins_deterministically ... ok
test context::tests::validator_set_is_sorted_by_power_then_address ... ok
test node::tests::get_address_matches_runner_derivation ... ok
test node::tests::load_config_sets_proposal_only_payload_and_ephemeral_listen_addr ... ok
test node::tests::private_key_file_round_trips ... ok
test signing::tests::vote_signature_is_field_sensitive ... ok
test signing::tests::vote_signature_round_trips ... ok
test signing_provider::tests::proposal_part_sign_verify_round_trips ... ok
test signing_provider::tests::proposal_sign_verify_round_trips ... ok
test signing_provider::tests::proposal_tamper_detected ... ok
test signing_provider::tests::signature_from_one_provider_does_not_verify_under_another ... ok
test signing_provider::tests::vote_extension_sign_verify_round_trips ... ok
test signing_provider::tests::vote_sign_verify_round_trips ... ok
test signing_provider::tests::vote_tamper_detected ... ok
test node::tests::start_engine_smoke_spawns_and_kills ... ok
test result: ok. 20 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
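The smoke test is reported last here because cargo runs tests in parallel and prints each result as it finishes; the smoke test takes the longest (runtime setup plus engine spawn), so its line typically lands at the end.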
Common errors and fixes:
- error[E0432]: unresolved import 'informalsystems_malachitebft_app_channel'. Cargo.toml doesn't have app-channel; re-check Step 1.
- error[E0277]: PrivateKey: Deserialize is not satisfied. The serde feature on signing-ed25519 is missing; re-check Step 1 (features = ["rand", "serde"]).
- smoke test hangs forever. Usually the runtime flavor is "current_thread" (the default for #[tokio::test]). Re-check Step 7: the attribute must be #[tokio::test(flavor = "multi_thread", worker_threads = 2)].
- error: Keypair::ed25519_from_bytes expected mutable bytes. Version mismatch: the libp2p Keypair::ed25519_from_bytes signature changed across versions; the workspace pin should align with what informalsystems-malachitebft-app re-exports.
- Address derivation does not match. Your get_address doesn't match the helper in the test. Both must use the last 20 bytes of SHA-256(pubkey), i.e. slice [12..32].
Design reflection
Three load-bearing decisions encoded here:
- OpenHlNode is the handshake interface, not the runtime. The struct holds long-lived fields (key, validator set, home dir, moniker). It doesn't run the chain. The runtime lives in OpenHlNodeHandle (engine + channels), returned from start(). Construction and execution are different lifecycle stages, so they live in different types.
- Address derivation is centralized in get_address. When you used SHA-256(pubkey)[12..32] in the runner's setup code back in Lesson 6, that was the same derivation. The test get_address_matches_runner_derivation asserts they're identical, so future refactors can't silently drift one without the other. Centralization with a verification test beats duplication every time.
- run() returns an error pointing at the next lesson. Rather than unimplemented!() (panics) or todo!() (also panics), an eyre::Result::Err("not yet implemented (Lesson 10)") is a type-safe placeholder. Code that calls run() gets a graceful failure with a message pointing at where to look. This is the kind of crumb that survives across pull requests, code reviews, and stale tabs.
Answer key
cd ~/code/openhl-reference
git checkout d59d6cf
diff -u ~/code/my-openhl/crates/consensus/src/node.rs ./crates/consensus/src/node.rs
diff -u ~/code/my-openhl/crates/consensus/Cargo.toml ./crates/consensus/Cargo.toml
diff -u ~/code/my-openhl/crates/consensus/src/lib.rs ./crates/consensus/src/lib.rs
The reference at d59d6cf includes 310 lines of node.rs. The Node impl methods (11 total), the struct layouts, and the smoke test should match closely. Doc comments and exact wording can vary.
Return:
git checkout main
Common questions
Q: Why does start_engine need both the node and the validator set when the validator set is already inside the node?
Because the engine doesn't reach into the node's internals. The node has many fields (path, moniker, key, etc.) that are not relevant to validator-set election. start_engine accepts the validator set explicitly so the engine doesn't need to know about your node's specific field layout. This is the same separation-of-concerns principle as Node::load_config().
Q: What does the smoke test prove that the compile-time assertions don't?
The compile-time assertions in Lesson 8 proved OpenHlCodec: WalCodec + ConsensusCodec + SyncCodec. The smoke test proves that the runtime path — actor spawning, channel allocation, libp2p binding, kill propagation — actually works end-to-end. Type-safety is necessary but not sufficient; the test catches things like "spawn deadlocks" or "the engine panics on first message" that types can't catch.
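The gap between the two kinds of check can be seen in a few lines. In this sketch Encode, Height, assert_encode, and encoded_len are hypothetical; assert_encode plays the role that assert_impl_all! plays in Lesson 8, written by hand so no extra crate is needed.

```rust
/// Hypothetical trait standing in for a codec bound.
trait Encode {
    fn encode(&self) -> Vec<u8>;
}

struct Height(u64);

impl Encode for Height {
    fn encode(&self) -> Vec<u8> {
        // Big-endian, fixed 8 bytes.
        self.0.to_be_bytes().to_vec()
    }
}

/// Compile-time check: a call to this only compiles if `T: Encode`.
/// It says nothing about whether `encode` behaves correctly.
fn assert_encode<T: Encode>() {}

/// Runtime check: actually exercises the code path and inspects the result.
fn encoded_len(h: &Height) -> usize {
    assert_encode::<Height>();
    h.encode().len()
}
```

If encode returned an empty vector, assert_encode::<Height>() would still compile; only a test that calls encoded_len (the analogue of the smoke test) would notice.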
Q: What's the difference between EngineHandle and NodeHandle?
EngineHandle (from Malachite) is the low-level handle to the spawned actor system: actor cell plus tokio task handle. NodeHandle (a Malachite trait that you implement) is the high-level abstraction Malachite uses to ask "subscribe me to events" and "kill it". Your OpenHlNodeHandle implements NodeHandle<OpenHlContext> and internally holds the EngineHandle. Two layers; callers only deal with the outer one.
Q: Why does take_channels use Option<Channels<...>> instead of just removing the channels?
Because take_channels is called from the outside — the app loop wants to consume them. Removing them entirely would require either a mutable reference or moving the handle. Mutex<Option<...>> lets the app loop call it via shared reference (&self), grab the channels once, and find None on subsequent calls — a clean signal "you already took these."
Next lesson (Lesson 10)
You now have the engine running. But — critically — the engine is sending you messages and you're ignoring them. The actor system is parked, waiting for the app loop to consume from Channels<OpenHlContext> and respond to AppMsg::ProposeValue, AppMsg::Decided, etc. Lesson 10 implements the app loop: a tokio::select over the channel + a state struct + handlers that route engine messages to InMemoryEvmBridge. When Lesson 10 ships, cargo test first_block_via_engine_actors produces an actual block through the full engine pipeline.
Summary (3 lines)
- OpenHlNode = top-level glue. Bridge + Context + Signing + Codec. start_engine spawns Malachite's event loop.
- Tokio mpsc + oneshot channels for in/out messages. Single-validator BFT simplifies first test.
- End-to-end test boots OpenHlNode; produces blocks; state advances. Next module: engine integration.