Lesson 3 — Tokio runtime internals

Question

Reth runs on Tokio, so understanding Tokio's internals means understanding Reth's parallelism. What do the work-stealing scheduler and the reactor (I/O driver) actually do underneath .await?

Principle (minimum model)

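Tokio runtime structure. N worker threads run async tasks; N defaults to the number of CPU cores. The multi-thread runtime has no dedicated reactor thread: the I/O driver is shared and is polled by a worker that has run out of tasks.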
Work-stealing scheduler. Each worker has a queue; idle workers steal from busy ones. Balances load automatically.
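Async tasks vs blocking. An async task is a lightweight, cooperatively scheduled unit (green-thread-like) on the worker pool; blocking work goes through tokio::task::spawn_blocking, which runs on a separate thread pool. Don't run blocking code directly on a worker.
Reth conventions. Per-component async; CPU-bound (revm exec) wrapped in spawn_blocking. The same split applies to any Tokio-based service.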
Reactor (I/O driver). Polls epoll/kqueue for I/O readiness and wakes the task waiting on that resource. It is not a separate thread: a worker drives it while it would otherwise be idle.
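Pitfalls. Holding a std::sync::Mutex guard across .await makes the future !Send (so tokio::spawn rejects it) and can deadlock or stall a worker. Drop the guard before awaiting, or use tokio::sync::Mutex when the lock must be held across an .await.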
Performance. A task switch in Tokio's scheduler costs on the order of 1 µs, versus roughly 10 µs for an OS thread context switch; treat both as rough, hardware-dependent figures. The gap matters most at high task counts.
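To make the Mutex pitfall concrete, here is a minimal sketch of the usual fix: take the lock inside a block so the guard is dropped before the first .await. The names (checkpoint, bump_then_wait, require_send, NoopWake) are hypothetical, and the future is driven by one manual poll with a no-op waker so the example needs only the standard library rather than Tokio.

```rust
use std::future::Future;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical await point (stands in for a network or database call).
async fn checkpoint() {}

/// Increments the counter, then awaits. The guard lives only inside the
/// inner block, so it is already dropped when the .await is reached.
async fn bump_then_wait(counter: Arc<Mutex<u64>>) -> u64 {
    let snapshot = {
        let mut guard = counter.lock().unwrap();
        *guard += 1;
        *guard
    }; // guard dropped here
    checkpoint().await;
    snapshot
}

/// Compile-time check: accepts only futures that are Send, which is the
/// same bound tokio::spawn places on the futures it accepts.
fn require_send<F: Future + Send>(fut: F) -> F {
    fut
}

/// A waker that does nothing; enough for a future that never suspends.
struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn run_once() -> u64 {
    let counter = Arc::new(Mutex::new(41));
    let mut fut = Box::pin(require_send(bump_then_wait(Arc::clone(&counter))));
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(value) => value,
        Poll::Pending => unreachable!("checkpoint() never suspends in this sketch"),
    }
}

fn main() {
    println!("counter after bump: {}", run_once());
}
```

If the guard were still alive at the .await, the future would no longer be Send and require_send would reject it at compile time.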

Worked example + steps

Tokio runtime internals

You've been writing #[tokio::main] and sprinkling .await over every async call. Reth has 200+ of those scattered through its codebase, and at peak load it handles thousands of concurrent peer connections plus dozens of background tasks on 8 worker threads. No magic; just a state machine the compiler wrote for you, a work-stealing scheduler, and an epoll loop. This lesson is what's underneath the .await.

1. The runtime stack

Tokio is composed of:

+--------------------+
|  Your async code   |  ← futures
+--------------------+
|     Executor       |  ← polls futures to completion
|  (work-stealing)   |
+--------------------+
|       I/O          |  ← epoll/kqueue/IOCP (via mio)
+--------------------+

When you write async fn, the compiler generates a state machine that implements the Future trait. The executor's job is to call poll() on that state machine until it returns Poll::Ready(value).
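As a rough illustration of that state machine, the sketch below hand-writes what the compiler might generate for a tiny async fn with one await point. The types YieldOnce, AddAfterYield and NoopWake are hypothetical, and the real compiler output is an anonymous type with a different layout; only the shape (one enum variant per suspension point, advanced by poll) is the point here.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Leaf future: Pending on the first poll, Ready on the second.
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            cx.waker().wake_by_ref(); // ask the executor to poll us again
            Poll::Pending
        }
    }
}

/// Hand-written stand-in for:
///   async fn add_after_yield(a: u32, b: u32) -> u32 { yield_once().await; a + b }
enum AddAfterYield {
    Start { a: u32, b: u32 },
    Waiting { a: u32, b: u32, inner: YieldOnce },
    Done,
}

impl Future for AddAfterYield {
    type Output = u32;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        let this = self.get_mut(); // fine: every field is Unpin
        loop {
            match *this {
                AddAfterYield::Start { a, b } => {
                    let inner = YieldOnce { yielded: false };
                    *this = AddAfterYield::Waiting { a, b, inner };
                }
                AddAfterYield::Waiting { a, b, ref mut inner } => {
                    if Pin::new(inner).poll(cx).is_pending() {
                        return Poll::Pending; // suspended at the await point
                    }
                    *this = AddAfterYield::Done;
                    return Poll::Ready(a + b);
                }
                AddAfterYield::Done => panic!("polled after completion"),
            }
        }
    }
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls in a busy loop and reports (number of polls, result).
fn poll_to_completion() -> (usize, u32) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = AddAfterYield::Start { a: 2, b: 3 };
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(value) = Pin::new(&mut fut).poll(&mut cx) {
            return (polls, value);
        }
    }
}

fn main() {
    let (polls, value) = poll_to_completion();
    println!("ready with {value} after {polls} polls");
}
```

The busy loop is only for demonstration; a real executor re-polls a task when its Waker is signaled rather than spinning.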

2. Work-stealing in 60 seconds

The problem: 8 worker threads, thousands of tasks. How do you distribute them without all 8 threads fighting over one shared queue? Tokio's answer: give each worker its own local queue (cheap, no contention), plus a fallback global queue. When a worker runs dry, it steals tasks from a busy neighbor's queue.

Before the steal:
Worker A: [task1, task2, task3, task4]   ← busy
Worker B: []                              ← idle, steals from A

After the steal:
Worker A: [task1, task2]
Worker B: [task3, task4]

This avoids contention on a global mutex while still balancing load.
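The steal shown above can be sketched as a single-threaded model. This is a deliberate simplification under stated assumptions: Worker, steal_half_from and next_task are hypothetical names, tasks are plain integers, and plain VecDeque stands in for the lock-free queues a real scheduler needs. The lookup order (local, then global, then steal) is likewise a simplification.

```rust
use std::collections::VecDeque;

/// A worker with its own local run queue (task IDs stand in for real tasks).
struct Worker {
    local: VecDeque<u32>,
}

impl Worker {
    fn new(tasks: &[u32]) -> Self {
        Worker { local: tasks.iter().copied().collect() }
    }

    /// Move half of the victim's queued tasks (rounded up) into our queue.
    /// Returns how many tasks were stolen.
    fn steal_half_from(&mut self, victim: &mut Worker) -> usize {
        let take = (victim.local.len() + 1) / 2;
        let keep = victim.local.len() - take;
        let stolen = victim.local.split_off(keep);
        self.local.extend(stolen);
        take
    }
}

/// What a worker does when it wants work: local queue first, then the
/// global queue, and only then steal from a neighbor.
fn next_task(
    me: &mut Worker,
    global: &mut VecDeque<u32>,
    victim: &mut Worker,
) -> Option<u32> {
    if let Some(task) = me.local.pop_front() {
        return Some(task);
    }
    if let Some(task) = global.pop_front() {
        return Some(task);
    }
    me.steal_half_from(victim);
    me.local.pop_front()
}

fn main() {
    let mut a = Worker::new(&[1, 2, 3, 4]);
    let mut b = Worker::new(&[]);
    let mut global = VecDeque::new();
    let task = next_task(&mut b, &mut global, &mut a);
    println!("B runs {:?}; A keeps {:?}; B has queued {:?}", task, a.local, b.local);
}
```

Stealing a batch rather than a single task means the idle worker does not have to come back to the same victim immediately.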

3. Spawning vs blocking

// Concurrent: spawn onto the runtime
let h1 = tokio::spawn(async { fetch().await });
let h2 = tokio::spawn(async { fetch().await });
// Both tasks are already running; awaiting the handles only collects results.
let (r1, r2) = (h1.await?, h2.await?);

// CPU-heavy work: keep it OFF the async workers
let result = tokio::task::spawn_blocking(|| {
    expensive_sync_calc()  // runs on Tokio's separate blocking thread pool
}).await?;

Rule: never run long CPU-bound or blocking code directly in an async context. Move it to spawn_blocking (or a dedicated thread pool); otherwise you starve the workers and the whole node grinds to a halt.

4. Channels — picking the right one

Channel	Use
`tokio::sync::mpsc`	many producers, one consumer
`tokio::sync::broadcast`	many producers, many consumers; every receiver sees every value (e.g., chain events)
`tokio::sync::watch`	latest-value broadcast (e.g., latest block)
`tokio::sync::oneshot`	a single value, request-response

Reth publishes its canonical-chain notifications on a broadcast channel because every subscriber wants every event.

5. Custom executors / Future polling

Eventually you'll want to poll a Future manually:

use std::future::Future;
use std::task::{Context, Poll, Waker};

let mut fut = Box::pin(my_async_fn());
// Waker::noop() (Rust 1.85+) does nothing when signaled: fine for a one-off poll.
let mut cx = Context::from_waker(Waker::noop());

match fut.as_mut().poll(&mut cx) {
    Poll::Ready(v) => { /* done: use v */ }
    Poll::Pending => { /* not yet: re-poll later, when the Waker is signaled */ }
}

This is the foundation of writing your own Reth-internal scheduler — useful for, say, batching MEV simulations.
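A no-op waker is enough for a single poll, but a scheduler needs a Waker that actually does something. As a minimal sketch using only the standard library, the block_on below parks the calling thread on Pending and uses a Waker that unparks it. ThreadWaker, Signal and signal_from_thread are hypothetical names for illustration; this is not how Tokio or Reth implement their executors.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};
use std::time::Duration;

/// Waker that unparks the thread blocked in block_on.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Single-future executor: poll, sleep until woken, poll again.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(), // a spurious wakeup just re-polls
        }
    }
}

/// State shared between the future and whoever completes it.
struct Shared {
    done: bool,
    waker: Option<Waker>,
}

/// Future that becomes ready once another thread sets `done`.
struct Signal(Arc<Mutex<Shared>>);

impl Future for Signal {
    type Output = &'static str;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        let mut shared = self.0.lock().unwrap();
        if shared.done {
            Poll::Ready("signaled")
        } else {
            shared.waker = Some(cx.waker().clone()); // remember whom to wake
            Poll::Pending
        }
    }
}

fn signal_from_thread() -> &'static str {
    let shared = Arc::new(Mutex::new(Shared { done: false, waker: None }));
    let remote = Arc::clone(&shared);
    let completer = thread::spawn(move || {
        thread::sleep(Duration::from_millis(20));
        let mut shared = remote.lock().unwrap();
        shared.done = true;
        if let Some(waker) = shared.waker.take() {
            waker.wake();
        }
    });
    let out = block_on(Signal(shared));
    completer.join().unwrap();
    out
}

fn main() {
    println!("{}", block_on(async { 40 + 2 }));
    println!("{}", signal_from_thread());
}
```

A batching scheduler would keep a queue of such pinned futures and re-poll only the ones whose Waker fired, instead of blocking on one at a time.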

6. How Reth uses Tokio in production

Reth doesn't expose Tokio directly — it wraps it in a TaskExecutor that adds panic supervision. From crates/tasks/src/runtime.rs:

pub fn spawn_task<F>(&self, fut: F) -> JoinHandle<()>
where
    F: Future<Output = ()> + Send + 'static,

pub fn spawn_critical_task<F>(&self, name: &'static str, fut: F) -> JoinHandle<()>
where
    F: Future<Output = ()> + Send + 'static,

Two flavors:

spawn_task: fire and forget. If it panics, Tokio catches the panic and reports it only through the JoinHandle; if nobody awaits the handle, nothing reacts to the failure.
spawn_critical_task: registered with a name; if it panics, a TaskManager channel fires and the whole node shuts down with the task's name in the log.

This is real production discipline: you don't want a silently-dead background task to leave your node running in a degraded state. Critical tasks fail loudly.

The TaskExecutor = Runtime alias lets you pass it through stage code without dragging in raw Tokio types — clean abstraction with the safety net underneath.
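The critical-task idea can be sketched without any Reth or Tokio types. This is not Reth's implementation: it is a hypothetical model that uses OS threads in place of tasks and a std mpsc channel in place of the TaskManager channel, and the names spawn_critical, run_supervised and "pruner" are made up for illustration. The wrapper catches a panic and reports the task's name so a supervisor can shut everything down.

```rust
use std::panic::{self, AssertUnwindSafe};
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical helper: run `work` on its own thread and, if it panics,
/// send the task's name to the supervisor instead of failing silently.
fn spawn_critical<F>(
    name: &'static str,
    panic_tx: Sender<&'static str>,
    work: F,
) -> JoinHandle<()>
where
    F: FnOnce() + Send + 'static,
{
    thread::spawn(move || {
        if panic::catch_unwind(AssertUnwindSafe(work)).is_err() {
            // The receiver may already be gone during shutdown; ignore that.
            let _ = panic_tx.send(name);
        }
    })
}

/// Starts one healthy and one failing task, then returns the name of the
/// first critical task that panicked, if any.
fn run_supervised() -> Option<&'static str> {
    let (panic_tx, panic_rx) = mpsc::channel();

    let healthy = spawn_critical("healthy", panic_tx.clone(), || {});
    let failing = spawn_critical("pruner", panic_tx, || panic!("disk full"));

    healthy.join().unwrap();
    failing.join().unwrap(); // Ok: the panic was caught inside the wrapper

    // All senders are dropped by now, so recv() returns either the reported
    // name or an error meaning "nothing panicked".
    panic_rx.recv().ok()
}

fn main() {
    match run_supervised() {
        Some(name) => println!("critical task '{name}' panicked: shutting down"),
        None => println!("all critical tasks finished cleanly"),
    }
}
```

A real supervisor would wait on the channel while the node runs and trigger a graceful shutdown as soon as a name arrives, rather than joining the tasks first.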

7. Reading list

tokio/tokio/src/runtime/scheduler/multi_thread: the default work-stealing multi-thread scheduler (multi_thread_alt was an experimental alternative)
reth/crates/tasks/src/runtime.rs: Reth's task supervisor wrapping Tokio

Final check: in one sentence, what's the difference between async fn and fn as Rust types? If your answer is "one returns a Future and one doesn't," go deeper — what is a Future, structurally? The lesson isn't done with you until "Tokio is magic" becomes "Tokio polls compiler-generated state machines on a work-stealing scheduler."

Summary (3 lines)

Tokio = N work-stealing workers sharing one I/O driver (no dedicated reactor thread). Workers run async tasks; the driver polls epoll/kqueue for I/O readiness.
CPU-bound code → spawn_blocking; async → worker pool. Don't hold a std Mutex guard across .await.
Reth conventions: per-component async, revm exec wrapped in spawn_blocking. Task switch ≈ 1 µs (rough figure).