BYTES AND BRAINS v0.3.0 . CRATES.IO
UPLINK OK
guide/10 . the-engine // SOURCE: bytesandbrains/docs/ENGINE.md + bytesandbrains/docs/CONTRACT_DISPATCH.md
Chapter 10

The Engine

Chapter 9 closed with the compiler emitting one ModelProto ready to install. bb::install resolves the binding metadata, constructs each bound concrete, and hands back a Node. From there a single function on the public surface drives the entire runtime: Node::poll.

The engine is sans-IO. It holds no socket, no thread pool, no async runtime. The host calls poll in its own executor, the engine returns a vector of EngineStep values, and the host ships envelopes through its transport of choice. Two queues separate concurrency models: the frontier is a single-threaded VecDeque walked under &mut self, and the ingress is a lock-free MPMC queue that external threads push into. Bootstrap is no longer a lifecycle phase. As of 0.3.0 it is a regular function call (for Modules) or a Contract method (for Components) that the host kicks via Node::run_bootstrap(BootstrapTarget). Async Contract dispatch is the only special completion path, and it routes through the same ingress every external producer uses.

The sections below walk the install handoff, the Node::poll cycle, the IngressEvent taxonomy, op invocation, async completion via CompletionHandle, and the introspection surface the host uses to observe what the engine is doing.

Construction via bb::install

bb::install is the only public Node constructor for production code. It verifies the compilation passport stamped onto the model, looks up the binding metadata for every target named in the slice, dedupes shared slot bindings across targets, constructs each bound concrete via the inventory’s construct_fn exactly once, and returns a Node whose engine is ready to dispatch.

// from bytesandbrains/src/install.rs:237-243
pub fn install(
    peer_id: PeerId,
    addresses: Vec<Address>,
    model: ModelProto,
    targets: &[&str],
    config: Config,
) -> Result<Node, InstallError> {

The targets slice is an ordered list of entry-point function names. A single-Node demo passes &["MyModule"]; a peer hosting both partitions of a federated round passes &["Client", "Server"]. The install path resolves each name against model.functions[] (exact match first, then the compiler’s <target>#<hash> content-suffix), parses the matching binding.<target>.<slot> metadata for each target, and groups bindings by slot name in first-seen call order. Any slot whose contributors disagree on (TYPE_NAME, role) fails with InstallError::SlotBindingConflict { slot, conflicts } before any concrete is instantiated (bytesandbrains/src/install.rs:540-571). An empty targets slice is rejected with InstallError::EmptyTargets (bytesandbrains/src/install.rs:247-249).

The addresses bag is shared across every entry in the slice. The install path registers it once against the Node’s own PeerId in the engine’s AddressBook, so every target sees the same local_addresses() view. An empty vec skips self-registration so the Node renders outbound identity-protocol ops as “no addresses” failures at the protocol level rather than synthesizing a /p2p/<PeerId> placeholder.

The Config argument is a slot-keyed map of per-concrete user configs. Concretes whose Config associated type is () need no entry; the inventory’s construct_fn receives a &() default. Per-slot configs apply to the single shared ComponentRef for that slot, so every target jointly declaring the slot sees the same configured instance. Failures from construction, missing inventory entries, slot type mismatches, slot-binding conflicts, and passport verification all surface as a typed InstallError before the first poll, so the host never sees a half-built Node.

A complete single-target install at the user surface looks like this:

// from bytesandbrains/examples/custom_index_hnsw.rs:291-301
let target = compiled.functions[0].name.clone();
let mut node = install(
    PeerId::from(1u64),
    vec![Address::empty()],
    compiled,
    &[target.as_str()],
    // HnswIndex derives `bb::Concrete` with the default
    // `type Config = ()`, so install constructs a fresh
    // instance via `HnswIndex::default()`. The locally-built
    // `index` is only kept here for the dispatch demo below.
    Config::new(),
)?;

A multi-target install (one Node hosting both halves of a federated round) passes the two compiled partition names in install order:

// from bytesandbrains/examples/single_node_federated_learning.rs:343-349
let mut node = install(
    peer,
    addrs,
    compiled,
    &[client_target.as_str(), server_target.as_str()],
    Config::new(),
)?;

The Node returned holds a fully resolved engine: every NodeProto in every target function has its OpDispatch stamped, every slot is bound to a single shared ComponentRef, and every framework syscall is registered. The ingress queue is empty. Any per-target Module bootstrap functions are recorded onto BootstrapState::install_order in slice order, and every Component’s bb::Bootstrap dispatcher is recorded onto BootstrapState::component_bootstraps. Install does not arm the queue: the host kicks bootstrap via Node::run_bootstrap(BootstrapTarget) once it has staged any input formals it intends to supply.

NodeConfig and the production caps

NodeConfig carries the construction-time knobs that bound the engine’s per-cycle work. Every field carries a production-conservative default. The minimal construction reads:

// from bytesandbrains/bb-runtime/src/node/config.rs:201-219
pub fn new(peer_id: PeerId) -> Self {
    Self {
        peer_id,
        cycle_op_budget: DEFAULT_CYCLE_OP_BUDGET,
        max_pending_async: DEFAULT_MAX_PENDING_ASYNC,
        max_outbound_queue: DEFAULT_MAX_OUTBOUND_QUEUE,
        bus_capacity: DEFAULT_BUS_CAPACITY,
        per_hop_budget_ns: DEFAULT_PER_HOP_BUDGET_NS,
        envelope_caps: EnvelopeCaps::default(),
        backpressure_high_water_pct: DEFAULT_HIGH_WATER_PCT,
        backpressure_k_before_silent: DEFAULT_K_BEFORE_SILENT,
        backpressure_min_notice_interval_ns: DEFAULT_MIN_NOTICE_INTERVAL_NS,
        ingress_byte_budget: DEFAULT_INGRESS_BYTE_BUDGET,
        max_app_event_bytes: DEFAULT_MAX_APP_EVENT_BYTES,
        max_invoke_inputs: DEFAULT_MAX_INVOKE_INPUTS,
        max_invoke_bytes: DEFAULT_MAX_INVOKE_BYTES,
        max_completion_result_bytes: DEFAULT_MAX_COMPLETION_RESULT_BYTES,
    }
}

The four caps that shape per-poll behavior:

// from bytesandbrains/bb-runtime/src/node/config.rs:19-34
pub const DEFAULT_CYCLE_OP_BUDGET: Option<usize> = Some(1000);
pub const DEFAULT_MAX_PENDING_ASYNC: Option<usize> = Some(10_000);
pub const DEFAULT_MAX_OUTBOUND_QUEUE: Option<usize> = Some(10_000);
pub const DEFAULT_BUS_CAPACITY: usize = 1024;

cycle_op_budget is the soft per-poll fairness cap. After 1000 op-invocations in a single cycle the engine yields and emits EngineStep::CycleBudgetExceeded { ops_invoked }; the host re-polls to drain the rest. max_pending_async caps the number of in-flight DispatchResult::Async commands; further async dispatches fail synchronously when the cap is hit. max_outbound_queue is the cap OutboundQueue::with_cap is constructed with. The push policy is strict head-eviction FIFO: when a push would push queue.len() to or above the cap, the queue pops from the front, increments dropped_since_last_drain, and then pushes the new envelope at the back:

// from bytesandbrains/bb-runtime/src/framework/outbound_queue.rs:49-60
pub fn push(&mut self, mut env: WireEnvelope) {
    if env.schema_version == 0 {
        env.schema_version = ENVELOPE_SCHEMA_VERSION;
    }
    if let Some(cap) = self.cap {
        while self.queue.len() >= cap {
            self.queue.pop_front();
            self.dropped_since_last_drain += 1;
        }
    }
    self.queue.push_back(env);
}

The newest envelope always lands in the queue; the oldest is the one sacrificed. Phase 8 calls take_dropped_count once per cycle to read and reset the counter and emits a single EngineStep::OutboundDropped { count } if any drops occurred. The shape matters at the boundary: a transport adapter that briefly stalls sees the queue grow, then sheds load by dropping its oldest pending envelopes rather than refusing new ones, so the freshest application state has the best shot at the wire. bus_capacity bounds the typed bus; overflow drops the oldest event and bumps a counter the engine publishes as InfraEvent::BusOverflow on the next cycle.

with_cycle_op_budget, with_max_pending_async, with_max_outbound_queue, with_bus_capacity, and with_per_hop_budget_ns override individual caps. The matching without_* methods disable a cap entirely. with_ingress_byte_budget, with_max_app_event_bytes, with_max_invoke_inputs, with_max_invoke_bytes, and with_max_completion_result_bytes (bb-runtime/src/node/config.rs:326-353) override the boundary allocation caps documented under Ingress boundaries and allocation safety. NodeConfig::new_edge(peer) is a convenience factory that returns a config with the tighter edge-device presets (8 MiB ingress budget, 64 KiB per-event cap, etc.) applied in one call. Node::with_config accepts a replacement config post-install and re-applies the caps to the engine immediately. The typical flow is: let node = bb::install(...)?.with_config(NodeConfig::new(peer).with_cycle_op_budget(500));

The poll cycle

Node::poll is the single drive method. The host’s executor passes a Context<'_>; the engine returns Poll::Pending when there is no work (stashing the waker so the next ingress push or timer maturity wakes the executor) and Poll::Ready(steps) when it has work to report.

// from bytesandbrains/bb-runtime/src/node/mod.rs:935-950
pub fn poll(
    &mut self,
    cx: &mut std::task::Context<'_>,
) -> std::task::Poll<Vec<crate::engine::EngineStep>> {
    let steps = self.engine.poll();
    if let Some(tap) = self.telemetry_tap.as_mut() {
        for step in &steps {
            tap(step);
        }
    }
    if steps.is_empty() {
        self.engine.ingress.register_waker(cx.waker());
        return std::task::Poll::Pending;
    }
    std::task::Poll::Ready(steps)
}

Inside, Engine::poll runs eight phases in a fixed order. Each phase is a no-op when its source queue is empty. The phases are:

  1. Drain ingress. Pop every IngressEvent the queue holds, dispatch each variant to its handler. Inbound envelopes route to route_envelope. Completions route to handle_completion. App events seed module input sites. Invokes batch-seed several inputs under a single ExecId. Off-thread timer matures advance the scheduler clock.
  2. Drain the frontier. Pop each fireable (OpRef, ExecId) pair and call invoke_one. Each invocation produces one EngineStep. The loop breaks when the frontier empties or when cycle_op_budget is hit.
  3. Route bus events. For each NodeEvent on the bus, look up the subscribed NodeSiteIds, write a TriggerValue to each one at a fresh ExecId, and push the site’s downstream consumers onto the frontier.
  4. Poll matured timers. Walk the scheduler heap, dispatch each matured TimerKind to its handler (Sleep and Completion resume a parked CommandId; Interval reschedules itself).
  5. Expire deadlines and drain pending completions. The engine scans pending_async for entries past their deadline_ns and fails them through the typed OpFailed path. Then it drains every PendingCompletion produced by in-cycle hooks and calls handle_completion for each.
  6. Final frontier drain. Cascades produced by Phases 3 to 5 walk to quiescence (subject to the same op budget).
  7. Fire lifecycle phases. Any phase the host queued via Engine::fire_lifecycle pushes its enrolled ops onto the frontier at a fresh ExecId and cascade-drains them in this same cycle.
  8. Drain the outbound queue. Every queued WireEnvelope becomes one EngineStep::SendEnvelope. The host’s transport adapter picks an address from each envelope’s dest_peer_addresses list and ships the bytes.

Engine::poll no longer auto-seeds bootstrap. Install records every Module bootstrap on BootstrapState::install_order and every Component bootstrap on BootstrapState::component_bootstraps without arming the queue. The host kicks the queue via Node::run_bootstrap(BootstrapTarget) (covered below). Between phases 5 and 6 the engine runs an RTT-tracker liveness scan, and the BootstrapComplete or WaitingOnBootstrap step lands at the end of the cycle that drained the last bootstrap op for the in-flight target.

Host-driven bootstrap

Pre-0.3.0 designs spelled bootstrap as a lifecycle phase. The 0.3.0 engine treats bootstrap as a host-driven phase with two surfaces side by side. Module bootstraps record a FunctionProto body the engine dispatches through the standard OpDispatch::FunctionCall path. Component bootstraps run user Rust through the bb::Bootstrap Contract method (Roles and Contracts covers the trait shape). Both surfaces share a single BootstrapState bundle on the engine, and both fire only when the host kicks the queue.

BootstrapState

Engine consolidates every bootstrap field into a single struct (bb-runtime/src/engine/bootstrap.rs:246-299):

pub(crate) struct BootstrapState {
    /// Per-target Module bootstrap metadata. Stamped at install
    /// when a `module_phase = bootstrap` FunctionProto lands.
    module_bootstraps: HashMap<String, ModuleBootstrap>,

    /// Per-slot Component bootstrap metadata. Populated by the
    /// `BootstrapDispatcherRegistration` install walk.
    component_bootstraps: HashMap<String, ComponentBootstrap>,

    /// Append-only sequence of Module bootstrap target names in
    /// install order. The seeder walks front-to-back.
    install_order: Vec<String>,

    /// Host-supplied bootstrap input staging requests parked
    /// awaiting a conflict-free slot.
    pending_requests: VecDeque<OwnedBootstrapRequest>,

    /// Currently executing bootstraps. Vec shape supports
    /// concurrent disjoint Component bootstraps.
    in_flight: Vec<InFlightBootstrap>,

    /// Validated + staged bootstraps ready to fire once `in_flight`
    /// drains (overlap promotion path).
    waiting: VecDeque<QueuedBootstrap>,

    /// Seed pointer into `install_order`. Bumps each time
    /// `maybe_complete_bootstrap` observes a phase drained.
    next_idx: usize,

    /// Coarse "queue still has work" flag the body-op gate
    /// consults to skip the bootstrap path on idle cycles.
    pending: bool,
}

Replaces the pre-0.3.0 bootstrap_function_keys plus bootstrap_next_idx plus bootstrap_pending plus bootstrap_exec_id quartet. Every read and write of bootstrap state goes through BootstrapState; the engine borrows it as one field.

Module bootstrap recording

Module::bootstrap(&self, g: &mut Graph) is the Module trait’s optional second recording method:

// from bytesandbrains/bb-dsl/src/module.rs:117-128
pub trait Module {
    fn name(&self) -> &str;
    fn body(&self, g: &mut Graph);
    fn bootstrap(&self, _g: &mut Graph) {}
}

The compiler emits the bootstrap recording as a separate FunctionProto whose name suffixes the Module name with __bootstrap and stamps MODULE_PHASE_BOOTSTRAP onto the function’s MODULE_PHASE_KEY metadata. Install records each target’s FunctionKey on BootstrapState::install_order in the slice order the host passed targets (so multi-target installs see deterministic bootstrap ordering across partitions). Install does NOT seed the queue. The host kick is explicit so test fixtures, daemons, and orchestrators can stage input formals via the bootstrap request queue before the first body op runs.

Component bootstrap (the bb::Bootstrap Contract)

bb::Bootstrap is the Contract trait every #[derive(bb::Concrete)] type implements. The derive emits a default no-op impl so most concretes pay zero boilerplate; authors override when a Component needs to allocate buffer pools, mmap state, prime a calibration cache, or dial seed peers before its primary Contract methods run. The trait lives at bytesandbrains/bb-runtime/src/contracts/bootstrap.rs:

pub trait Bootstrap {
    type Error: std::error::Error + Send + Sync + 'static;

    fn bootstrap(&mut self, _ctx: &mut BootstrapCtx)
        -> Result<(), Self::Error>
    {
        Ok(())
    }
}

The host fires Component bootstraps explicitly through the slot the bind chain bound the concrete onto via Node::run_bootstrap(BootstrapTarget::Slots(&[slot, ...])). Disjoint Component bootstraps fire concurrently; the touch-set gate (below) parks only the touched components. See Roles and Contracts for the override walkthrough.

Host kick: run_bootstrap

One host entry point drives the queue (bb-runtime/src/node/mod.rs):

pub fn run_bootstrap(
    &mut self,
    target: BootstrapTarget<'_>,
) -> Result<Vec<EngineStep>, BootstrapError>;

BootstrapTarget is the open enum that selects which bootstraps to fire:

pub enum BootstrapTarget<'a> {
    /// Drive every install-order Module bootstrap target on this Node.
    All,
    /// Drive specific Module bootstrap targets by name (with empty inputs).
    ModuleNames(&'a [&'a str]),
    /// Drive Module bootstrap targets with explicit inputs.
    ModuleRequests(&'a [BootstrapRequest]),
    /// Drive Component bootstraps by slot name.
    Slots(&'a [&'a str]),
}

The variant chosen routes through one of the following paths:

  1. BootstrapTarget::All arms the install-order queue, seeds the first Module target, and drives Engine::poll in a loop until every queued bootstrap drains (or one suspends on async). Returns the full Vec<EngineStep> the bootstrap path emitted. Idempotent: a Node whose queue already drained returns an empty vec. Snapshot restore deliberately clears the pending flag so a restored Node does not re-fire bootstrap.

  2. BootstrapTarget::ModuleRequests(&[BootstrapRequest]) is the batch entry for Module bootstraps with input formals. Each BootstrapRequest { target: &str, inputs: &[(&str, &[u8])] } stages owned-form bytes into the target’s input sites via the immediate-fire path. The engine validates the batch atomically up front (unknown target or duplicate batch entry surfaces BootstrapError::UnknownTarget or BootstrapError::AlreadyTransitivelyQueued before any staging happens). BootstrapTarget::ModuleNames(&[&str]) is the sugar form that fills empty-input requests for the no-formals case.

  3. BootstrapTarget::Slots(&[&str]) fires Component bootstraps by slot name. The engine resolves slot → ComponentRef via the bootstrap.component_bootstraps registry the install walk populated and dispatches through the per-T Bootstrap dispatcher.

BootstrapRequest and input staging

BootstrapRequest carries the target name plus the input formals the host supplies. The shape:

pub struct BootstrapRequest<'a> {
    pub target: &'a str,
    pub inputs: &'a [(&'a str, &'a [u8])],
}

run_bootstrap(BootstrapTarget::ModuleRequests(_)) validates the batch in one pass before any bytes stage. BootstrapError carries the per-call rejection variants:

  • UnknownTarget { target, available } rejects a name absent from BootstrapState::install_order.
  • AlreadyTransitivelyQueued { target } rejects a re-staging of a target the queue already holds. Cycle defense.
  • UnknownInput { target, input, declared } rejects an input name the target’s bootstrap recording did not declare via g.input.
  • MissingInput { target, input } rejects a gap when a declared formal has no matching entry in inputs.

Validated requests follow the Principle 1a copy (try_charge then try_reserve_exact then extend_from_slice, bb-runtime/src/engine/core.rs:1534-1567) and the framework-owned BytesValue carriers land in the bootstrap’s slot table entries at the body’s fresh ExecId. The caller’s borrowed &[u8] slices may drop the moment run_bootstrap returns. A bootstrap that takes no formals records zero g.input calls; the host kicks it via Node::run_bootstrap(BootstrapTarget::All) or Node::run_bootstrap(BootstrapTarget::ModuleNames(&["<target>"])).

Per-component gate: is_op_locked

While a bootstrap holds an in-flight entry, the engine gates body-phase ops by the bootstrap’s touch set: the closure of every ComponentRef the bootstrap body reaches. The gate runs in Engine::is_op_locked (bb-runtime/src/engine/core.rs:1762-1806). Resolution order:

  1. No in-flight bootstraps fires the op (gate dormant).
  2. The op’s ExecId descends from some in-flight bootstrap’s ExecId via the pending_calls.parent_exec_id chain. The op fires. The bootstrap body itself and its sub-FunctionCalls invoke ops freely.
  3. Resolve the touched ComponentRef from the op’s NodeProto via SLOT_ID_KEY → slot_id_to_cref. If the touched cref sits in some in-flight bootstrap’s touch_set, the op parks on the frontier. Stateless syscalls (no slot_id stamp, no role binding) fire because they reach no component.

pop_frontier_fireable (bb-runtime/src/engine/core.rs:2096-2110) scans the frontier for the first entry the gate accepts; parked ops stay on the frontier until the in-flight set drops. Disjoint components keep firing during a bootstrap.

The touch set is computed once at install time by Engine::compute_touch_set (bb-runtime/src/engine/core.rs:1145-1196): walk the bootstrap function body, read every NodeProto’s slot_id stamp into the set, and recurse into every FunctionCall callee. The result stamps onto BootstrapState::module_bootstraps[name].touch_set before any bootstrap fires. Gate-time is_op_locked reads the pre-stamped set in O(1).

Conflict queue and concurrent in-flight bootstraps

BootstrapState::process_pending_requests (bb-runtime/src/engine/bootstrap.rs:459-497) drains the parked request queue once per poll. For each request the engine looks up the target’s touch set and compares against every currently in-flight bootstrap’s touch_set. Disjoint targets surface as ReadyBootstraps the engine seeds immediately; overlapping ones park as QueuedBootstraps in bootstrap.waiting.

BootstrapState::on_bootstrap_drained (bb-runtime/src/engine/bootstrap.rs:499-522) retires the in-flight entry by ExecId, then walks waiting once and promotes any waiter whose touch set no longer conflicts. Engine::maybe_complete_bootstrap (bb-runtime/src/engine/core.rs:1824-1887) is the call site: it runs after every drain phase, advances next_idx, and fires promoted waiters in-cycle so the host sees every BootstrapComplete step and the body’s first ops in a single poll when the budget permits.

BootstrapStatus

Node::bootstrap_status() (bb-runtime/src/node/mod.rs:925-927) returns BootstrapStatus::{Idle, Running, WaitingForInput} without advancing any queue. Running means at least one entry occupies bootstrap.in_flight. WaitingForInput means the install-order queue still has unseeded targets or host-staged requests sit on pending_requests / waiting. Idle otherwise. The host consults this when deciding whether to keep polling or surface a prompt for more inputs.

The two queues

The frontier and the ingress are deliberately separate data structures with separate concurrency models.

The frontier is a single-threaded VecDeque<(OpRef, ExecId)>. The engine pops the head, invokes the op, writes its outputs, and pushes every downstream consumer whose inputs are now all ready. No locking, no atomics, no allocation per push after warm-up. FIFO ordering is load-bearing: two ops that become ready in the same cycle fire in the order they were pushed.

The ingress is a lock-free MPMC queue plus an AtomicWaker slot. External threads push IngressEvents; the engine’s Phase 1 drains them on the next poll. The lock-free push path costs one CAS plus an atomic store plus one waker swap. The Node’s only Send + Sync seam to other threads is the Arc<IngressQueue> handle. Everything else lives behind the single-threaded Engine.

// from bytesandbrains/bb-runtime/src/ingress.rs:182-252
pub struct IngressQueue {
    queue: ConcurrentQueue<IngressEvent>,
    waker: AtomicWaker,
    dropped_overflow: AtomicU64,
    completion_result_cap: AtomicUsize,
}

impl IngressQueue {
    pub fn new() -> Self {
        Self::with_capacity(DEFAULT_INGRESS_CAPACITY)
    }

    #[allow(clippy::result_large_err)]
    pub fn push(&self, event: IngressEvent) -> Result<(), IngressEvent> {
        match self.queue.push(event) {
            Ok(()) => {
                self.waker.wake();
                Ok(())
            }
            Err(PushError::Full(ev)) => {
                self.dropped_overflow.fetch_add(1, Ordering::Relaxed);
                Err(ev)
            }
            Err(PushError::Closed(ev)) => Err(ev),
        }
    }
}

DEFAULT_INGRESS_CAPACITY is bus_capacity * 4 (4096 with the spec default). Overflow returns the event back to the caller and bumps the dropped_overflow counter; the transport adapter decides whether to retry, drop with a metric, or escalate as back-pressure. drain_all() is what Phase 1 calls. register_waker is what Node::poll calls when it returns Pending. len() is approximate (the MPMC queue’s len is a snapshot).

Off-thread producers reach the queue through Node::ingress_handle(), which returns an IngressQueueRef cheap-clone wrapper:

// from bytesandbrains/bb-runtime/src/node/mod.rs:682-684
pub fn ingress_handle(&self) -> crate::ingress::IngressQueueRef {
    crate::ingress::IngressQueueRef::new(self.engine.ingress_queue_handle())
}

A transport adapter on its own thread holds an IngressQueueRef, receives bytes from the socket, decodes them into a WireEnvelope, and pushes IngressEvent::EnvelopeFrom { src_peer, envelope }. The push wakes the engine; the next poll cycle drains and routes.

IngressEvent variants

Every external entry point pushes one of eight variants onto the ingress:

// from bytesandbrains/bb-runtime/src/ingress.rs:47-158
pub enum IngressEvent {
    EnvelopeFrom {
        src_peer: crate::ids::PeerId,
        envelope: crate::envelope::WireEnvelope,
        src_observed_address: Option<crate::framework::Address>,
    },
    AppEvent {
        module_name: String,
        input_name: String,
        value_bytes: Vec<u8>,
    },
    TimerMatured {
        at_ns: u64,
    },
    Invoke {
        module_name: String,
        inputs: Vec<(String, Vec<u8>)>,
        exec_id: crate::ids::ExecId,
    },
    Completion {
        cmd_id: CommandId,
        results: Vec<Vec<u8>>,
    },
    CompletionFailed {
        cmd_id: CommandId,
        detail: String,
    },
    SendFailed {
        wire_req_id: u64,
        peer: Vec<u8>,
        reason: &'static str,
    },
    AppIngressError {
        source: AppIngressSource,
        byte_count: usize,
        kind: AppIngressErrorKind,
    },
}

EnvelopeFrom carries an inbound wire envelope plus the transport’s identified source peer and an optional reflexive src_observed_address (the dialer endpoint the adapter actually saw). The in-process router populates the observed field via IngressEvent::from_in_process(src_peer, envelope) at bb-runtime/src/ingress.rs:167-177; production transports source it from whatever NAT-traversed multiaddr the connection negotiated. Phase 1 of the poll cycle merges both the sender-claimed envelope.src_peer_addresses bag and the observed address into the receiver’s AddressBook entry for src_peer (bb-runtime/src/engine/poll.rs:497-505,1005-1062) so future replies can dial back on any reachable interface. The engine routes each SlotFill by the multiaddr in its dest_suffix. AppEvent and Invoke are the host-side data entry points; the convenience methods deliver_event and invoke on Node wrap the push and verify the module name exists. TimerMatured is an off-thread clock hook that advances the scheduler when the host owns its own clock. Completion and CompletionFailed are the async-Contract paths: a worker thread or remote RPC reports the result of a ContractResponse::Later dispatch. SendFailed lets a transport adapter report an outbound delivery failure without waiting for the engine’s per-poll TTL drain to evict the request-tracker entry. AppIngressError is the cross-thread bridge for off-thread sinks (notably CompletionSink::complete when the per-completion byte cap fires) that lack direct &mut bus access. The engine drains the variant and publishes a matching InfraEvent::AppIngressError so subscribers see the rejection; synchronous Node::deliver_event and Node::invoke publish directly without routing through this variant.

The three high-traffic variants in production are EnvelopeFrom, Completion, and CompletionFailed. The wire and async paths run through them. The host-side entry points (deliver_event, invoke) push AppEvent and Invoke from the application thread that owns the data.

Op invocation and dispatch_atomic

Each frontier pop calls Engine::invoke_one. The engine looks at the op’s pre-stamped OpDispatch (one of four variants) and routes the call. The two variants that dominate at runtime are Stateless for framework syscalls and Atomic { component_ref } for every Contract-method op.

Atomic dispatch is the load-bearing surface for every bound concrete. Per Roles and Contracts the framework speaks to user code through seven method-style Contract trait families (Index, Aggregator, Model, Codec, DataSource, PeerSelector, Backend) plus the Protocol slot, which authors register with the declarative bb::register_protocol!{} macro rather than a Contract trait. The #[derive(bb::<Role>)] macros bridge each Contract impl to the engine-internal <Role>Runtime trait. dispatch_atomic is the entry point that bridge emits.

When the engine fires an atomic op:

  1. It takes the dispatching component out of Engine.components via take_component (which is Option::take on the slot) so a live ComponentsView can hand sibling components to the user’s method while the dispatching slot is held exclusively.
  2. It builds the per-call RuntimeResourceRef<'_> by split-borrowing the framework primitives (scheduler, request tracker, peer governor, address book, outbound queue), the bus, the ingress, and the per-call context (OpRef, ExecId, self_peer, the next_command_id allocator).
  3. It calls the bridge’s dispatch_atomic arm, which downcasts each input &dyn SlotValue to the expected primitive, opens a CompletionHandle via ctx.open_completion::<R, E>(), and calls the user’s Contract method.
  4. It interprets the returned ContractResponse:
    • ContractResponse::Now(Ok(value)) translates to DispatchResult::Immediate(bincode::serialize(value)). The engine writes the bytes to the op’s output sites at the current ExecId and pushes ready consumers onto the frontier.
    • ContractResponse::Now(Err(e)) becomes the dispatch error and fails the op through the OpFailed path.
    • ContractResponse::Later translates to DispatchResult::Async(handle.cmd_id()). The engine records the op in pending_async, emits EngineStep::AsyncSuspended { op_ref, exec_id, cmd_id }, and leaves the op parked until a Completion lands on the ingress.

After the call returns, the engine drains ctx.pending_completions (in-cycle hooks that called ctx.complete_command) onto the engine’s pending_completions queue and restores the dispatching component into Engine.components so the next op can dispatch.

CompletionHandle and async dispatch

CompletionHandle<R, E> is the type the Contract method holds while its work is in flight. It carries the CommandId the engine parked the op behind and a shared Arc<dyn CompletionSink> that receives the serialized result.

// from bytesandbrains/bb-runtime/src/completion.rs:23-66
pub struct CompletionHandle<R, E> {
    cmd_id: CommandId,
    sink: Arc<dyn CompletionSink>,
    _marker: PhantomData<fn() -> (R, E)>,
}

impl<R, E> CompletionHandle<R, E> {
    pub fn new(cmd_id: CommandId, sink: Arc<dyn CompletionSink>) -> Self {
        Self {
            cmd_id,
            sink,
            _marker: PhantomData,
        }
    }

    pub fn cmd_id(&self) -> CommandId {
        self.cmd_id
    }
}

impl<R, E> CompletionHandle<R, E>
where
    R: serde::Serialize,
    E: std::fmt::Display,
{
    pub fn complete(self, result: Result<R, E>) {
        match result {
            Ok(value) => {
                let bytes = bincode::serialize(&value).unwrap_or_default();
                self.sink.complete(self.cmd_id, &bytes);
            }
            Err(e) => {
                let detail = e.to_string();
                self.sink.fail(self.cmd_id, &detail);
            }
        }
    }
}

The handle is Send + Sync because the inner Arc<dyn CompletionSink> is. Contract methods can ship it across thread boundaries freely. bb-runtime implements CompletionSink on the Node’s IngressQueue. The sink takes the bincode-serialized bytes as a borrowed &[u8] and the failure detail as a borrowed &str (Principle 1a: the sink copies into framework-owned storage before returning), then pushes IngressEvent::Completion or IngressEvent::CompletionFailed. The engine drains those on the next poll cycle and unparks the suspended op.

The ContractResponse<R, E> discriminant on the Contract method’s return type lets the implementation pick between an inline result that skips the park cycle and a deferred result that uses the handle:

// from bytesandbrains/bb-runtime/src/completion.rs:71-76
pub enum ContractResponse<R, E> {
    Now(Result<R, E>),
    Later,
}

The worker-thread pattern is the canonical structure for compute-heavy or non-blocking work. The Contract method captures the CompletionHandle onto a work item, ships it to a worker via an mpsc channel, and returns Later. The worker does the work and calls completion.complete(...) from off the engine thread; the result lands back on the ingress as IngressEvent::Completion. The custom_index_hnsw example wires this end to end:

// from bytesandbrains/examples/custom_index_hnsw.rs:193-234
impl Index for HnswIndex {
    type Vector = [f32];
    type Error = HnswError;

    fn add(
        &mut self,
        _ctx: &mut bytesandbrains::runtime::RuntimeResourceRef<'_>,
        vec: &Self::Vector,
        completion: CompletionHandle<u64, Self::Error>,
    ) -> ContractResponse<u64, Self::Error> {
        self.send(WorkItem::Add {
            vec: vec.to_vec(),
            completion,
        });
        ContractResponse::Later
    }

    fn search(
        &self,
        _ctx: &mut bytesandbrains::runtime::RuntimeResourceRef<'_>,
        query: &Self::Vector,
        k: u32,
        completion: CompletionHandle<Vec<(u64, f32)>, Self::Error>,
    ) -> ContractResponse<Vec<(u64, f32)>, Self::Error> {
        self.send(WorkItem::Search {
            query: query.to_vec(),
            k,
            completion,
        });
        ContractResponse::Later
    }

    fn remove(
        &mut self,
        _ctx: &mut bytesandbrains::runtime::RuntimeResourceRef<'_>,
        _id: u64,
        completion: CompletionHandle<(), Self::Error>,
    ) -> ContractResponse<(), Self::Error> {
        self.send(WorkItem::Remove { completion });
        ContractResponse::Later
    }
}

The matching worker side calls completion.complete(Ok(value)) from its own thread once the work finishes. The push routes through the ingress queue the framework already wired into the handle; no shared state crosses the boundary beyond the queue itself.

For an immediate result the implementation returns ContractResponse::Now(Ok(value)) and drops the handle unused. The engine never registers the op in pending_async and writes the result directly into the op’s output sites.

What EngineStep carries

Engine::poll returns a Vec<EngineStep> per cycle. The enum has seventeen variants; each one is something the host can observe and react to. The complete list, in declaration order:

// from bytesandbrains/bb-runtime/src/engine/step.rs:13-194
pub enum EngineStep {
    OpCompleted        { op_ref, exec_id, sites_written },
    AsyncSuspended     { op_ref, exec_id, cmd_id },
    SendEnvelope       (WireEnvelope),
    AppEvent           { module_name, topic, value_bytes },
    LifecycleFired     { phase },
    BootstrapComplete,
    WaitingOnBootstrap,
    OpFailed           { op_ref, exec_id, error },
    CycleBudgetExceeded { ops_invoked },
    OutboundDropped    { count },
    WireDecodeFailed   { hash, payload_size, detail },
    WireReceiveFailed  { src_peer, fill_index, actual_hash,
                         payload_size, kind },
    WireTimeout        { wire_req_id, target_site,
                         started_at_ns, parked_op },
    PeerBlocked        { peer, reason },
    PeerDown           { peer },
    PeerUp             { peer },
    PeerResolveFailed  { peer, op_ref, exec_id },
}

Grouped by purpose:

Op outcomes (four). OpCompleted { op_ref, exec_id, sites_written } is the synchronous-completion path: the op fired, wrote its outputs, and pushed downstream consumers. AsyncSuspended { op_ref, exec_id, cmd_id } is the parking path: the op returned ContractResponse::Later and is now waiting on the named cmd_id. OpFailed { op_ref, exec_id, error } carries a typed OpError from either a synchronous failure (ContractResponse::Now(Err(_))) or a deadline-expired pending_async entry; the same context is mirrored on the bus as InfraEvent::OpFailure. LifecycleFired { phase } fires when the host queued a lifecycle phase via Engine::fire_lifecycle; the engine drained that phase’s enrolled ops in this cycle and surfaces the phase name (such as "PreShutdown") so the host can hook teardown.

Bootstrap signals (two). BootstrapComplete fires at most once per Node lifetime, on the cycle that drained the last bootstrap op. WaitingOnBootstrap is the suspension signal: at least one bootstrap-phase op returned async, and the engine is gating body work until the matching completion lands on the ingress.

App-side I/O (two). SendEnvelope(WireEnvelope) is the outbound shipment path: the transport adapter picks one entry from envelope.dest_peer_addresses and writes the bytes. AppEvent { module_name, topic, value_bytes } carries a function.output landing or a mid-cycle AppEmit / AppNotify; value_bytes is empty for marker-only notifications.

Back-pressure surfaces (four). CycleBudgetExceeded { ops_invoked } is the soft fairness yield, emitted at most once per poll. OutboundDropped { count } is the OutboundQueue head-eviction count since the previous poll, also emitted at most once. WireDecodeFailed { hash, payload_size, detail } is the inbound payload-decode telemetry hook; the envelope’s slot fill was dropped before any user code ran. WireReceiveFailed { src_peer, fill_index, actual_hash, payload_size, kind } mirrors InfraEvent::WireReceiveError for per-fill typed-decode failures (allocation, budget, or wire-type mismatch). Other fills in the same envelope still deliver: this is the partial-delivery telemetry hook.

Wire and peer health (five). WireTimeout { wire_req_id, target_site, started_at_ns, parked_op } fires when RequestTracker::drain_stale evicts an in-flight chain entry whose per-entry TTL elapsed without a matching response; the originator’s local DAG continuation parked behind parked_op is failed with “chain timeout”. PeerBlocked { peer, reason: BlockReason } fires on the inbound side when PeerGovernor::check_inbound rejects an envelope before any slot is written; the BlockReason is one of Blocklisted, NotAllowlisted, or Cooldown { retry_ns }. PeerDown { peer } and PeerUp { peer } are the consecutive-failure transitions tracked by PeerGovernor::record_failure / record_success; each fires at most once per crossing. PeerResolveFailed { peer, op_ref, exec_id } fires when a wire::Send cannot resolve its destination’s addresses against the AddressBook: either the peer is unknown, its address list is empty, or the Send op’s peer input did not carry a valid PeerId. The Send produces no envelope; the host reacts via this event and the mirrored InfraEvent::PeerResolveFailure.

The full enum, with the variant-level field docs, lives at bytesandbrains/bb-runtime/src/engine/step.rs:13-194.

App-side entry points and introspection

The host pushes external work through three convenience methods on Node plus the off-thread ingress_handle for transport adapters:

// from bytesandbrains/bb-runtime/src/node/mod.rs:958-1074
pub fn deliver_inbound(
    &mut self,
    src_peer: crate::ids::PeerId,
    bytes: &[u8],
) -> Result<(), crate::errors::delivery::DeliveryError>;

pub fn deliver_event(
    &mut self,
    module: &str,
    input: &str,
    value_bytes: &[u8],
) -> Result<(), crate::errors::delivery::DeliveryError>;

pub fn invoke(
    &mut self,
    module: &str,
    inputs: &[(&str, &[u8])],
) -> Result<crate::ids::ExecId, crate::errors::delivery::DeliveryError>;

Every external boundary takes a borrowed slice. The framework copies the bytes into a framework-owned Vec<u8> inside the call. The caller may free the slice the moment the call returns. Many transport stacks (libp2p reads, QUIC io_uring completion buffers, DPDK queues) hand the framework a pointer into THEIR memory. The framework must not keep a reference past the call (the transport reclaims the buffer immediately on return) and must not take ownership (the transport’s allocator owns the buffer, not the framework’s). The copy IS the ownership transition.

deliver_inbound decodes the wire bytes through EnvelopeCodec::decode_capped (respecting the envelope_caps cap list) and pushes EnvelopeFrom. deliver_event pushes a single AppEvent onto the named module’s named input. invoke allocates a fresh ExecId, pushes Invoke with the supplied inputs, and returns the ExecId so the host can correlate downstream AppEvent / OpCompleted steps back to the call.

Ingress boundaries and allocation safety

External boundaries entering the engine never panic on allocation failure or budget exhaustion. The runtime caps the payload against the matching NodeConfig knob, charges the cumulative byte total against Engine::ingress_byte_budget, and try_reserve_exacts framework-owned storage. Any failure returns a typed DeliveryError synchronously and publishes a matching InfraEvent::AppIngressError (host-pushed surface) or InfraEvent::WireReceiveError (wire surface) on the bus. The offending bytes drop at the boundary, the engine continues processing other envelopes and events, and the parked op the failure interrupts times out naturally.

The per-source caps and the cumulative ingress budget form a two-level defence:

// from bytesandbrains/bb-runtime/src/node/config.rs:51-72,165-195
pub const DEFAULT_INGRESS_BYTE_BUDGET: usize = 256 * 1024 * 1024;
pub const DEFAULT_MAX_APP_EVENT_BYTES: usize = 1024 * 1024;
pub const DEFAULT_MAX_INVOKE_INPUTS: usize = 100;
pub const DEFAULT_MAX_INVOKE_BYTES: usize = 10 * 1024 * 1024;
pub const DEFAULT_MAX_COMPLETION_RESULT_BYTES: usize = 4 * 1024 * 1024;

ingress_byte_budget is the total in-flight ingress bytes the engine holds across the ingress queue plus the slot table plus pending async completion buffers. Default 256 MiB; the NodeConfig::edge() preset tightens to 8 MiB for resource-bound devices. max_app_event_bytes is the per-call cap for deliver_event. max_invoke_inputs plus max_invoke_bytes cap the count and cumulative payload of a single invoke. max_completion_result_bytes caps the CompletionSink::complete result. Each per-source cap rejects oversize single payloads with DeliveryError::OversizePayload; the cumulative budget guards sustained load with DeliveryError::BudgetExceeded. Boundary callers try_charge before installing a payload, and the slot-table writer releases the charge through SlotValue::charged_bytes on overwrite or eviction (bb-runtime/src/engine/core.rs:540-552). Hot-path cost is one saturating-add plus one comparison per fill admission.

CompletionSink::complete and fail follow the same shape:

// from bytesandbrains/bb-runtime/src/runtime.rs:29-97
impl CompletionSink for IngressQueue {
    fn complete(&self, cmd_id: CommandId, result_bytes: &[u8]) {
        // cap-check, charge ingress_byte_budget, try_reserve_exact,
        // extend_from_slice, push IngressEvent::Completion
    }
    fn fail(&self, cmd_id: CommandId, detail: &str) {
        // truncate at COMPLETION_DETAIL_CAP (4 KiB) on a UTF-8
        // character boundary, push IngressEvent::CompletionFailed
    }
}

The bytes never linger as a borrowed slice past the call. The detail string is truncated rather than rejected so the host’s display message always lands. Failures publish InfraEvent::AppIngressError { source: Completion { command }, ... } and drop. The parked op times out naturally on the host side, which is the same surface as a missing completion.

Inside the engine, normal Rust allocation patterns apply. Components are designed to play by the runtime contract: a Vec::push that runs out of memory inside a Contract body is a process abort, not a framework-handled failure. The fallibility line lives at the boundary itself.

Introspection lives on the same surface:

// from bytesandbrains/bb-runtime/src/node/mod.rs:501-538
pub fn incarnation(&self) -> u64;
pub fn loaded_modules(&self) -> Vec<&str>;
pub fn linked_components(&self) -> Vec<&ComponentHandle>;
pub fn peer_id(&self) -> crate::ids::PeerId;
pub fn execution_state(
    &self,
    exec_id: crate::ids::ExecId,
) -> Option<&crate::engine::ExecutionState>;
pub fn pending_async_count(&self) -> usize;
pub fn engine_stats(&self) -> crate::engine::EngineStats;

engine_stats() returns the hot-path counters: frontier length, bus length, pending-async count, slot-table occupancy, ingress depth, outbound-queue depth, event-subscription count, registered component count, and graph-slot count. The numbers are useful for the host’s observability dashboard and for deciding when to throttle the input side.

set_telemetry_tap(closure) installs a per-EngineStep observer that the Node invokes once per produced step. The tap runs inside poll, so the host can wire it to a structured logger without touching the returned vec. The closure type is FnMut(&EngineStep) + 'static.

The Node also exposes peer-governor controls (block_peer, unblock_peer, set_allowlist), address-book mutators (add_peer, drop_peer), local-address accessors (local_addresses, add_local_address, forget_local_address), and read-only views on the peer health (peer_health, resolve_peer_addresses).

PeerState, PeerHealth, and the block reasons

The engine’s per-peer bookkeeping lives in a single bundle on FrameworkComponents. PeerState collapses four sub-primitives under one field so component authors reach them through ctx.peer_state.{gate, governor, backoff, backpressure} instead of four siblings on the framework bundle:

// from bytesandbrains/bb-runtime/src/framework/peer_state.rs:21-37
pub struct PeerState {
    pub gate: PeerGate,
    pub governor: PeerGovernor,
    pub backoff: BackoffTable,
    pub backpressure: BackpressureTracker,
}

PeerGate is the named concurrency limiter consulted by Limit.Acquire / Limit.Release syscalls. PeerGovernor is the single source of truth for blocklist / allowlist policy plus per-peer health tracking; the engine’s Phase 1 envelope router calls check_inbound, and the compiler-inserted PeerHealthGate syscall op upstream of every wire::Send calls check_outbound. BackoffTable is the per-peer exponential backoff schedule the outbound gate consults at check_outbound time. BackpressureTracker is the receiver-side per-peer state for the backpressure protocol: it tracks notices emitted to each sender, the duplicate-suppression window, and the K-then-silent fallback consulted at Phase 1 when ingress depth crosses the high-water mark.

PeerHealth is the per-peer snapshot the governor publishes:

// from bytesandbrains/bb-runtime/src/framework/peer_governor.rs:52-65
pub struct PeerHealth {
    pub consecutive_failures: u32,
    pub last_event_ns: u64,
    pub down: bool,
}

The default failure threshold is 5 (DEFAULT_FAILURE_THRESHOLD: u32 = 5 at bytesandbrains/bb-runtime/src/framework/peer_governor.rs:33): five consecutive record_failure calls without an intervening record_success mark down = true and produce a LifecycleTransition::WentDown, which the engine surfaces as EngineStep::PeerDown. A subsequent success after a down streak produces LifecycleTransition::CameUp and surfaces as EngineStep::PeerUp. NodeConfig does not currently expose a threshold override; production tuning is via the PeerGovernor::with_failure_threshold builder during construction.

BlockReason is the rejection taxonomy that rides on PeerBlocked { peer, reason }:

// from bytesandbrains/bb-runtime/src/framework/peer_governor.rs:36-49
pub enum BlockReason {
    Blocklisted,
    NotAllowlisted,
    Cooldown { retry_ns: u64 },
}

Blocklisted is the explicit block_peer deny. NotAllowlisted fires when an allowlist is configured (set_allowlist(Some(_))) and the peer is not on it. Cooldown { retry_ns } is the outbound gate’s failure-driven back-off; the inbound path never produces Cooldown because the gate that consults BackoffTable only sits upstream of wire::Send.

BackoffTable is exponential with a configurable base and cap. The defaults match a fast initial retry that caps out at a minute:

// from bytesandbrains/bb-runtime/src/framework/backoff_table.rs:22-27
pub const DEFAULT_BASE_NS: u64 = 10_000_000;
pub const DEFAULT_MAX_DELAY_NS: u64 = 60_000_000_000;

10 ms base, 60 s cap. The schedule is delay(n) = min(BASE_NS * 2^n, MAX_DELAY_NS). A peer’s BackoffState carries attempts, last_attempt_ns, and next_retry_ns; should_retry(peer, now_ns) returns true when no failure was ever recorded or when now_ns >= state.next_retry_ns. A record_success removes the entry entirely so the next attempt proceeds without a cooldown.

Deadline propagation: strategy and lineage

A request that touches three peers carries the latency tail of all three. Fixed per-call timeouts paper over the variance: pick a number that catches the slow path and the fast path starves; pick one that fits the fast path and the slow path fails healthy work. Dean and Barroso (The Tail at Scale, CACM 2013) called this the central scaling problem in distributed systems and argued for per-call deadlines that flow through the call graph the same way distributed-tracing spans do. The engine takes that posture end-to-end: every wire.Send carries a deadline, every receiver inherits a remaining budget, and the engine reaps anything that runs past its allotted nanoseconds.

The lineage begins at compile time with a graph walk over the cross- node DAG. derive_wire_deadlines iterates every function and every NodeProto, picks out each wire.Send, reads the optional chain_depth metadata (defaulting to 1 for single-hop sends), and stamps deadline_ns = chain_depth × per_hop_budget_ns as an attribute (bytesandbrains/bb-compiler/src/derive_wire_deadlines.rs:42-64

  • :73-88 for the idempotent upsert). At runtime the wire envelope carries the same budget across the network: the WireEnvelope.remaining_deadline_ns proto field is documented as Dapper-style deadline propagation (bytesandbrains/proto/bb_core.proto:92-98), the receiver unpacks it into the per-ExecId InboundContext on arrival (bytesandbrains/bb-runtime/src/engine/poll.rs:657-661), and a wire.Send forwarding inside a chain forwards the budget minus elapsed local time on the next envelope (bytesandbrains/bb-runtime/src/engine/poll.rs:776-789). Phase 5 of the poll cycle drains expired entries from pending_async, failing each through the typed OpFailed path with kind Timeout (bytesandbrains/bb-runtime/src/engine/poll.rs:99-121). The shape mirrors the per-span deadline propagation described in Sigelman et al., Dapper, a Large-Scale Distributed Systems Tracing Infrastructure (Google Technical Report, 2010): one budget is minted at the root, every hop inherits the remainder, and the leaf enforces by clock.

The runtime adaptive layer refines the compile-time number once the network has actually been observed. RttTracker::estimate_budget_ns walks five tiers (per-edge, per-site, per-chain, global prior, and the static NodeConfig.per_hop_budget_ns fallback) and returns the first warm EMA it finds (bytesandbrains/bb-runtime/src/framework/rtt_tracker.rs:283-321). The EMA itself is canonical Jacobson smoothing with α = 1/8 and β = 1/4 and a budget of SRTT + 4·RTTVAR (bytesandbrains/bb-runtime/src/framework/rtt_tracker.rs:82-122). The smoothing constants come from Karn and Partridge, Improving Round-Trip Time Estimates in Reliable Transport Protocols (ACM SIGCOMM 1987), later codified as RFC 6298. Three layers compose: compile-time deadlines give every deployment a sane baseline without runtime measurement; runtime EMA refines it once warm samples exist; engine expiry in Phase 5 is the final enforcer. The same code path carries the budget from the compiler pass that stamps it onto the IR, through the envelope that ships it to the next hop, and into the poll cycle that reaps anything past its time.

References

Backpressure: explicit signal, eventual silence

Backpressure under congestion is well-studied territory. Receivers that have no way to push back on overloaded senders end up dropping work silently, and senders that read silence as success amplify the overload by the next round. Welsh, Culler, and Brewer (SEDA, SOSP 2001) made the inverse case canonical: every stage exposes its queue depth, and upstream stages observe the depth and adapt their submission rate. Without that signal an overloaded service degrades from “slower” to “silently lossy” the moment the queue saturates. The engine takes the SEDA posture at the per-peer granularity: hop-local backpressure is a typed wire op, not a fall-through to ingress overflow.

The substrate implements the signal as a typed envelope and parks the receiver-side state on PeerState alongside the existing health primitives. BackpressureTracker is the receiver-side per-peer bookkeeping (bytesandbrains/bb-runtime/src/framework/backpressure_tracker.rs:133-138), joining gate, governor, and backoff as the fourth sub-field (bytesandbrains/bb-runtime/src/framework/peer_state.rs:21-37). Detection composes with two existing primitives. PhiAccrualState (see the deadline-propagation section above) supplies the BackoffCause::PhiAccrual trigger when a sender’s inter-arrival distribution flips to Suspect. IngressQueue depth tracking is snapshotted at Phase 1 entry on phase1_pre_drain_depth (bytesandbrains/bb-runtime/src/engine/poll.rs:197) so the per-envelope handler can compare the pre-drain depth against the configured high-water fraction without re-reading a queue that has already been drained to zero. The notice itself rides as a one-fill WireEnvelope whose type_hash matches backoff_notice_type_hash() (bytesandbrains/bb-runtime/src/framework/backpressure_notice.rs:153-155), built by build_backoff_notice_envelope (bytesandbrains/bb-runtime/src/framework/backpressure_notice.rs:166-195). The K-then-silent threshold defaults to 3 on DEFAULT_K_BEFORE_SILENT (bytesandbrains/bb-runtime/src/framework/backpressure_tracker.rs:50-56).

The K-then-silent design separates cooperative clients from adversarial or buggy ones with a small explicit budget. Cooperative clients receive up to K typed BackoffNotice envelopes; each one updates the sender’s BackoffTable via record_remote_advisory so the existing BackoffGateTx consultation gates the next outbound send. Adversarial or buggy clients that keep sending after K notices flip into silent-drop mode: the receiver’s Phase 1 envelope handler returns immediately on is_silent_drop_active and the envelope is dropped without further notice emission (bytesandbrains/bb-runtime/src/engine/poll.rs:485-492). The full decision lives in maybe_emit_backoff_notice (bytesandbrains/bb-runtime/src/engine/poll.rs:1074-1157), and the sender-side ingest path that decodes the notice and updates local state lives in ingest_backoff_notice (bytesandbrains/bb-runtime/src/engine/poll.rs:1008-1060). K = 3 was chosen to match the RttTracker warmth threshold of three samples at bytesandbrains/bb-runtime/src/framework/rtt_tracker.rs:126-128: an EMA is allowed to win the fallback hierarchy once three observations have arrived, and a sender is allowed three explicit notices before the receiver concludes the cooperation contract has broken. All three knobs are config-driven on NodeConfig (bytesandbrains/bb-runtime/src/node/config.rs:90-112) with the matching with_backpressure_* builders (bytesandbrains/bb-runtime/src/node/config.rs:204-227).

Composition reuses the framework’s existing observability and selection surfaces. A PeerSelector Component can read PeerGovernor state via ctx.peer_state.governor and bias toward healthy peers; the PeerHealth snapshot already carries consecutive_failures and down, and the backpressure path’s ingest_backoff_notice records a matching record_failure on the governor so a sender that keeps receiving notices crosses the 5-consecutive-failure threshold without any selector-side wiring. Two typed bus events surface the local overload decision: InfraEvent::BackoffNoticeSent { peer, cause, min_backoff_ns } fires on every emission and InfraEvent::SilentDropActive { peer } fires once when the K threshold is crossed (bytesandbrains/bb-runtime/src/bus.rs:125-148). Both events ride the same typed bus a PeerSelector already subscribes to, so the same EventSource wiring that surfaces PeerSuspect / PeerDown carries the backpressure signals into tracing without bespoke plumbing.

References

RttTracker EWMA constants and the φ-accrual detector

Adaptive deadlines for every wire round-trip come from framework::RttTracker, a per-NodeSiteId Jacobson EMA bank. The EMA itself is canonical RFC 6298 §2 with α = 1/8 and β = 1/4 (the “observe with default rates” call wraps a _with_alpha_beta call that uses alpha_shift = 3, beta_shift = 2):

// from bytesandbrains/bb-runtime/src/framework/rtt_tracker.rs:85-87
pub fn observe(&mut self, sample_ns: u64) {
    self.observe_with_alpha_beta(sample_ns, 3, 2);
}

The deadline derivation is the canonical Karn/Partridge recommendation, SRTT + 4·RTTVAR:

// from bytesandbrains/bb-runtime/src/framework/rtt_tracker.rs:119-122
pub fn budget_ns(&self) -> u64 {
    self.srtt_ns
        .saturating_add(self.rttvar_ns.saturating_mul(4))
}

An EMA is “warm” (allowed to win the fallback hierarchy) once it has at least three samples (bytesandbrains/bb-runtime/src/framework/rtt_tracker.rs:126-128). Below three the deadline falls through to a coarser tier. The global prior receives the same observations with a smaller learning rate, (alpha_shift = 6, beta_shift = 5), so noisy single-peer samples do not dominate the cross-runtime estimate (rtt_tracker.rs:362).

RttTracker::estimate_budget_ns walks five tiers in order, stopping at the first warm hit:

  1. Per-edge (site, chain_id, hop_index) exact match.
  2. Per-site aggregate Jacobson over every round-trip to this site.
  3. Per-chain prior aggregated across every peer that has carried this chain’s traffic.
  4. Global prior across every round-trip in the runtime.
  5. The static NodeConfig.per_hop_budget_ns fallback.

Tiers 1 through 4 are the same RttEma type at successively coarser scopes; tier 5 is the constant the host configured at construction.

The tracker also runs a φ-accrual failure detector per site, fed by the same observe_round_trip path. The default PhiAccrualState keeps a rolling history of 1000 inter-arrival samples and ratchets through three states:

// from bytesandbrains/bb-runtime/src/framework/rtt_tracker.rs:157-166
impl Default for PhiAccrualState {
    fn default() -> Self {
        Self {
            inter_arrival_history: std::collections::VecDeque::new(),
            history_capacity: 1000,
            threshold_phi: 8.0,
            down_phi: 16.0,
        }
    }
}

PhiState::Live (the default) is “healthy”. Suspect fires when the current φ crosses threshold_phi = 8.0; Down fires when φ crosses down_phi = 16.0. Between phases 5 and 6 of the poll cycle the engine calls RttTracker::scan_phi(now_ns); per-site state changes since the previous scan return as PhiTransition::Live, Suspect, or Down entries, which the engine maps onto the typed bus events InfraEvent::PeerLive, PeerSuspect, and PeerDown. The per-entry last_phi_state ratchet is what keeps the engine from re-emitting the same event on every poll while a peer stays silent.

Snapshot, restore, clear

The engine survives process restart through Node::snapshot and Node::restore. Snapshot captures the engine’s transient state plus every bound concrete’s per-component bytes:

// from bytesandbrains/bb-runtime/src/node/mod.rs:176-184
pub fn snapshot(&self) -> Result<crate::snapshot::NodeSnapshot, crate::errors::SnapshotError> {
    let queued = self.engine.bus.len();
    if queued > 0 {
        return Err(crate::errors::SnapshotError::BusNotDrained { queued, dropped: 0 });
    }
    Ok(self.snapshot_inner())
}

Snapshot refuses to proceed when the bus still carries un-drained events; the host drives Node::poll to quiescence first and retries. The captured NodeSnapshot round-trips through the host’s persistence layer (typically prost or bincode) and lands at Node::restore, which validates the spec version, reinstalls each captured graph, restores each component’s bytes via the inventory-registered restore_fn, replays the counters, the bus subscriptions, the address book, the peer governor, the backoff table, the pending-async entries, and the queued outbound envelopes, then bumps the incarnation counter. The restored Node resumes from exactly where the snapshot was taken.

Node::clear resets transient state without dropping bound components or installed graphs. Used in tests and for graceful recycling between test rounds.

A minimal poll loop

The end-to-end shape of a driven Node, copied from the canonical HNSW example:

// from bytesandbrains/examples/custom_index_hnsw.rs:307-336
let query: Vec<f32> = vec![2.0, 2.1, 2.2, 2.3];
let query_bytes = bincode::serialize(&query)?;
let _ = node.ingress_handle().push(IngressEvent::Invoke {
    module_name: artifact_module_name(&node),
    inputs: vec![("query".into(), query_bytes)],
    exec_id: bytesandbrains::ids::ExecId::from(0u64),
});

let waker = Waker::noop();
let mut cx = Context::from_waker(waker);
let mut total_steps = 0usize;
for _ in 0..40 {
    match node.poll(&mut cx) {
        std::task::Poll::Ready(steps) => {
            if steps.is_empty() {
                break;
            }
            total_steps += steps.len();
        }
        std::task::Poll::Pending => break,
    }
}

The example pushes an Invoke onto the ingress, then loops Node::poll against a no-op waker until the engine yields. Each ready cycle accumulates a vector of steps; the host’s production loop ships envelopes, surfaces app events, logs failures, and re-polls. The same drive pattern with a real waker on a tokio runtime is what the transport adapters use in production.

Run the example to see the async Index Contract dispatch end to end:

cargo run --example custom_index_hnsw --features test-components

Where this lives

  • bytesandbrains/bb-runtime/src/engine/core.rs: the Engine struct, the fields, the install-time setup methods, the dispatch registries.
  • bytesandbrains/bb-runtime/src/engine/poll.rs: the 8-phase poll cycle, the ingress router, the inbound envelope routing, handle_completion, handle_completion_failed, deadline expiry.
  • bytesandbrains/bb-runtime/src/engine/invoke.rs: invoke_one, function-call splice, role-dispatcher registry, backend-subgraph dispatch.
  • bytesandbrains/bb-runtime/src/engine/step.rs: the EngineStep enum and every observable variant.
  • bytesandbrains/bb-runtime/src/engine/pending_async.rs: PendingAsync and ExecutionState.
  • bytesandbrains/bb-runtime/src/engine/call_context.rs: the CallContext that backs bootstrap and FunctionCall dispatch.
  • bytesandbrains/bb-runtime/src/ingress.rs: IngressQueue, IngressEvent, IngressQueueRef.
  • bytesandbrains/bb-runtime/src/completion.rs: CompletionHandle, CompletionSink, ContractResponse.
  • bytesandbrains/bb-runtime/src/node/mod.rs: the Node struct, poll, bootstrap, deliver_inbound, deliver_event, invoke, snapshot, restore, the introspection surface.
  • bytesandbrains/bb-runtime/src/node/config.rs: NodeConfig, the defaults, the builder methods.
  • bytesandbrains/bb-runtime/src/framework/peer_state.rs, peer_governor.rs, backoff_table.rs: the PeerState bundle, BlockReason, PeerHealth, LifecycleTransition, DEFAULT_FAILURE_THRESHOLD, the backoff schedule defaults.
  • bytesandbrains/bb-runtime/src/framework/rtt_tracker.rs: RttEma (α = 1/8, β = 1/4), PhiAccrualState, PhiState, the five-tier fallback walk, the global-prior learning rate.
  • bytesandbrains/bb-runtime/src/framework/outbound_queue.rs: the head-eviction FIFO and the OutboundDropped counter.
  • bytesandbrains/src/install.rs: bb::install, Config, InstallError. The dedup walk lives at bytesandbrains/src/install.rs:524-571; the per-target FunctionKey queue lands on the engine via install_targets at bytesandbrains/src/install.rs:469-500.
  • bytesandbrains/examples/custom_index_hnsw.rs: the end-to-end worker-thread pattern with async Contract dispatch.
  • bytesandbrains/docs/ENGINE.md, bytesandbrains/docs/CONTRACT_DISPATCH.md: the bb-private architecture specs.