Authoring Modules and Components

Every bytesandbrains program walks the same path: author a Module, compile it against the concrete components it needs, install the compiled result, then poll the resulting Node. Chapter 3 covered the recording surface that the author writes into, and Chapter 4 covered the canonical syscalls a recording leans on. This chapter covers what the author actually writes. By the end you will know how to define a Module, declare its ports, implement a Contract trait, register a concrete component with the inventory, and reach for register_op! or register_protocol! when the Contract surface does not fit.

The three-phase pipeline

Every program travels through three phases against the same ModelProto:

Author: write a Module. The Module::build() default method walks body() (and bootstrap() if overridden) through a Graph recorder and returns one pre-compile ModelProto.
Compile: hand the proto to Compiler::new().bind_<role>::<T>("slot"), chain one bind_* call per slot, and call .compile(model). The compiler runs its pass pipeline, stamps the compilation passport and binding table onto the proto, and returns one compiled ModelProto.
Install: hand the compiled proto to install(peer_id, addresses, model, targets: &[&str], Config). The installer verifies the passport, parses the binding table per target, dedupes shared slot bindings across targets, constructs each bound concrete via the inventory exactly once, and returns a Node ready to poll(). Single-target installs pass &["MyModule"]; peers hosting multiple partitions (Client + Server on the same Node) pass &["Client", "Server"] and the install path shares one ComponentRef per slot the targets jointly declare.

Each phase reads and writes the same ModelProto type. There is no intermediate wrapper, no separate codec layer, no second IR. The output of one phase is the input of the next.
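To make the hand-off concrete, here is a toy model of the three phases written against the standard library only. Every name in it (ToyModel, author, compile, install, the passport metadata key) is a stand-in invented for illustration and is not the bytesandbrains API; the only point it makes is that one type flows through all three phases and that install refuses a model the compile phase did not stamp.

```rust
use std::collections::BTreeMap;

/// Stand-in for ModelProto: one type carried through every phase.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct ToyModel {
    pub functions: Vec<String>,
    pub metadata: BTreeMap<String, String>,
}

/// Author phase: record a module into a pre-compile model.
pub fn author(module_name: &str) -> ToyModel {
    ToyModel {
        functions: vec![module_name.to_string()],
        metadata: BTreeMap::new(),
    }
}

/// Compile phase: stamp one binding entry per role, then a passport.
pub fn compile(mut model: ToyModel, bindings: &[(&str, &str)]) -> Result<ToyModel, String> {
    if model.functions.is_empty() {
        return Err("nothing to compile".to_string());
    }
    for (role, concrete) in bindings {
        model
            .metadata
            .insert(format!("binding.{role}"), concrete.to_string());
    }
    model
        .metadata
        .insert("passport".to_string(), "compiled".to_string());
    Ok(model)
}

/// Install phase: refuse anything the compile phase did not stamp.
/// Returns how many bindings the toy node would construct.
pub fn install(model: &ToyModel, target: &str) -> Result<usize, String> {
    if model.metadata.get("passport").map(String::as_str) != Some("compiled") {
        return Err("missing passport".to_string());
    }
    if !model.functions.iter().any(|f| f == target) {
        return Err(format!("unknown target {target}"));
    }
    Ok(model
        .metadata
        .keys()
        .filter(|k| k.starts_with("binding."))
        .count())
}
```

Calling install on the output of author fails on the missing passport, while the output of compile installs cleanly. The real installer does considerably more (binding-table parsing, dedup, construction through the inventory), so treat this as a shape sketch only.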

The shortest end-to-end skeleton:

// from bytesandbrains/examples/component_with_dependency.rs:148-176
let app = CountingApp { index: IndexSlot };
let model = app.build()?;

let compiled = Compiler::new()
    .bind_index::<CountingIndex>("primary_index")
    .bind_backend::<CpuBackend>("compute")
    .compile(model)?;

let target = compiled.functions[0].name.clone();
let mut node = install(
    PeerId::from(1u64),
    vec![Address::empty()],
    compiled,
    &[target.as_str()],
    Config::new(),
)?;

use bytesandbrains::prelude::*; is the canonical import bag for this skeleton. It pulls in Module, Graph, Output, Compiler, install, Config, Node, IngressEvent, the identity types (PeerId, Address), every Contract trait and matching derive (Index, Backend, Aggregator, Model, Codec, DataSource, PeerSelector, Concrete), plus CompletionHandle and ContractResponse. Reach into bytesandbrains::placeholders, bytesandbrains::ops, bytesandbrains::contracts, or bytesandbrains::runtime for surfaces the prelude omits (slot unit-structs like IndexSlot, the shipped CpuBackend, the internal RuntimeResourceRef, etc.).

The rest of this chapter expands each step.

Multi-role on one Node

A peer hosting both Client and Server partitions from one compile passes both names in install order:

// from bytesandbrains/examples/single_node_federated_learning.rs:343-349
let mut node = install(
    peer,
    addrs,
    compiled,
    &[client_target.as_str(), server_target.as_str()],
    Config::new(),
)?;

Mechanics:

The compiler emitted Client and Server as sibling partitions of the same ModelProto; both stamp binding.<target>.backend = "Backend|CpuBackend|<slot_id>" and binding.<target>.aggregator = "Aggregator|FedAvg|<slot_id>" so the install path’s dedup walk converges on one CpuBackend instance plus one FedAvg instance shared by both partitions (bytesandbrains/src/install.rs:267-324,524-571).
Bootstrap functions queue in slice order. Install records every module_phase = "bootstrap" FunctionProto on BootstrapState::install_order without arming the queue; the host calls node.run_bootstrap(BootstrapTarget::All) (or BootstrapTarget::ModuleNames(&["Client", "Server"]) when the host wants to drive specific targets with no staged inputs, or BootstrapTarget::ModuleRequests(&[BootstrapRequest]) when targets declare input formals) to drive every queued bootstrap to quiescence in slice order. Client’s bootstrap fires first, runs to quiescence, emits its BootstrapComplete step, then Server’s bootstrap fires. Single-target installs are the length-1 case of the same path.
Each target registers as its own Node::module_index entry, so node.deliver_event("Client", ...) and node.deliver_event("Server", ...) route to different entry-point graphs. Top-level outputs surface as EngineStep::AppEvent { module_name: "Client" | "Server", topic, value_bytes } so the host distinguishes which partition produced an event.
Targets that need distinct concrete instances for the same role (one FedAvg for Client, a separate FedAvg for Server) wire two slot names at compile time (bind_aggregator::<FedAvg>("client_agg") plus bind_aggregator::<FedAvg>("server_agg")). Two slots, two ComponentRefs, no sharing.

The slice form covers the single-target case verbatim. A length-1 slice produces a length-1 BootstrapState::install_order, one entry in Node::module_index, and unchanged observable behaviour. Because the framework is pre-1.0, there is no install_single compatibility shim: every caller passes the slice form.
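The dedup walk described in the mechanics above can be sketched with plain collections. This is a minimal model, assuming the binding value is exactly three pipe-separated fields as in the strings quoted earlier; parse_binding and dedup_bindings are hypothetical helpers, and integer ids stand in for ComponentRefs.

```rust
use std::collections::BTreeMap;

/// Parse a "Role|Concrete|slot_id" binding value into its three parts.
fn parse_binding(value: &str) -> Option<(&str, &str, &str)> {
    let mut parts = value.split('|');
    let role = parts.next()?;
    let concrete = parts.next()?;
    let slot_id = parts.next()?;
    if parts.next().is_some() {
        return None; // more than three fields
    }
    Some((role, concrete, slot_id))
}

/// Walk each target's bindings and assign one instance id per unique
/// (role, concrete, slot_id) triple. Returns the per-target ids and
/// the number of instances that would be constructed.
pub fn dedup_bindings(
    table: &[(&str, &[&str])],
) -> Option<(BTreeMap<String, Vec<usize>>, usize)> {
    let mut instances: BTreeMap<(String, String, String), usize> = BTreeMap::new();
    let mut per_target = BTreeMap::new();
    for (target, bindings) in table {
        let mut ids = Vec::new();
        for binding in bindings.iter() {
            let (role, concrete, slot) = parse_binding(binding)?;
            let key = (role.to_string(), concrete.to_string(), slot.to_string());
            // A triple seen before reuses its id; a new one gets the next id.
            let next_id = instances.len();
            ids.push(*instances.entry(key).or_insert(next_id));
        }
        per_target.insert(target.to_string(), ids);
    }
    let constructed = instances.len();
    Some((per_target, constructed))
}
```

Two targets that list the same two triples end up with identical id lists and a construction count of two. Giving each target its own slot id for the same concrete type yields two distinct ids, which is the toy analogue of the client_agg and server_agg case.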

The Module trait

Module is the user-authored unit of composition. Implement two methods (name, body) and accept the framework’s defaults for the rest. The trait lives in bb_dsl::module and is re-exported as bytesandbrains::Module.

// from bytesandbrains/bb-dsl/src/module.rs:117-152
pub trait Module {
    /// Short stable identifier — becomes `FunctionProto.name`.
    fn name(&self) -> &str;

    /// User-implemented recording logic. Declare inputs via
    /// `g.input("name")`; emit outputs via `g.output(name, value)`
    /// or `g.net_out(name, peers, value)`. Compose child Modules
    /// via `self.child.call().input(...).build(g).output(...)`.
    fn body(&self, g: &mut Graph);

    /// Setup-phase recording. Defaults to a no-op so Modules with
    /// no setup take the empty body. Authors override to seed the
    /// address book, mint long-lived constants, allocate ingress
    /// queues, or schedule the first timer.
    fn bootstrap(&self, _g: &mut Graph) {}

    // ...
}

name() returns a stable string the compiler uses as FunctionProto.name. Two Modules with the same name() share a function entry, so the string is the composition identity, not a display label.

body() is the recording entry point. It receives &mut Graph and records DSL calls into it. The body returns nothing. Outputs leave the body through g.output("name", value) for local sinks or g.net_out("name", peers, value) for network sinks. Inputs enter through g.input("name").

bootstrap() defaults to an empty body. Override it when the Module needs setup work that must run once before the first body poll. Typical bootstrap work: seeding the address book, minting long-lived constants, registering an inbound port for asynchronous arrivals, scheduling a first timer. Install does NOT auto-fire the bootstrap. The host kicks it explicitly via Node::run_bootstrap(BootstrapTarget::All) (no formals, install-order), Node::run_bootstrap(BootstrapTarget::ModuleNames(&["X", "Y"])) (named targets, no inputs), or Node::run_bootstrap(BootstrapTarget::ModuleRequests(&[BootstrapRequest])) when the override declares inputs via g.input(name). The per-component is_op_locked gate parks body ops whose touched ComponentRef falls inside the in-flight bootstrap’s touch set until the bootstrap drains; disjoint components keep firing. See The Engine for the BootstrapState architecture and BootstrapRequest validation.

The federated learning example shows the canonical body shape:

// from bytesandbrains/examples/federated_learning.rs:114-141
struct ClientLogic;
impl Module for ClientLogic {
    fn name(&self) -> &str {
        "ClientLogic"
    }
    fn body(&self, g: &mut Graph) {
        // Inbound from the server: the latest global model params.
        let server_params = g.input("server_params");

        // Apply the global params to the local model.
        let _ = ModelSlot.load_parameters(g, server_params);

        // Local training step.
        let (batch, _labels) = DataLoaderSlot.next_batch(g);
        let _prediction = ModelSlot.forward(g, batch);

        // Read the updated params and ship them to the server peer.
        let updated_params = ModelSlot.params(g);
        let server_peer = g.input("server_peer");
        g.net_out("updated_params", server_peer, updated_params);
    }
}

Module::build() is the framework-provided method that turns the trait impl into a ModelProto. It records body into the proto’s functions[0] stamped with the canonical module_phase = "body" key, and records bootstrap into a sibling <Name>__bootstrap function stamped with module_phase = "bootstrap" when the author overrode it. Sub-Modules reached during recording become entries in functions[1..].
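The relationship between the two required methods, the defaulted bootstrap, and the provided build can be sketched with a toy trait. ToyModule, ToyGraph, and ToyFunction are illustrative stand-ins, not the bb_dsl types. One simplification is worth flagging: this sketch treats a bootstrap recording that produced no ops as "not overridden", which is an assumption of the toy and not a statement about how the real build detects an override.

```rust
/// Stand-in recorder: collects op names in recording order.
#[derive(Default)]
pub struct ToyGraph {
    pub ops: Vec<String>,
}

impl ToyGraph {
    pub fn record(&mut self, op: &str) {
        self.ops.push(op.to_string());
    }
}

/// One recorded function: (name, module_phase, ops).
pub type ToyFunction = (String, &'static str, Vec<String>);

pub trait ToyModule {
    fn name(&self) -> &str;
    fn body(&self, g: &mut ToyGraph);

    /// No-op by default, so modules without setup write nothing.
    fn bootstrap(&self, _g: &mut ToyGraph) {}

    /// Record body into entry 0; add a `<Name>__bootstrap` sibling
    /// only when the bootstrap recording produced ops (toy rule).
    fn build(&self) -> Vec<ToyFunction> {
        let mut body = ToyGraph::default();
        self.body(&mut body);
        let mut functions = vec![(self.name().to_string(), "body", body.ops)];

        let mut boot = ToyGraph::default();
        self.bootstrap(&mut boot);
        if !boot.ops.is_empty() {
            functions.push((
                format!("{}__bootstrap", self.name()),
                "bootstrap",
                boot.ops,
            ));
        }
        functions
    }
}

pub struct Plain;
impl ToyModule for Plain {
    fn name(&self) -> &str {
        "Plain"
    }
    fn body(&self, g: &mut ToyGraph) {
        g.record("forward");
    }
}

pub struct Seeded;
impl ToyModule for Seeded {
    fn name(&self) -> &str {
        "Seeded"
    }
    fn body(&self, g: &mut ToyGraph) {
        g.record("forward");
    }
    fn bootstrap(&self, g: &mut ToyGraph) {
        g.record("seed_address_book");
    }
}
```

Plain.build() yields a single body function, while Seeded.build() yields the body plus a Seeded__bootstrap sibling.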

Composing sub-Modules with ModuleCall

A parent Module embeds a child Module as a field and inlines the child’s body into its own recording through the call() builder. The fluent surface is three methods deep: call() opens a ModuleCall, .input(name, handle) binds a named port to a value the parent produced, and .build(g) records the child’s body into the parent’s graph and returns a ModuleOutputs handle the parent pulls each declared output from by name.

// from bytesandbrains/bb-dsl/src/module.rs:41-83
pub struct ModuleCall<'a, M: ?Sized + Module> {
    module: &'a M,
    bound_inputs: std::vec::Vec<(&'static str, crate::output::Output)>,
}

impl<M: ?Sized + Module> ModuleCall<'_, M> {
    pub fn input(mut self, name: &'static str, handle: crate::output::Output) -> Self { /* ... */ }
    pub fn build(self, g: &mut crate::graph::Graph) -> ModuleOutputs<'_> { /* ... */ }
    pub fn bootstrap(self, g: &mut crate::graph::Graph) -> ModuleOutputs<'_> { /* ... */ }
}

The default Module::call opens an empty ModuleCall borrowing &self. .input(name, handle) is a by-value builder method: it takes self, appends one binding, and returns Self, so a chain stacks any number of inputs before the terminal .build(g).

.build(g) is the body-side seam. The child’s body() runs against the parent’s graph; the child’s g.input("name") calls inside that recording resolve against the bound inputs from the parent’s .input(name, handle) chain. Outputs the child registered through g.output("name", value) or g.net_out("name", peers, value) surface as ModuleOutputs::output("name") handles the parent then wires into its own downstream ops. The parent and child share the graph’s frontier, so independent branches inside the child’s body fire as soon as their own inputs are ready instead of blocking on a single CALL barrier.

.bootstrap(g) is the parallel seam for the bootstrap recording. Inside a parent’s bootstrap() override, a self.child.call().bootstrap(g) call records the child’s <Name>__bootstrap as a sibling function and emits a CALL NodeProto. The engine routes the call through the standard FunctionCall path, allocates a fresh ExecId for the child’s body, and gates parent body-phase ops until the child’s bootstrap descendants drain.

Two parent Modules embedding the same child share that child’s recorded body. The framework keys composition by name(): identical name() strings collapse to one FunctionProto in the emitted ModelProto. Compose the same Sum Module under two parents and the emitted ModelProto carries functions[0] = Parent, functions[1] = Sum (one entry, not two), functions[2] = OtherParent.
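The builder chain and the name-keyed collapse can be sketched together. ToyCall is a hypothetical miniature with the same mut self -> Self shape as the input signature shown above; a BTreeMap keyed by child name stands in for the function table, and the toy "body" just sums its bound inputs so there is something observable to return.

```rust
use std::collections::BTreeMap;

/// Toy by-value builder mirroring the call().input(..).build(..) chain.
pub struct ToyCall<'a> {
    child: &'a str,
    bound_inputs: Vec<(&'static str, i64)>,
}

impl<'a> ToyCall<'a> {
    pub fn new(child: &'a str) -> Self {
        ToyCall {
            child,
            bound_inputs: Vec::new(),
        }
    }

    /// Takes the builder by value, appends one binding, hands it back.
    pub fn input(mut self, name: &'static str, value: i64) -> Self {
        self.bound_inputs.push((name, value));
        self
    }

    /// Record the child into `functions`, keyed by name, so a second
    /// call to the same child reuses the existing entry. The map value
    /// counts call sites; the return value is the toy body's result.
    pub fn build(self, functions: &mut BTreeMap<String, usize>) -> i64 {
        *functions.entry(self.child.to_string()).or_insert(0) += 1;
        self.bound_inputs.iter().map(|(_, v)| v).sum()
    }
}
```

Building "Sum" from two different call sites leaves one entry in the table with a call-site count of two, which mirrors the "one entry, not two" behaviour described above.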

The multi-target network example wires three leaf Modules into a single parent that hands one’s output as another’s input:

// from bytesandbrains/examples/multi_target_network.rs:106-138
fn body(&self, g: &mut Graph) {
    let sink_peers = g.input("sink_peers");

    // LoaderLeaf has no inputs, exposes `batch` as an output.
    let loader_outs = self.loader.call().build(g);
    let batch = loader_outs.output("batch");

    // TrainerLeaf consumes `batch`, exposes `prediction`.
    let trainer_outs = self.trainer.call().input("batch", batch).build(g);
    let prediction = trainer_outs.output("prediction");

    // Ship the prediction over the wire.
    g.net_out("prediction_port", sink_peers, prediction);

    // SinkLeaf consumes the inbound payload + a typed metadata channel.
    let received = g.lookup_output("prediction_port").expect("net_out port");
    let _ = self
        .sink
        .call()
        .input("contribution", received.clone())
        .input("metadata", received)
        .build(g);
}

The parent’s job is wiring. Each call().build(g) line records one child’s body in place. Each .input(name, handle) line binds one of the child’s g.input("name") lookups to a value the parent already produced. The leaves stay focused on their own DSL surface; the composition lives in one place.

Declaring ports through the recorded body

A Module’s port set is whatever its body() recording touches. There is no struct-attribute declaration step. The four port verbs the body can use are g.input(name) (an input the parent or host binds), g.output(name, value) (a local sink the parent or host reads), g.lookup_output(name) (a back-reference to a port the body itself already emitted), and g.net_out(name, peers, value) (a network sink that ships the value over the wire to the given peers).

The compiler infers the port set from the recorded body when Module::build() walks it: g.input calls become input ports, g.output calls become local output ports, and g.net_out calls become network output ports. The verb is what decides the kind. The compiler’s TypeSolver then infers types by walking the ops connected to each port; the network-boundary verification pass refuses any program where a wire op sits outside a net_out port.

The same InferenceCell shape from the federated-learning examples, recorded:

struct InferenceCell;
impl Module for InferenceCell {
    fn name(&self) -> &str { "InferenceCell" }
    fn body(&self, g: &mut Graph) {
        let query = g.input("query");
        let incoming_grad = g.input("incoming_grad");
        let response = compute_response(g, query, incoming_grad);
        g.output("response", response);
        let prediction = compute_prediction(g, query);
        let sink_peers = g.input("sink_peers");
        g.net_out("ship_pred", sink_peers, prediction);
    }
}

The body declares five ports without naming them as attributes: three inputs (query, incoming_grad, sink_peers), one local output (response), and one network output (ship_pred). g.net_out is what makes the ship_pred port a network port; swap it for g.output and the same name becomes a local port. The Module impl is the only authoring surface.
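To see how the verb alone decides the port kind, the sketch below replays the port verbs of the InferenceCell body into a toy recorder and reads the ports back by kind. PortRecorder and PortKind are invented for this illustration; the real compiler infers the port set from the recorded graph and also solves types, which this sketch ignores.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PortKind {
    Input,
    LocalOutput,
    NetworkOutput,
}

#[derive(Default)]
pub struct PortRecorder {
    ports: Vec<(String, PortKind)>,
}

impl PortRecorder {
    fn declare(&mut self, name: &str, kind: PortKind) {
        // Re-declaring the same (name, kind) pair is a no-op in this toy.
        if !self.ports.iter().any(|(n, k)| n == name && *k == kind) {
            self.ports.push((name.to_string(), kind));
        }
    }

    pub fn input(&mut self, name: &str) {
        self.declare(name, PortKind::Input);
    }

    pub fn output(&mut self, name: &str) {
        self.declare(name, PortKind::LocalOutput);
    }

    pub fn net_out(&mut self, name: &str) {
        self.declare(name, PortKind::NetworkOutput);
    }

    /// Port names of one kind, in declaration order.
    pub fn names(&self, kind: PortKind) -> Vec<&str> {
        self.ports
            .iter()
            .filter(|(_, k)| *k == kind)
            .map(|(n, _)| n.as_str())
            .collect()
    }
}

/// Replays the port verbs of the InferenceCell body shown above.
pub fn inference_cell_ports() -> PortRecorder {
    let mut g = PortRecorder::default();
    g.input("query");
    g.input("incoming_grad");
    g.output("response");
    g.input("sink_peers");
    g.net_out("ship_pred");
    g
}
```

Reading the recorder back by kind lists query, incoming_grad, and sink_peers as inputs, response as the local output, and ship_pred as the network output.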

Implementing a Contract trait

Concrete components implement one or more user-facing Contract traits from bytesandbrains::contracts. The seven method-style Contracts are Index, Aggregator, Model, Codec, DataSource, PeerSelector, and Backend. Each declares one method per atomic op the role defines.

Memory boundaries

Components never see allocation failures from wire or app ingress. The framework’s boundary callers (Engine::decode_typed_fill, Node::deliver_event, Node::invoke, CompletionSink::complete) cap, charge against NodeConfig::ingress_byte_budget, and fallibly reserve framework-owned storage BEFORE handing payloads to a Contract method. An allocation failure surfaces as InfraEvent::WireReceiveError::AllocationFailed (wire) or InfraEvent::AppIngressError::AllocationFailed (app) on the bus. The offending bytes drop at the boundary and the Contract method never runs. See the engine ingress boundaries section for the full contract.

Inside a Contract method, normal Rust allocation rules apply: a Vec::push that runs out of memory aborts the process, and the framework does not intercept it. Components that need graceful degradation under memory pressure must handle that inside their own implementation.

Backend authors own tensor materialization budget. Wire bytes land inside Backend::materialize_from_wire(type_hash, bytes: Vec<u8>) (bb-runtime/src/contracts/backend.rs:497). The framework has already charged bytes.len() against the ingress byte budget and moved ownership of the Vec<u8> into the call. The backend chooses the materialization strategy: zero-copy adoption via ArrayD::from_shape_vec, pool-pulled buffer with copy in, or fresh-allocate. Returning Err drops the fill, releases the charge, and emits WireReceiveError::BackendMaterializeFailed. See the bb::Backend backend-owned tensor memory subsection for the lifecycle.
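The charge, reserve, hand-off, and release sequence can be modelled in a few lines. IngressBudget, IngressError, and materialize below are hypothetical stand-ins and not the framework's types; the one real API the sketch leans on is Vec::try_reserve_exact from the standard library, which reports allocation failure as an error where a plain reserve would abort.

```rust
/// Hypothetical byte budget: charge first, then reserve storage.
pub struct IngressBudget {
    remaining: usize,
}

#[derive(Debug, PartialEq)]
pub enum IngressError {
    OverBudget,
    AllocationFailed,
}

impl IngressBudget {
    pub fn new(limit: usize) -> Self {
        IngressBudget { remaining: limit }
    }

    pub fn remaining(&self) -> usize {
        self.remaining
    }

    /// Charge the budget, then fallibly reserve owned storage and copy
    /// the payload in. On allocation failure the charge is refunded and
    /// the payload is dropped; no component code runs.
    pub fn admit(&mut self, payload: &[u8]) -> Result<Vec<u8>, IngressError> {
        if payload.len() > self.remaining {
            return Err(IngressError::OverBudget);
        }
        self.remaining -= payload.len();
        let mut owned: Vec<u8> = Vec::new();
        if owned.try_reserve_exact(payload.len()).is_err() {
            self.remaining += payload.len();
            return Err(IngressError::AllocationFailed);
        }
        owned.extend_from_slice(payload);
        Ok(owned)
    }

    /// Release a charge once the payload was consumed or rejected.
    pub fn release(&mut self, bytes: usize) {
        self.remaining += bytes;
    }
}

/// Toy backend hook: takes ownership of the bytes and may refuse them.
pub fn materialize(bytes: Vec<u8>) -> Result<Vec<f32>, String> {
    if bytes.len() % 4 != 0 {
        return Err("length is not a multiple of 4".to_string());
    }
    Ok(bytes
        .chunks_exact(4)
        .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect())
}
```

The sketch copies into a fresh buffer for simplicity; a zero-copy or pool-backed strategy would replace the body of materialize without changing the ownership hand-off.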

Past self, each Contract method takes three kinds of argument: ctx (the per-dispatch runtime resource handle), the typed inputs, and a CompletionHandle<R, E>. Each method returns ContractResponse<R, E>:

// from bytesandbrains/bb-runtime/src/completion.rs:71-76
pub enum ContractResponse<R, E> {
    /// Result is ready inline. The `CompletionHandle` passed to the
    /// method was NOT used; drop it.
    Now(Result<R, E>),
    /// Implementation retained the handle for off-thread completion.
    Later,
}

The framework bridge consumes the return:

Now(Ok(value)) becomes DispatchResult::Immediate(serialize(value)). The engine skips the park-and-ingress cycle and proceeds.
Now(Err(e)) becomes the dispatch error, propagated through the engine’s typed error path.
Later becomes DispatchResult::Async(handle.cmd_id()). The engine parks the dispatched op until the impl calls handle.complete(result) from off-thread (a worker thread, a Tokio task, a remote RPC).
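The three-way mapping above is a single match. The sketch below uses stand-in types (ToyResponse, ToyDispatch, bridge) and fixes the result type to u64 with little-endian bytes as the "serialization"; both choices are assumptions made to keep the example self-contained, and cmd_id stands in for the id the completion handle would carry.

```rust
/// Mirrors the two-variant response shape shown above.
pub enum ToyResponse<R, E> {
    Now(Result<R, E>),
    Later,
}

#[derive(Debug, PartialEq)]
pub enum ToyDispatch {
    /// Inline result, already serialized.
    Immediate(Vec<u8>),
    /// The op is parked until the command with this id completes.
    Async(u64),
}

/// Bridge a response into a dispatch result.
pub fn bridge<E>(response: ToyResponse<u64, E>, cmd_id: u64) -> Result<ToyDispatch, E> {
    match response {
        // Ready inline: serialize and skip the park-and-ingress cycle.
        ToyResponse::Now(Ok(value)) => {
            Ok(ToyDispatch::Immediate(value.to_le_bytes().to_vec()))
        }
        // Inline failure: surface it as the dispatch error.
        ToyResponse::Now(Err(e)) => Err(e),
        // Handle retained elsewhere: park under the command id.
        ToyResponse::Later => Ok(ToyDispatch::Async(cmd_id)),
    }
}
```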

The HNSW worker example wires every Contract method through Later:

// from bytesandbrains/examples/custom_index_hnsw.rs:193-234
impl Index for HnswIndex {
    type Vector = [f32];
    type Error = HnswError;

    fn add(
        &mut self,
        _ctx: &mut bytesandbrains::runtime::RuntimeResourceRef<'_>,
        vec: &Self::Vector,
        completion: CompletionHandle<u64, Self::Error>,
    ) -> ContractResponse<u64, Self::Error> {
        self.send(WorkItem::Add {
            vec: vec.to_vec(),
            completion,
        });
        ContractResponse::Later
    }

    fn search(
        &self,
        _ctx: &mut bytesandbrains::runtime::RuntimeResourceRef<'_>,
        query: &Self::Vector,
        k: u32,
        completion: CompletionHandle<Vec<(u64, f32)>, Self::Error>,
    ) -> ContractResponse<Vec<(u64, f32)>, Self::Error> {
        self.send(WorkItem::Search {
            query: query.to_vec(),
            k,
            completion,
        });
        ContractResponse::Later
    }

    fn remove(
        &mut self,
        _ctx: &mut bytesandbrains::runtime::RuntimeResourceRef<'_>,
        _id: u64,
        completion: CompletionHandle<(), Self::Error>,
    ) -> ContractResponse<(), Self::Error> {
        self.send(WorkItem::Remove { completion });
        ContractResponse::Later
    }
}

Each method ships the typed handle to the worker thread, returns Later, and the worker calls handle.complete(result) once the work finishes. The next call to node.poll() drains the ingress queue and unparks the suspended op with the result.
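The park, complete, and poll cycle can be reproduced with a standard-library channel. ToyNode and ToyHandle are illustrative only; the assumption here is that a completion is just a (cmd_id, result) pair pushed onto an ingress queue that poll drains, which matches the description above in spirit but not necessarily in mechanism.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::thread;

/// Toy completion handle: owns a command id and a sender into the
/// node's ingress queue. Completing consumes the handle.
pub struct ToyHandle {
    cmd_id: u64,
    ingress: Sender<(u64, u64)>,
}

impl ToyHandle {
    pub fn complete(self, result: u64) {
        // Ignore send errors: the node may already be gone.
        let _ = self.ingress.send((self.cmd_id, result));
    }
}

pub struct ToyNode {
    ingress_tx: Sender<(u64, u64)>,
    ingress_rx: Receiver<(u64, u64)>,
    parked: Vec<u64>,
    next_cmd: u64,
}

impl ToyNode {
    pub fn new() -> Self {
        let (ingress_tx, ingress_rx) = mpsc::channel();
        ToyNode {
            ingress_tx,
            ingress_rx,
            parked: Vec::new(),
            next_cmd: 0,
        }
    }

    /// Dispatch an op that answers "Later": park it and move the
    /// handle to a worker thread, which doubles the input.
    pub fn dispatch_later(&mut self, input: u64) -> thread::JoinHandle<()> {
        let handle = ToyHandle {
            cmd_id: self.next_cmd,
            ingress: self.ingress_tx.clone(),
        };
        self.parked.push(self.next_cmd);
        self.next_cmd += 1;
        thread::spawn(move || handle.complete(input * 2))
    }

    /// Drain the ingress queue and unpark every op whose result landed.
    pub fn poll(&mut self) -> Vec<(u64, u64)> {
        let mut finished = Vec::new();
        while let Ok((cmd_id, result)) = self.ingress_rx.try_recv() {
            self.parked.retain(|id| *id != cmd_id);
            finished.push((cmd_id, result));
        }
        finished
    }

    pub fn parked(&self) -> usize {
        self.parked.len()
    }
}
```

Because complete takes self by value, a handle can be completed at most once, which is the property that makes shipping it to another thread safe to reason about.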

The Vector and Error associated types let the impl pick its storage layout and error vocabulary at the role surface. Chapter 8 covers how Vector: Storage slots the impl into the type tree.

Authoring a trainable Index

An IVF or PQ index needs a calibration pass before the first add or search. The Index Contract carries an optional train method that defaults to Now(Ok(())) so flat indexes pay nothing for it. A trainable impl overrides the default:

// from bytesandbrains/tests/index_train_lifecycle.rs:204-249
#[derive(Default, Clone, serde::Serialize, serde::Deserialize, Concrete, Index)]
struct CountingTrainIndex {
    capacity: u32,
}

impl bytesandbrains::Index for CountingTrainIndex {
    type Vector = [f32];
    type Error = std::convert::Infallible;

    fn add(/* ... */) -> ContractResponse<u64, Self::Error> {
        ContractResponse::Now(Ok(0))
    }
    fn search(/* ... */) -> ContractResponse<Vec<(u64, f32)>, Self::Error> {
        ContractResponse::Now(Ok(vec![]))
    }
    fn remove(/* ... */) -> ContractResponse<(), Self::Error> {
        ContractResponse::Now(Ok(()))
    }

    fn train(
        &mut self,
        _ctx: &mut RuntimeResourceRef<'_>,
        _samples: &[&[f32]],
        _c: CompletionHandle<(), Self::Error>,
    ) -> ContractResponse<(), Self::Error> {
        // IVF: run k-means on _samples, stash centroids on `self`.
        // PQ: per-subspace k-means, stash M codebooks on `self`.
        ContractResponse::Now(Ok(()))
    }
}

The #[derive(bb::Index)] bridge emits the Train op in the runtime’s atomic_opset() automatically. The Module body records a calibration call via IndexSlot::train(g, samples). The recorded NodeProto rides under ai.bytesandbrains.role.index with a TYPE_TRIGGER output; placing the call in Module::bootstrap gates the body frontier on training completion before any add/search fires. The same pattern applies to a trainable Codec: override fn train on the Codec Contract impl, record a calibration via CodecSlot::train(g, samples), and bootstrap-gate downstream Encode/Decode calls on the returned trigger.
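The "optional method with a free default" shape is ordinary Rust trait design. A minimal sketch, assuming a simplified synchronous trait (ToyIndex) in place of the real Contract and a one-centroid mean in place of k-means:

```rust
pub trait ToyIndex {
    fn add(&mut self, vector: &[f32]) -> Result<u64, String>;

    /// Default: nothing to calibrate, succeed immediately.
    fn train(&mut self, _samples: &[&[f32]]) -> Result<(), String> {
        Ok(())
    }
}

/// Flat index: accepts the default `train`.
#[derive(Default)]
pub struct FlatIndex {
    pub len: u64,
}

impl ToyIndex for FlatIndex {
    fn add(&mut self, _vector: &[f32]) -> Result<u64, String> {
        self.len += 1;
        Ok(self.len - 1)
    }
}

/// Trainable index: refuses `add` until `train` stored a centroid.
#[derive(Default)]
pub struct CentroidIndex {
    pub centroid: Option<Vec<f32>>,
    pub len: u64,
}

impl ToyIndex for CentroidIndex {
    fn add(&mut self, _vector: &[f32]) -> Result<u64, String> {
        if self.centroid.is_none() {
            return Err("index is not trained".to_string());
        }
        self.len += 1;
        Ok(self.len - 1)
    }

    fn train(&mut self, samples: &[&[f32]]) -> Result<(), String> {
        let first = samples.first().ok_or("no samples")?;
        let dim = first.len();
        if samples.iter().any(|s| s.len() != dim) {
            return Err("samples differ in dimension".to_string());
        }
        // Mean of the samples: a one-centroid stand-in for k-means.
        let mut mean = vec![0.0f32; dim];
        for sample in samples {
            for (m, x) in mean.iter_mut().zip(sample.iter()) {
                *m += x / samples.len() as f32;
            }
        }
        self.centroid = Some(mean);
        Ok(())
    }
}
```

In this toy the ordering is enforced by a runtime error from add; in the framework the ordering comes from recording the train call in Module::bootstrap, as the paragraph above describes.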

Authoring a Component-level bootstrap

When the one-shot setup needs Rust code rather than recorded graph ops, override bb::Bootstrap alongside the primary Contract. Every #[derive(bb::Concrete)] type already participates in the Component bootstrap dispatch path via the trait’s no-op default. Override it to allocate pools, mmap state, prime calibration caches, or dial seed peers.

use bytesandbrains::contracts::bootstrap::{Bootstrap, BootstrapCtx};

#[derive(bb_derive::Concrete, bb_derive::Backend)]
#[bootstrap_override]
struct PinnedHostBackend {
    pool: HostBufferPool,
}

impl Bootstrap for PinnedHostBackend {
    type Error = AllocError;

    fn bootstrap(&mut self, _ctx: &mut BootstrapCtx) -> Result<(), AllocError> {
        // One-shot pinned-buffer pool allocation. Body-phase
        // kernels read through `self.pool`.
        self.pool.prime(/* config */)?;
        Ok(())
    }
}

#[bootstrap_override] on the struct (bytesandbrains/bb-derive/src/parse.rs:36-48) suppresses the derive’s default no-op impl so the hand-written one does not collide. The derive still emits the BootstrapDispatcherRegistration inventory entry, so install() wires the dispatcher (bytesandbrains/src/install.rs:451-466) without naming the type at the call site.

The host fires the override explicitly through the slot the binding chain bound the concrete onto:

node.run_bootstrap(BootstrapTarget::Slots(&["compute"]))?;
// or batch:
node.run_bootstrap(BootstrapTarget::Slots(&["compute", "primary_index"]))?;

Prefer Component bootstrap over Module bootstrap when:

The setup is Rust-side state (buffer pools, file handles, mmap regions, kernel caches) that no body op needs to see as a graph value.
The setup is per-instance (one Component, one initialization) rather than per-Module-target. The slot granularity matches the resource lifetime exactly.
The setup needs the broader Rust runtime (system calls, filesystem access, GPU context creation), and recording it inside Module::bootstrap would force a placeholder syscall that ultimately reaches Component code anyway.

Prefer Module bootstrap when the setup must compose with the graph: an Index::train(samples) call needs a DataSource to produce the samples, and the recording in Module::bootstrap expresses that composition naturally. Module bootstrap also wins when the setup spans several Components. Record one Module::bootstrap body that orchestrates them, rather than dispatching N Component bootstraps and re-implementing the composition outside the IR.

#[derive(Concrete)] and the inventory

A concrete component carries two derives: one for the universal component plumbing, one per role it plays.

// from bytesandbrains/examples/custom_index_hnsw.rs:66-71
#[derive(Default, Clone, Serialize, Deserialize, Concrete, Index)]
struct HnswIndex {
    capacity: u32,
    #[serde(skip)]
    tx: Arc<Mutex<Option<mpsc::Sender<WorkItem>>>>,
}

#[derive(bb::Concrete)] emits three things in one expansion:

impl ConcreteComponent for #struct_ident: sets the canonical TYPE_NAME, the Application package, an Infallible error type, and a unit Config. The default new(&()) calls Self::default(). Serialize and restore go through bincode’s serde adapters.
impl AnyComponent for #struct_ident: the erased-component plumbing the engine needs to hold a typed instance behind a trait object.
One inventory::submit!{ ConcreteComponentRegistration { ... } } block. The carrier records the type name, the package tag, the serialize and restore function pointers, the construct function pointer, and the declared dependencies slice.

#[derive(bb::<Role>)] (one per role the concrete plays) emits the bridge from the user-facing Contract trait to the engine-internal <Role>Runtime trait the engine dispatches through. The bridge generates one dispatch_atomic arm per Contract method, downcasts inputs from &dyn SlotValue, opens a typed completion handle, and calls the user’s Contract method. The derive also submits a ComponentRoleBinding and a DispatcherRegistration to the inventory so the install path can look up the dispatcher by (TYPE_NAME, role) without scanning trait impls.

A struct that needs a non-trivial Config or a fallible constructor writes impl ConcreteComponent for Self { ... } by hand instead of using the derive. The role bridge derives stay usable in that case.

The seven method-style role derives are #[derive(bb::Index)], #[derive(bb::Aggregator)], #[derive(bb::Model)], #[derive(bb::Codec)], #[derive(bb::DataSource)], #[derive(bb::PeerSelector)], and #[derive(bb::Backend)]. There is no #[derive(bb::Protocol)]: protocol authoring goes through register_protocol!{} instead.

Declaring sibling dependencies

A concrete declares the slots it reads at dispatch time through a #[depends(...)] attribute on the struct. The Concrete derive parses the attribute into ConcreteComponent::DEPENDENCIES. The compiler walks every bound component’s DEPENDENCIES slice and refuses to compile a model where a declared slot is not bound to a sibling of the right role.

// from bytesandbrains/examples/component_with_dependency.rs:47-51
#[derive(Clone, Default, Serialize, Deserialize, Concrete, Index)]
#[depends(backend = "compute")]
struct CountingIndex {
    bias: u32,
}

The role names accepted inside #[depends(...)] are the snake_case forms of the role names that appear in the derive list: index, aggregator, model, codec, data_source, peer_selector, backend, and protocol. Multiple slots stack across attributes or within one attribute. Compile-time errors surface as CompileError::UnboundDependency or CompileError::DependencyRoleMismatch.
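The compile-time check amounts to a lookup per declared dependency. The sketch below is a toy with invented names: ToyCompileError borrows the two variant names from the text, but its fields, and the representation of the binding table as a slot-to-role map, are assumptions.

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq)]
pub enum ToyCompileError {
    UnboundDependency {
        slot: String,
    },
    DependencyRoleMismatch {
        slot: String,
        expected: String,
        found: String,
    },
}

/// `dependencies` lists the (role, slot) pairs a component declared;
/// `bound` maps each bound slot name to the role bound there.
pub fn check_dependencies(
    dependencies: &[(&str, &str)],
    bound: &BTreeMap<&str, &str>,
) -> Result<(), ToyCompileError> {
    for (role, slot) in dependencies {
        match bound.get(slot) {
            // Nothing was bound under this slot name.
            None => {
                return Err(ToyCompileError::UnboundDependency {
                    slot: slot.to_string(),
                })
            }
            // Something is bound, but it plays a different role.
            Some(found) if found != role => {
                return Err(ToyCompileError::DependencyRoleMismatch {
                    slot: slot.to_string(),
                    expected: role.to_string(),
                    found: found.to_string(),
                })
            }
            Some(_) => {}
        }
    }
    Ok(())
}
```

A declaration of backend = "compute" passes when compute is bound to a backend, fails as unbound when the slot is missing, and fails as a role mismatch when the slot holds an index.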

At dispatch time, the Contract impl reaches the bound sibling through the RuntimeResourceRef passed in ctx:

// from bytesandbrains/examples/component_with_dependency.rs:88-105
fn search(
    &self,
    ctx: &mut RuntimeResourceRef<'_>,
    query: &Self::Vector,
    _k: u32,
    _c: CompletionHandle<Vec<(u64, f32)>, Self::Error>,
) -> ContractResponse<Vec<(u64, f32)>, Self::Error> {
    let backend = ctx
        .dependency::<CpuBackend>("compute")
        .expect("compiler-verified `compute` slot resolves to a CpuBackend");
    let len = query.len().max(1);
    let q = cpu_constant(backend, query.first().copied().unwrap_or(0.0), len);
    let bias = cpu_constant(backend, self.bias as f32, len);
    let _sum = backend
        .add(&q, &bias)
        .expect("CpuBackend::add on equal-shape f32");
    ContractResponse::Now(Ok(Vec::new()))
}

The .expect(...) on ctx.dependency::<T>("compute") is the intended call site: once Compiler::compile succeeds, every declared dependency is bound to a sibling of the right role, so the lookup cannot fail for a model the compiler accepted. Chapter 7 covers the dependency surface in detail.

register_op! for custom atomic ops

When a custom op does not fit a role Contract (a syscall-style side-effect, a domain-specific primitive), register the invoke function directly with bb::register_op!{}. The macro emits one inventory::submit!{ OpRegistration { ... } }. The engine consumes the registry during install and routes every NodeProto whose (domain, op_type) matches to your invoke function.

// from bytesandbrains/tests/derive_smoke.rs:76-88
fn invoke_demo_op(
    _: &NodeProto,
    _: &[(&str, &dyn SlotValue)],
    _: &mut RuntimeResourceRef<'_>,
) -> Result<DispatchResult, OpError> {
    Ok(DispatchResult::Immediate(Vec::new()))
}

bytesandbrains::register_op! {
    domain: "test.derive_smoke",
    op_type: "DemoOp",
    invoke: invoke_demo_op,
}

The invoke function takes the NodeProto the engine is dispatching, the input slot values as a slice of (&str, &dyn SlotValue) pairs, and the per-dispatch runtime resource handle. It returns DispatchResult::Immediate(outputs) for inline results or DispatchResult::Async(cmd_id) to park the op until a matching completion lands on the ingress queue.

The macro accepts three named fields. domain and op_type together form the engine’s dispatch key. invoke names a function in scope. A custom op submitted this way works without any derive on the containing struct: the function pointer goes straight to the inventory under the Custom registration kind.
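Routing by (domain, op_type) is a keyed table of function pointers. The sketch below builds that table by hand. OpRegistry, InvokeFn, and invoke_sum are invented for illustration; the real registry is populated through inventory at link time, which is not modelled here, and rejecting a duplicate key is a policy of this toy only.

```rust
use std::collections::HashMap;

/// Toy invoke signature: integer inputs in, integer outputs out.
pub type InvokeFn = fn(&[i64]) -> Result<Vec<i64>, String>;

#[derive(Default)]
pub struct OpRegistry {
    ops: HashMap<(String, String), InvokeFn>,
}

impl OpRegistry {
    /// Register one invoke function under its dispatch key.
    /// Toy policy: a duplicate key is rejected.
    pub fn register(
        &mut self,
        domain: &str,
        op_type: &str,
        invoke: InvokeFn,
    ) -> Result<(), String> {
        let key = (domain.to_string(), op_type.to_string());
        if self.ops.contains_key(&key) {
            return Err(format!("{domain}/{op_type} is already registered"));
        }
        self.ops.insert(key, invoke);
        Ok(())
    }

    /// Look up the invoke function by key and call it.
    pub fn dispatch(
        &self,
        domain: &str,
        op_type: &str,
        inputs: &[i64],
    ) -> Result<Vec<i64>, String> {
        let key = (domain.to_string(), op_type.to_string());
        match self.ops.get(&key) {
            Some(invoke) => invoke(inputs),
            None => Err(format!("no op registered for {domain}/{op_type}")),
        }
    }
}

/// Example op: sums its inputs into a single output.
fn invoke_sum(inputs: &[i64]) -> Result<Vec<i64>, String> {
    Ok(vec![inputs.iter().sum()])
}
```

Because the table stores plain function pointers, the op needs no struct and no trait impl, which is the same property the text attributes to register_op!.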

register_protocol! for the Protocol role

The Protocol role differs from the other seven: its atomic opset is shaped by the protocol, not by a fixed method-per-op Contract. There is no #[derive(bb::Protocol)]. The bb::register_protocol!{} declarative macro is the authoring surface.

// from bytesandbrains/tests/derive_smoke.rs:99-110
bytesandbrains::register_protocol! {
    #[derive(Clone, Debug, Default, Serialize, Deserialize)]
    pub struct DemoProtocol {
        pub seed: u64,
    }
    domain: "test.demo_protocol"
    version: 1
    ops {
        Ping,
        FindNode,
    }
}

The macro emits the struct, the universal triple (ConcreteComponent, AnyComponent, inventory submission), and a ProtocolRuntime impl whose atomic_opset() lists every declared op and whose dispatch_atomic returns Immediate(vec![]) for each one as a placeholder. When the per-op arms need real logic, the author skips the macro and hand-writes the ProtocolRuntime impl directly. The macro covers boilerplate; the semantics live in the author’s code.
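What "emits an opset and placeholder arms" means can be shown with a small macro_rules! of our own. toy_protocol! below is a hypothetical miniature with a simplified grammar (semicolons between clauses, a unit struct, no version field); it is not the grammar or the output of register_protocol!.

```rust
/// Toy declarative macro: emits a unit struct plus an opset and a
/// placeholder dispatcher covering every declared op.
macro_rules! toy_protocol {
    (
        struct $name:ident;
        domain: $domain:literal;
        ops { $($op:ident),* $(,)? }
    ) => {
        pub struct $name;

        impl $name {
            pub const DOMAIN: &'static str = $domain;

            /// Every op named in the `ops { .. }` block, in order.
            pub fn atomic_opset() -> Vec<&'static str> {
                vec![$(stringify!($op)),*]
            }

            /// Placeholder arm: a declared op returns empty bytes,
            /// an undeclared one returns None.
            pub fn dispatch_atomic(op: &str) -> Option<Vec<u8>> {
                if Self::atomic_opset().contains(&op) {
                    Some(Vec::new())
                } else {
                    None
                }
            }
        }
    };
}

toy_protocol! {
    struct ToyDemoProtocol;
    domain: "test.toy_protocol";
    ops { Ping, FindNode }
}
```

Once a per-op arm needs real behaviour, the generated placeholder stops being useful, which is why the text recommends a hand-written impl at that point.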

Inventory and dead-code elimination

Every concrete component, every role dispatcher, and every custom op goes through inventory::submit!. Each #[derive(bb::Concrete)], each #[derive(bb::<Role>)], each register_op!{}, and each register_protocol!{} expansion emits one or more submit! blocks. The global registry collects every submitted entry at process start.

The linker’s dead-code elimination (DCE) strips submissions whose containing crate is never referenced by a function symbol. That is the framework’s binary-size story: components in unreferenced crates pay nothing at runtime. It is also a footgun: a transitively-depended-on rlib whose entries are never touched by user code will have its submit! blocks eliminated along with the rest of its object files.

The framework anchors its own components with a single link_force() call inside install. The first thing the install path does is call bb_ops::link_force(), which black-boxes one function pointer per inventory-bearing module in bb-ops. That anchors every framework submission across the rlib boundary.

User crates that ship their own components and worry about DCE follow the same pattern: export a link_force() function that passes one function pointer per submit!-bearing module through black_box (as in black_box(f as usize)), and call it once from user code before any inventory lookup.
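A user-crate anchor along those lines might look like the sketch below. The module names and the anchor functions are hypothetical, and the modules here contain no real submit! blocks; std::hint::black_box is the standard-library function the pattern relies on. Whether a single reference per module is enough to retain every submission depends on the linker and build configuration, so treat this as the shape of the pattern and not as a guarantee.

```rust
use std::hint::black_box;

mod index_components {
    /// Any function defined in the module works as an anchor.
    pub fn anchor() {}
}

mod op_components {
    pub fn anchor() {}
}

/// Touch one function pointer per inventory-bearing module so the
/// linker sees a live reference into each. Returns the anchor count.
pub fn link_force() -> usize {
    let anchors: [fn(); 2] = [index_components::anchor, op_components::anchor];
    for anchor in anchors {
        // black_box keeps the optimizer from discarding the reference.
        black_box(anchor as usize);
    }
    anchors.len()
}
```

Calling link_force() once at startup, before the first install, is enough; the call is cheap and idempotent.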

Where this lives

Module trait, Module::build, and bootstrap recording: bytesandbrains/bb-dsl/src/module.rs.
Port verbs on the recorder (Graph::input, Graph::output, Graph::net_out, Graph::lookup_output): bytesandbrains/bb-dsl/src/graph.rs.
#[derive(bb::Concrete)] plus the seven role derives: bytesandbrains/bb-derive/src/lib.rs and bytesandbrains/bb-derive/src/roles.rs.
Universal triple codegen (ConcreteComponent, AnyComponent, inventory submission): bytesandbrains/bb-derive/src/codegen_shared.rs.
register_op! and register_protocol! grammars and emitters: bytesandbrains/bb-derive/src/parse.rs.
ContractResponse and CompletionHandle: bytesandbrains/bb-runtime/src/completion.rs.
The seven Contract traits: bytesandbrains/bb-runtime/src/contracts/.
Inventory carriers (ConcreteComponentRegistration, OpRegistration, ComponentRoleBinding, DispatcherRegistration): bytesandbrains/bb-ir/src/registry.rs and bytesandbrains/bb-runtime/src/registry.rs.
install entry point and DCE anchoring: bytesandbrains/src/install.rs.
The HNSW worker and dependency walker examples used in this chapter: bytesandbrains/examples/custom_index_hnsw.rs, bytesandbrains/examples/component_with_dependency.rs.