Deployment
Chapter 11 closed with the engine emitting EngineStep::SendEnvelope
on its outbound seam and reading IngressEvent::EnvelopeFrom on its
inbound seam. This chapter takes those two seams and answers a
question the previous chapters deferred: how do you actually run a
program on more than one peer?
A compiled ModelProto is a single artifact. It carries every
partition the compiler produced, every binding the build phase chose,
and the passport that gates install. The deployment story is how the
host turns that one artifact into a running cohort. The framework
supports two patterns. Both terminate at the same install call. Both
work in-process for tests, in separate processes for multi-host
simulation, and across the wire once a transport adapter exists.
This release ships transport-less. The Node is sans-IO, the engine
emits envelope steps the host hands to whatever messaging layer it
prefers, and the public examples wire their own in-process bus. A
bb-libp2p crate that adapts Address to libp2p multiaddrs and the
IngressQueue to libp2p messaging is not part of this release; the
shape below is the integration surface any such adapter plugs into.
The two deployment patterns
Form 1: per-binary placeholder authoring. The shared crate holds only
role placeholders (BackendSlot, IndexSlot, ModelSlot, …). Each
per-device binary crate compiles its own ModelProto from the shared
Module, binds its concretes via Compiler::new().bind_<role>::<T>,
and installs through bb::install. The linker only ever sees the
concretes each binary actually uses. Vanilla Rust DCE strips
everything the binary did not reference, so a client binary never
ships server-only kernels.
Form 2: shared compiled model. The build host runs Module::build
followed by Compiler::new().bind_*().compile(model) once. The
compiler emits one ModelProto whose functions[] carries every
partition root plus the binding metadata for each target. Each
device receives the same prost-encoded bytes and selects its
partition by naming the function at install time.
Pick Form 1 when per-device binaries can be built independently and
each binary owns its concrete bindings. Pick Form 2 when the build
host owns binding configuration (build-host secrets, pre-validated
graphs, cross-language deployment). The two patterns compose: a
Module can use a BackendSlot placeholder for the slot the user
binds per-device alongside a concrete the build host chose.
Form 1: per-binary placeholder authoring
The workspace layout puts the Module struct in a shared crate and one
binary crate per device class. Each binary depends on shared plus
its own concrete-Component crate.
my_app/
├── Cargo.toml # workspace
├── shared/ # Module struct + custom message types
│ └── src/lib.rs
├── server/ # bin crate: shared + server_components
│ └── src/main.rs
└── client/ # bin crate: shared + client_components
└── src/main.rs
The shared crate names the slots the Module reads and writes. The
placeholder structs are zero-sized and live in the bb::placeholders
re-export, which the prelude does not pull in. Embedding them as
fields on the Module struct is the pattern the framework’s examples
use.
// from bytesandbrains/examples/federated_learning.rs:114-141
struct ClientLogic;
impl Module for ClientLogic {
fn name(&self) -> &str {
"ClientLogic"
}
fn body(&self, g: &mut Graph) {
let server_params = g.input("server_params");
let _ = ModelSlot.load_parameters(g, server_params);
let (batch, _labels) = DataLoaderSlot.next_batch(g);
let _prediction = ModelSlot.forward(g, batch);
let updated_params = ModelSlot.params(g);
let server_peer = g.input("server_peer");
g.net_out("updated_params", server_peer, updated_params);
}
}
This Module reads two named inputs, reads from the bound Model and
DataSource slots, and writes one network port. The body never
names a concrete. Whatever concrete the binary binds at compile time
will be the one the engine dispatches into at runtime.
The binary crate completes the picture. It calls Module::build,
binds the concretes it actually links, compiles, and installs.
// from bytesandbrains/examples/federated_learning.rs:316-441
let client_proto: ModelProto = ClientLogic.build()?;
let client_artifact = Compiler::new()
.bind_protocol::<GlobalRegistryClient>("discovery")
.bind_model::<StubModel>("model")
.bind_data_source::<StubLoader>("data")
.compile(client_proto)?;
let client_peer = PeerId::from(CLIENT_PEER);
let client_target = client_artifact.functions[0].name.clone();
let client_addrs = vec![Address::empty().p2p(client_peer)];
let mut client = install(
client_peer,
client_addrs,
client_artifact,
&[client_target.as_str()],
Config::new(),
)?;
The install call returns a fully assembled Node. By the time the
function returns, the compilation passport has been verified, the
binding table has been parsed, every bound concrete has been
constructed via its inventory entry, and the target function is
installed as the root graph. The engine is ready to dispatch on the
first Node::poll.
// from bytesandbrains/src/install.rs:237-243
pub fn install(
peer_id: PeerId,
addresses: Vec<Address>,
model: ModelProto,
targets: &[&str],
config: Config,
) -> Result<Node, InstallError>
The signature takes the peer’s identity, its local listen-address
bag, the compiled model, the names of the functions this binary
runs, and a Config bag for slot-keyed per-concrete configs. The
address bag is a Vec because a real peer often advertises several
endpoints (LAN, WAN, relay) and the engine’s wire path pulls from
the bag at send time. The targets slice must be non-empty. A single
install can stand up multiple partitions on one peer (a Server and
its ServerReduce, for example) by naming each target function once.
Concretes whose Config associated type is () need no entry;
install supplies &() automatically.
Cross-compilation is vanilla Rust. Each binary is a normal crate, so
the standard invocations apply: cargo build
--target=aarch64-unknown-linux-gnu for ARM,
--target=wasm32-unknown-unknown for WASM-capable targets, and so
on. The framework’s runtime dependencies (prost, serde,
bincode, concurrent-queue, atomic-waker, tracing) are pure
Rust and cross-compile cleanly to any std target.
Form 2: shared compiled model
The build host runs the shared Module through Module::build and
Compiler::compile once. Every concrete the deployment needs across
all targets is bound at the build host. The compiler emits one
ModelProto whose functions[] carries every partition root and
whose metadata_props carries the binding metadata.
// build host
let model = ClientLogic.build()?;
let compiled = Compiler::new()
.bind_protocol::<GlobalRegistryClient>("discovery")
.bind_model::<StubModel>("model")
.bind_data_source::<StubLoader>("data")
.compile(model)?;
std::fs::write("dist/fed_demo.model", compiled.encode_to_vec())?;
ModelProto is prost-generated. encode_to_vec() ships from
prost::Message. The artifact is plain protobuf bytes. No envelope
wrapper, no codec layer beyond prost.
The device binary decodes those bytes and installs by naming the function it runs.
// device binary
use bytesandbrains::proto::onnx::ModelProto;
use prost::Message;
let bytes = std::fs::read("dist/fed_demo.model")?;
let compiled = ModelProto::decode(bytes.as_slice())?;
let mut node = install(
PeerId::from(1u64),
vec![Address::empty()],
compiled,
&["ClientLogic"],
Config::new(),
)?;
The round-trip is exact. The framework’s own tests pin it.
// from bytesandbrains/tests/install_passport.rs:102-127
fn compiled_model_round_trips_through_prost() {
let mut model = empty_fixture("RoundTrip");
bb_compiler::stamp_for_test(
&mut model,
&[(
"compute",
"Backend",
"bytesandbrains::backends::cpu::CpuBackend",
)],
);
let bytes = model.encode_to_vec();
let decoded = ModelProto::decode(bytes.as_slice()).expect("prost round-trip");
let node = install(
PeerId::from(1u64),
vec![Address::empty()],
decoded,
&["RoundTrip"],
Config::new(),
)
.expect("install accepts the round-tripped model");
}
How install resolves a binding
The compilation passport and the binding metadata travel as
metadata_props entries on the ModelProto. The compiler stamps
two flavors of key. The first asserts that the model came out of the
compiler.
metadata_props["ai.bytesandbrains.compiled"] = "v1"
The second carries one entry per (target, slot) pair.
metadata_props["ai.bytesandbrains.binding.<target>.<slot>"]
= "<role>|<TYPE_NAME>|<slot_id|-1>"
install reads both. If the compiled passport is absent the call
returns InstallError::NotCompiled so a host that accidentally
hands the bare output of Module::build() to install fails
loudly. If the passport version disagrees with the framework’s
current COMPILED_CURRENT_VERSION the call returns
InstallError::IncompatibleCompiledVersion. Both gates run before
any binding is touched.
// from bytesandbrains/src/install.rs:70-151
pub enum InstallError {
NotCompiled,
IncompatibleCompiledVersion { got: String, expected: &'static str },
UnknownTarget { target: String, available: Vec<String> },
InvalidBindingTable { key: String, detail: String },
UnregisteredConcrete { type_name: String },
MissingConfig { slot: String, type_name: String },
ConfigTypeMismatch { slot: String, type_name: String, detail: String },
ConstructionFailed { slot: String, type_name: String, detail: String },
SlotBindingConflict { slot: String, conflicts: Vec<(String, String, String)> },
EmptyTargets,
}
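The two gates and the binding value format can be sketched over a plain string map. This is a minimal illustration, not the framework's parser: Binding, GateError, check_passport, and parse_binding are hypothetical names, the expected version "v1" is copied from the example key above, and the sketch assumes the third field is either a non-negative integer or the literal -1.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one parsed binding entry.
#[derive(Debug, PartialEq)]
pub struct Binding {
    pub role: String,
    pub type_name: String,
    /// `None` models the `-1` sentinel (assumed to mean "no slot id").
    pub slot_id: Option<u64>,
}

/// Hypothetical error type; the framework's own is `InstallError`.
#[derive(Debug, PartialEq)]
pub enum GateError {
    NotCompiled,
    IncompatibleVersion { got: String },
    InvalidBinding { detail: String },
}

const COMPILED_KEY: &str = "ai.bytesandbrains.compiled";
const EXPECTED_VERSION: &str = "v1"; // assumption: taken from the example key

/// Passport gate: runs before any binding entry is looked at.
pub fn check_passport(props: &HashMap<String, String>) -> Result<(), GateError> {
    match props.get(COMPILED_KEY) {
        None => Err(GateError::NotCompiled),
        Some(v) if v.as_str() == EXPECTED_VERSION => Ok(()),
        Some(v) => Err(GateError::IncompatibleVersion { got: v.clone() }),
    }
}

/// Parses one `"<role>|<TYPE_NAME>|<slot_id|-1>"` value.
pub fn parse_binding(value: &str) -> Result<Binding, GateError> {
    let parts: Vec<&str> = value.split('|').collect();
    if parts.len() != 3 {
        return Err(GateError::InvalidBinding {
            detail: format!("expected 3 fields, got {}", parts.len()),
        });
    }
    let slot_id = if parts[2] == "-1" {
        None
    } else {
        let id = parts[2].parse::<u64>().map_err(|e| GateError::InvalidBinding {
            detail: e.to_string(),
        })?;
        Some(id)
    };
    Ok(Binding {
        role: parts[0].to_string(),
        type_name: parts[1].to_string(),
        slot_id,
    })
}
```

Keeping the passport check separate from the per-entry parse mirrors the ordering the text describes: a model that never went through the compiler is rejected without reading a single binding.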
UnregisteredConcrete is the one the host actually hits when a
device binary links the wrong concrete crate. It names the first
TYPE_NAME the artifact references that no inventory::submit!
carrier in this binary registers. The fix is to link the missing
concrete’s crate (and its link_force() helper, if any) into the
binary so the inventory entry survives DCE.
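The lookup-by-TYPE_NAME step can be illustrated with a plain map standing in for the inventory-collected registry. Everything here is hypothetical (Component, CpuBackend, make_cpu, Registry); the sketch only shows why the first unlinked TYPE_NAME is the one reported.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a constructed concrete.
pub trait Component {
    fn describe(&self) -> String;
}

pub struct CpuBackend;

impl Component for CpuBackend {
    fn describe(&self) -> String {
        "cpu".to_string()
    }
}

/// Constructor for the one concrete this imaginary binary links.
pub fn make_cpu() -> Box<dyn Component> {
    Box::new(CpuBackend)
}

pub type Constructor = fn() -> Box<dyn Component>;

/// Plain-map stand-in for the registry the binary's linked crates populate.
pub struct Registry {
    by_type_name: HashMap<&'static str, Constructor>,
}

impl Registry {
    pub fn new() -> Self {
        Registry { by_type_name: HashMap::new() }
    }

    pub fn register(&mut self, type_name: &'static str, ctor: Constructor) {
        self.by_type_name.insert(type_name, ctor);
    }

    /// Constructs every named concrete in order. On failure the error is
    /// the first TYPE_NAME that has no registered constructor.
    pub fn construct_all(&self, type_names: &[&str]) -> Result<Vec<Box<dyn Component>>, String> {
        type_names
            .iter()
            .map(|name| {
                self.by_type_name
                    .get(*name)
                    .map(|ctor| ctor())
                    .ok_or_else(|| name.to_string())
            })
            .collect()
    }
}
```

Collecting into a Result stops at the first missing entry, which matches the "names the first TYPE_NAME" behavior described above.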
SlotBindingConflict and EmptyTargets cover the multi-target
install path. EmptyTargets fires when the targets slice is empty,
which is always programmer error. SlotBindingConflict fires when
two of the named targets bind the same slot to different
(TYPE_NAME, role) pairs: a shared slot has to resolve to one
ComponentRef, so the bindings must agree. The error lists every
(target, type_name, role) contributor so the host can see which
target’s binding chain disagreed.
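The agreement rule for shared slots can be sketched as a grouping pass. The row shape and the find_conflicts name are assumptions made for illustration; the framework's own check is not reproduced here.

```rust
use std::collections::BTreeMap;

/// One binding row: (target, slot, type_name, role). Hypothetical shape.
pub type Row<'a> = (&'a str, &'a str, &'a str, &'a str);

/// Groups rows by slot and returns, for every slot whose contributors
/// disagree on (type_name, role), the full list of
/// (target, type_name, role) contributors.
pub fn find_conflicts(rows: &[Row<'_>]) -> BTreeMap<String, Vec<(String, String, String)>> {
    let mut by_slot: BTreeMap<&str, Vec<(&str, &str, &str)>> = BTreeMap::new();
    for &(target, slot, type_name, role) in rows {
        by_slot.entry(slot).or_default().push((target, type_name, role));
    }

    let mut conflicts = BTreeMap::new();
    for (slot, contributors) in by_slot {
        // Every group has at least one entry, so indexing [0] is safe.
        let (_, first_type, first_role) = contributors[0];
        let disagree = contributors
            .iter()
            .any(|&(_, t, r)| t != first_type || r != first_role);
        if disagree {
            let listed = contributors
                .iter()
                .map(|&(tg, t, r)| (tg.to_string(), t.to_string(), r.to_string()))
                .collect();
            conflicts.insert(slot.to_string(), listed);
        }
    }
    conflicts
}
```

Two targets that bind the same slot to the same pair produce no entry; only a disagreement is reported, and the report lists every contributor rather than just the second one.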
TYPE_NAME is the artifact’s lookup key, set by the
#[derive(bb::Concrete)] derive. Changing it is a breaking change
for artifact compatibility, the same contract ONNX operator names
follow. Pin it as a stable identifier and express versions through
naming conventions like myapp::v1::ServerBackend.
install resolves each entry of the targets slice against
model.functions[]. An exact name match wins. The compiler’s
content-hash naming pass suffixes each partition root with
#<hash> for snapshot stability, so when no exact match exists
install falls back to the first function whose name starts with
<target>#. A host can therefore pass either the bare name or the
full suffixed name.
// from bytesandbrains/src/install.rs:358-375
fn find_target<'a>(model: &'a ModelProto, target: &str) -> Result<&'a FunctionProto, InstallError> {
if let Some(exact) = model.functions.iter().find(|f| f.name == target) {
return Ok(exact);
}
let prefix = format!("{target}#");
if let Some(suffixed) = model.functions.iter().find(|f| f.name.starts_with(&prefix)) {
return Ok(suffixed);
}
let available = model
.functions
.iter()
.map(|f| f.name.clone())
.collect::<Vec<_>>();
Err(InstallError::UnknownTarget {
target: target.to_string(),
available,
})
}
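To see the two-stage lookup in isolation, here is a minimal sketch over plain strings. resolve_target is a hypothetical helper that mirrors the logic of find_target without the ModelProto types; the function names in the usage note are made up.

```rust
/// String-only version of the lookup: an exact match first, then the
/// first name that starts with "<target>#".
pub fn resolve_target<'a>(functions: &'a [&'a str], target: &str) -> Option<&'a str> {
    if let Some(exact) = functions.iter().find(|name| **name == target) {
        return Some(*exact);
    }
    // Appending '#' keeps "Server" from matching "ServerReduce#...".
    let prefix = format!("{target}#");
    functions
        .iter()
        .find(|name| name.starts_with(&prefix))
        .copied()
}
```

With functions named Server#a1b2, ServerReduce#c3d4, and Client, the target "Server" resolves to Server#a1b2 and never to ServerReduce#c3d4, while "Serv" resolves to nothing.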
Single-process: install one Node and poll
The minimum viable deployment is one Node in one process. Build the
Module, compile against the local concretes, install, and drive
Node::poll. The federated learning example shows the whole loop in
one file.
// from bytesandbrains/examples/federated_learning.rs:423-485
let client_peer = PeerId::from(CLIENT_PEER);
let client_addrs = vec![Address::empty().p2p(client_peer)];
let mut client = install(
client_peer,
client_addrs,
client_artifact,
&[client_target.as_str()],
Config::new(),
)?;
let waker = Waker::noop();
let mut cx = Context::from_waker(waker);
let mut total = 0usize;
for _ in 0..3 {
match client.poll(&mut cx) {
std::task::Poll::Ready(steps) => {
if steps.is_empty() {
break;
}
total += steps.len();
}
std::task::Poll::Pending => break,
}
}
Node::poll returns Poll::Pending when the engine drains to
quiescence. The ingress queue registers the supplied Waker on the
pending path; an external thread pushing onto the queue will wake the
host’s executor. When the engine has steps to surface, Poll::Ready
returns the batch.
The host owns the executor. The framework holds no async runtime, no
thread pool, no socket. A test driver can call poll in a tight loop
with Waker::noop(); a production deployment can park the executor
on the waker the framework registers.
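Parking on the waker can be as small as the following sketch, which uses only the standard library. ThreadWaker, Pollable, and drive are hypothetical names, and Pollable merely stands in for a Node::poll-shaped method; the framework's own types are not reproduced.

```rust
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Waker that unparks the thread that created it.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Hypothetical stand-in for anything with a `Node::poll`-shaped method.
pub trait Pollable {
    type Step;
    fn poll(&mut self, cx: &mut Context<'_>) -> Poll<Vec<Self::Step>>;
}

/// Drives `node` on the current thread, parking whenever it reports
/// Pending. Stops as soon as `handle` returns false.
pub fn drive<N: Pollable>(node: &mut N, mut handle: impl FnMut(Vec<N::Step>) -> bool) {
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match node.poll(&mut cx) {
            Poll::Ready(steps) => {
                if !handle(steps) {
                    return;
                }
            }
            // park() may return spuriously; looping back to poll is harmless.
            Poll::Pending => thread::park(),
        }
    }
}
```

A producer on another thread that pushes work and then calls the registered waker unparks the driving thread, which polls again. An async runtime would supply its own waker instead of this thread-based one.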
For a Module that records a non-empty bootstrap hook the host can
drive bootstrap to quiescence before transitioning to the normal
poll loop. The dedicated entry point on Node is run_bootstrap.
// from bytesandbrains/bb-runtime/src/node/mod.rs:780-831
pub fn run_bootstrap(
&mut self,
target: BootstrapTarget<'_>,
) -> Result<Vec<crate::engine::EngineStep>, crate::errors::BootstrapError> {
// ...
Ok(self.drain_bootstrap())
}
BootstrapTarget has four variants. BootstrapTarget::All arms and
drains every install-order bootstrap the Module recorded; this is
the common case after install. BootstrapTarget::ModuleNames and
BootstrapTarget::ModuleRequests pick out specific Module bootstrap
targets, optionally with staged inputs. BootstrapTarget::Slots
drives Component-level bootstraps by slot name. Each variant
validates atomically before any firing happens, so a host that
mistyped a target name gets a typed error rather than a partial fire.
let mut steps = node.run_bootstrap(BootstrapTarget::All)?;
run_bootstrap is idempotent on Nodes whose bootstrap has already
drained or that record no bootstrap. The host calls it once after
install and once after every async-completion resumption that the
bootstrap path waited on, then switches to poll for normal operation.
Multi-process simulation: in-process bus
A multi-Node simulation in one process needs only one moving part
beyond what single-process needs: a router that takes each Node’s
outbound envelopes and pushes them onto the destination Node’s
IngressQueue. The framework ships an example helper that does
exactly this in fifty lines.
// from bytesandbrains/examples/common/mod.rs:160-200
pub struct Bus {
routes: HashMap<PeerId, Arc<IngressQueue>>,
}
impl Bus {
pub fn new() -> Self {
Self {
routes: HashMap::new(),
}
}
pub fn connect(&mut self, peer: PeerId, ingress: IngressQueueRef) {
self.routes.insert(peer, ingress.arc().clone());
}
pub fn forward(&self, src_peer: PeerId, steps: &[EngineStep]) -> usize {
let mut forwarded = 0;
for step in steps {
if let EngineStep::SendEnvelope(env) = step {
let peer = env
.dest_peer_addresses
.first()
.and_then(|bytes| Address::from_bytes(bytes).ok())
.and_then(|addr| addr.peer_id());
if let Some(peer) = peer {
if let Some(ingress) = self.routes.get(&peer) {
let _ = ingress.push(IngressEvent::from_in_process(
src_peer,
env.clone(),
));
forwarded += 1;
}
}
}
}
forwarded
}
}
IngressEvent::from_in_process is a constructor on IngressEvent
that builds the EnvelopeFrom variant with a sender-tagged
multiaddr filled in as the observed address; a real transport
adapter would supply whatever the network actually saw (a
NAT-translated remote endpoint, for instance) so the receiver can
merge it into its AddressBook for reflexive-address discovery.
Bus::connect records one IngressQueueRef per peer. Bus::forward
walks a Node’s emitted steps, picks the SendEnvelope ones, parses
the first destination address out of dest_peer_addresses, looks up
the matching ingress, and pushes an EnvelopeFrom. The push is
non-blocking; the lock-free MPMC queue accepts the event from any
thread.
The multi-target network example uses the bus to run a complete two-Node cohort end to end.
// from bytesandbrains/examples/multi_target_network.rs:201-279
let mut nodes: Vec<(String, Node, PeerId)> = Vec::new();
let target_names: Vec<String> = compiled.functions.iter().map(|f| f.name.clone()).collect();
for name in target_names {
let peer_u64 = assign(&name);
let peer = PeerId::from(peer_u64);
let addr = Address::empty().p2p(peer);
let mut node = bytesandbrains::install(
peer,
vec![addr],
compiled.clone(),
&[name.as_str()],
bytesandbrains::Config::new(),
)?;
node.add_peer(
PeerId::from(PEER_SOURCE),
vec![Address::empty().p2p(PeerId::from(PEER_SOURCE))],
)?;
node.add_peer(
PeerId::from(PEER_SINK),
vec![Address::empty().p2p(PeerId::from(PEER_SINK))],
)?;
nodes.push((name, node, peer));
}
let mut bus = Bus::new();
for (_, node, peer) in &nodes {
bus.connect(*peer, node.ingress_handle());
}
let waker = Waker::noop();
let mut envelopes_forwarded = 0;
for cycle in 0..40 {
let mut any_step = false;
for (_, node, peer) in &mut nodes {
let mut cx = Context::from_waker(waker);
let steps = match node.poll(&mut cx) {
std::task::Poll::Ready(s) => s,
std::task::Poll::Pending => continue,
};
envelopes_forwarded += bus.forward(*peer, &steps);
if !steps.is_empty() {
any_step = true;
}
}
if !any_step && cycle > 5 {
break;
}
}
Each loop iteration polls every Node, hands its emitted steps to the bus, and the bus delivers any envelopes to the destination peer’s ingress before the next iteration. The bus is the simulation’s transport. In a real deployment it is replaced by whatever messaging layer the host operates: a tokio task pumping a TCP socket, a libp2p swarm, a Kafka producer/consumer pair, a tower service.
Each Node’s add_peer call seeds its local AddressBook with the
peer ids and addresses it expects to send to. The wire syscall
consults the address book on every wire.Send. The bus simulation
never dials those addresses; it parses the peer id out of the first
entry in dest_peer_addresses and routes on that alone. A real
transport will instead walk dest_peer_addresses and pick an entry
its capabilities can dial.
Driving inbound: deliver and invoke
The host has two seams onto a Node’s ingress queue. The first delivers raw envelope bytes from a transport.
// from bytesandbrains/bb-runtime/src/node/mod.rs:958-980
pub fn deliver_inbound(
&mut self,
src_peer: crate::ids::PeerId,
bytes: &[u8],
) -> Result<(), crate::errors::delivery::DeliveryError> {
let envelope =
crate::envelope::EnvelopeCodec::decode_capped(bytes, &self.config.envelope_caps)
.map_err(|e| {
crate::errors::delivery::DeliveryError::InvalidEnvelope(e.to_string())
})?;
self.engine
.ingress
.push(crate::ingress::IngressEvent::EnvelopeFrom {
src_peer,
envelope,
src_observed_address: None,
})
.map_err(|_| crate::errors::delivery::DeliveryError::IngressClosed)
}
deliver_inbound runs the decode through EnvelopeCodec::decode_capped
so malformed, schema-mismatched, or oversize buffers fail with a
typed DeliveryError::InvalidEnvelope before any prost allocation.
The transport adapter supplies src_peer so the engine can consult
PeerGovernor::check_inbound before any slot is written.
The second seam pushes an app event in by Module name.
// from bytesandbrains/bb-runtime/src/node/mod.rs:994-1010 (abridged)
pub fn deliver_event(
&mut self,
module: &str,
input: &str,
value_bytes: &[u8],
) -> Result<(), crate::errors::delivery::DeliveryError> {
if !self.module_index.contains_key(module) {
return Err(DeliveryError::UnknownModule(module.to_string()));
}
// The actual body caps value_bytes.len() against
// `NodeConfig::max_app_event_bytes`, charges the length
// against `NodeConfig::ingress_byte_budget`, fallibly
// reserves a fresh framework-owned Vec<u8>, and copies the
// caller's bytes in before pushing AppEvent.
// ...
Ok(())
}
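The gates named in the abridged body's comment (per-event cap, byte budget, fallible reservation, copy) can be sketched in isolation. AdmitError, IngressBudget, and admit are hypothetical names, and the sketch never refunds the budget because the text does not say when the framework releases it.

```rust
/// Hypothetical error type for the three ways admission can fail.
#[derive(Debug, PartialEq)]
pub enum AdmitError {
    OversizePayload { len: usize, cap: usize },
    BudgetExceeded { len: usize, remaining: usize },
    AllocationFailed,
}

/// Hypothetical stand-in for the ingress byte accounting.
pub struct IngressBudget {
    pub max_event_bytes: usize,
    pub remaining: usize,
}

impl IngressBudget {
    /// Checks the per-event cap, then the remaining budget, then copies
    /// the caller's bytes into a freshly reserved, callee-owned buffer.
    /// The budget is charged only once the copy has succeeded.
    pub fn admit(&mut self, value_bytes: &[u8]) -> Result<Vec<u8>, AdmitError> {
        let len = value_bytes.len();
        if len > self.max_event_bytes {
            return Err(AdmitError::OversizePayload { len, cap: self.max_event_bytes });
        }
        if len > self.remaining {
            return Err(AdmitError::BudgetExceeded { len, remaining: self.remaining });
        }
        let mut owned = Vec::new();
        // Fallible reservation: allocation failure becomes an error, not an abort.
        owned
            .try_reserve_exact(len)
            .map_err(|_| AdmitError::AllocationFailed)?;
        owned.extend_from_slice(value_bytes);
        self.remaining -= len;
        Ok(owned)
    }
}
```

Because the function returns an owned copy, the caller's slice is no longer referenced once admit returns, which is the property that lets a host free its buffer immediately.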
A host application that wants to feed a value into a Module’s
g.input("name") port calls deliver_event with the Module name,
the input port name, and the serialized value as a borrowed slice.
The framework owns the copy: caps and budget gates run at the
boundary, so a host that submits an oversize payload gets a typed
DeliveryError::OversizePayload or DeliveryError::BudgetExceeded
back synchronously and a matching InfraEvent::AppIngressError on
the bus. The caller may free the slice the moment deliver_event
returns. A controller process that wants to invoke a target with
positional inputs reaches for Node::ingress_handle directly and
pushes an IngressEvent::Invoke.
// from bytesandbrains/examples/multi_target_network.rs:240-257
let sink_peers_value =
bytesandbrains::syscall::values::PeerIdVecValue(vec![PeerId::from(PEER_SINK)]);
let sink_peers_bytes = bincode::serialize(&sink_peers_value)?;
let source_name = nodes[source_idx].0.clone();
let _ = nodes[source_idx]
.1
.ingress_handle()
.push(IngressEvent::Invoke {
module_name: source_name,
inputs: vec![("sink_peers".into(), sink_peers_bytes)],
exec_id: bytesandbrains::ids::ExecId::from(0u64),
});
IngressQueueRef is cheap-clone and Send. The host can pass a
handle to a transport thread, a controller thread, or any other
producer; the engine pulls events off the queue at the start of each
poll cycle.
Snapshot and restore
The framework ships an in-memory NodeSnapshot surface. A host calls
Node::snapshot() to capture the snapshottable state, persists the
bytes, and later calls Node::restore on a fresh Node to resume.
// from bytesandbrains/bb-runtime/src/snapshot/mod.rs:24-43
pub struct NodeSnapshot {
pub incarnation: u64,
pub config: NodeConfigSnapshot,
pub graphs: Vec<NamedGraphSnapshot>,
pub components: Vec<NamedComponentSnapshot>,
pub transient: TransientSnapshot,
}
The transient field carries the runtime ephemeral state per
ENGINE.md §15.1: in-cycle DAG frontier, slot table, suspended Ops,
ingress events, plus the framework counters and typed-bus
subscriptions that survive the cycle boundary. NodeSnapshot owns
its TransientSnapshot for round-trip persistence; the same struct
is the bincode payload for the snapshot’s ephemeral half.
// from bytesandbrains/bb-runtime/src/snapshot/transient.rs:17-46
pub struct TransientSnapshot {
pub frontier: Vec<(u64, u64)>,
pub slot_table: HashMap<(u64, u64), Option<Vec<u8>>>,
pub pending_async: HashMap<u64, PendingAsyncSnapshot>,
pub execution_state: HashMap<u64, ExecutionStateSnapshot>,
pub framework: FrameworkSnapshot,
pub bus: TypedBusSnapshot,
pub ingress: Vec<IngressEventSnapshot>,
pub wire_states: HashMap<u32, Vec<u8>>,
pub pending_completions: Vec<PendingCompletionSnapshot>,
}
Two fields round-trip populated today: framework (counters,
lifecycle phases, address book, peer governor, backoff table,
pending outbound envelopes, multihash peer id bytes, ID counters)
and bus (typed-bus subscription table). The remaining fields
(frontier, slot_table, pending_async, execution_state,
ingress, wire_states, pending_completions) exist on the
struct so the shape matches the future in-flight execution
snapshot. Node::snapshot does not populate them yet; restored
Nodes start from a fresh frontier.
TransientSnapshot is also exported from the facade as
bytesandbrains::TransientSnapshot. Reach for it directly only when a
host already owns a NodeSnapshot and wants to inspect or rebuild the
ephemeral half alone: a debugging tool walking the counters map, say,
or a test harness asserting the address-book restore semantics in
isolation. Hosts persisting a Node round-trip the whole NodeSnapshot
instead. Encoding a bare TransientSnapshot drops incarnation, the
bound graphs, the per-component state, and the NodeConfig round-trip,
so the resulting bytes are not enough to reconstruct a Node on the
other side. TransientSnapshot::default() is the owned-but-empty
starting point for a synthetic snapshot, which is then attached to a
NodeSnapshot for Node::restore.
Node::snapshot refuses to proceed when the in-Node bus still
carries un-drained events that a restore would silently drop or
re-fire stale. Callers drive Node::poll to quiescence first and
retry.
// from bytesandbrains/bb-runtime/src/node/mod.rs:176-187
pub fn snapshot(&self) -> Result<crate::snapshot::NodeSnapshot, crate::errors::SnapshotError> {
let queued = self.engine.bus.len();
if queued > 0 {
return Err(crate::errors::SnapshotError::BusNotDrained { queued, dropped: 0 });
}
Ok(self.snapshot_inner())
}
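The drain-then-retry discipline can be written as a small bounded loop. The Snapshottable trait and snapshot_when_quiescent are hypothetical; the trait reduces Node::poll and Node::snapshot to the two facts the loop needs.

```rust
/// Hypothetical stand-in for the two Node methods the retry loop needs.
pub trait Snapshottable {
    type Snapshot;
    /// Runs one poll cycle; returns true if the cycle did any work.
    fn poll_once(&mut self) -> bool;
    /// `Err(queued)` models a "bus not drained" refusal.
    fn try_snapshot(&self) -> Result<Self::Snapshot, usize>;
}

/// Retries the snapshot, polling between attempts, for at most
/// `max_cycles` poll cycles. Returns None if the node never quiesces
/// within the bound or if polling stops making progress.
pub fn snapshot_when_quiescent<N: Snapshottable>(
    node: &mut N,
    max_cycles: usize,
) -> Option<N::Snapshot> {
    for _ in 0..max_cycles {
        match node.try_snapshot() {
            Ok(snap) => return Some(snap),
            Err(_queued) => {
                if !node.poll_once() {
                    // Nothing left to drive, yet the snapshot is still refused.
                    return None;
                }
            }
        }
    }
    node.try_snapshot().ok()
}
```

Bounding the loop keeps a host from spinning forever on a Node that keeps generating events; the caller decides whether to try again later.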
The snapshot encodes through bincode for on-disk persistence or inter-process transfer.
// from bytesandbrains/bb-runtime/src/snapshot/mod.rs:99-111
impl NodeSnapshot {
pub fn encode(&self) -> Vec<u8> {
bincode::serialize(self).expect("NodeSnapshot serde is infallible for valid types")
}
pub fn decode(bytes: &[u8]) -> Result<Self, bincode::Error> {
bincode::deserialize(bytes)
}
}
Restoring is a method on Node. A host constructs a fresh Node via
install against the same compiled model the snapshot came from,
then calls restore. The incarnation counter is bumped on restore,
so the host can answer “is this the same Node as N seconds ago?”
without inspecting any other state.
// from bytesandbrains/tests/node_lifecycle.rs:72-95
let snap = node.snapshot().expect("snapshot");
let mut node2 = install(
PeerId::from(1u64),
vec![Address::empty()],
compiled2,
&[target2.as_str()],
Config::new(),
)
.expect("install2");
node2.restore(snap).expect("restore");
assert_eq!(node.incarnation(), 0);
assert_eq!(node2.incarnation(), 1);
The restored Node’s engine carries the same graphs, the same
component state, the same address book entries, and the same
outbound queue as the captured one. In-flight execution state such
as pending async commands is not captured yet, as noted above.
Components whose ConcreteComponent::serialize and restore impls
are no-ops still survive restore; their slot bindings and dispatch
table entries are reconstructed from the binding metadata the
compiled model carries.
Long-running federated training needs a durable Node state story;
the current NodeSnapshot is the type-level surface that feeds it.
The on-the-wire proto-encoded transfer shape and the cross-version
compatibility story are not part of this release; they are deferred
to a follow-up release. The in-memory Rust surface is sufficient for the
local persistence and inter-process transfer cases this release
ships.
Wire-type compatibility across binaries
The framework’s SlotValue blanket is the wire-eligibility contract.
Any type satisfying Clone + Serialize + DeserializeOwned + Send + Sync + 'static rides the wire by construction. There is no
registration step. The shared crate declares any custom message type
once with serde derives on it; both binaries pick up the
serde-compatible encoding automatically.
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct ModelDelta {
pub round: u32,
pub weights: Vec<f32>,
}
Wire encoding is bincode-derived, so two binaries built from the
same shared crate always agree on the byte layout. Two binaries
built against different versions of the shared crate do not.
Because bincode is positional rather than self-describing, a field
rename or a serde default does not bridge two layouts on its own.
Cross-version migration therefore means versioning the message type
itself (a new struct, or an enum with one variant per version) and
redeploying.
The libp2p adapter shape
A bb-libp2p crate that adapts Address to libp2p multiaddrs and
the IngressQueue to libp2p messaging is not part of this release.
The integration shape below is what any such adapter plugs into:
the same Node::poll and Node::ingress_handle seams the in-process
bus example uses.
The shape is straightforward. A libp2p swarm task takes each Node’s
EngineStep::SendEnvelope, picks an entry from
envelope.dest_peer_addresses whose multiaddr the swarm can dial,
encodes the envelope via EnvelopeCodec::encode, and ships the
bytes. The inbound side of the swarm pulls bytes off a libp2p
stream, calls Node::deliver_inbound with the source peer id and
the byte slice, and the engine decodes through the bounded
EnvelopeCodec::decode_capped path before any payload reaches a
slot.
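The outbound half of that shape can be sketched against a tiny transport trait. Transport, OutboundEnvelope, and pump_outbound are hypothetical; only the dest_peer_addresses field name comes from the text, and the encoder is passed in as a closure so the sketch does not guess at the signature of EnvelopeCodec::encode.

```rust
/// Hypothetical transport seam; a libp2p swarm handle would sit behind it.
pub trait Transport {
    fn can_dial(&self, addr: &[u8]) -> bool;
    fn send(&mut self, addr: &[u8], bytes: &[u8]) -> Result<(), String>;
}

/// Hypothetical stand-in for an outbound envelope.
pub struct OutboundEnvelope {
    pub dest_peer_addresses: Vec<Vec<u8>>,
    pub payload: Vec<u8>,
}

/// Ships each envelope to the first address the transport can dial.
/// Returns how many envelopes were sent; envelopes with no dialable
/// address, or whose send fails, are skipped.
pub fn pump_outbound<T: Transport>(
    transport: &mut T,
    envelopes: &[OutboundEnvelope],
    encode: impl Fn(&OutboundEnvelope) -> Vec<u8>,
) -> usize {
    let mut sent = 0;
    for env in envelopes {
        let Some(addr) = env
            .dest_peer_addresses
            .iter()
            .find(|a| transport.can_dial(a))
        else {
            continue;
        };
        if transport.send(addr, &encode(env)).is_ok() {
            sent += 1;
        }
    }
    sent
}
```

The inbound half needs no such helper: the adapter hands the source peer id and the received byte slice to Node::deliver_inbound and lets the engine do the bounded decode.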
Address already speaks libp2p’s /p2p/ segment using libp2p’s
standard varint code 421, so a peer id minted by libp2p round-trips
through the address codec byte-for-byte. The other three address
variants (/site/, /component/, /op/) use framework-internal
codes and would not appear on the outer libp2p layer; they only
ride inside SlotFill::dest_suffix to drive the receiver’s
dispatch decision.
An HTTP wrapper (bb-http) for the simpler “post one peer’s
gradient to a known endpoint” pattern, and deployment examples for
a GCP three-node cluster, a ROS 2 robot swarm, and an Android
client, do not exist in this release. The seams they plug into do.
Cross-language deployment
Form 2’s artifact is a prost-encoded ModelProto. The encoding is
plain protobuf, which has mature tooling in most languages. A Go or
Python runtime that registers
equivalent Component impls under matching TYPE_NAMEs could load
the same artifact and install its target functions. The framework
does not ship cross-language runtimes in this release; the design
admits them.
The two metadata stamps drive install: the compilation passport
keyed at ai.bytesandbrains.compiled and one entry per
(target, slot) pair keyed at
ai.bytesandbrains.binding.<target>.<slot>. Both are plain
metadata entries on the ModelProto. A non-Rust runtime needs to
re-implement the install path: passport check, target lookup,
binding-table parse, per-slot construction by TYPE_NAME,
function-table install. The wire format the runtime emits and
accepts is the same WireEnvelope proto Chapter 11 walked through.
Where this lives
The canonical sources in the bytesandbrains repo:
bytesandbrains/src/install.rs: bb::install, Config, InstallError, the passport check, the binding-table parse, and the per-slot inventory construction.
bytesandbrains/bb-compiler/src/driver.rs: Compiler::new, the bind_<role>::<T> chain, and Compiler::compile returning the passport-stamped ModelProto.
bytesandbrains/bb-runtime/src/node/mod.rs: Node::poll, Node::run_bootstrap, Node::deliver_inbound, Node::deliver_event, Node::ingress_handle, Node::add_peer, Node::snapshot, Node::restore, Node::with_config.
bytesandbrains/bb-runtime/src/snapshot/mod.rs: NodeSnapshot, NodeConfigSnapshot, NamedGraphSnapshot, NamedComponentSnapshot, NodeSnapshot::encode, NodeSnapshot::decode.
bytesandbrains/bb-runtime/src/ingress.rs: IngressEvent, IngressQueue, IngressQueueRef.
bytesandbrains/examples/federated_learning.rs: end-to-end install + per-Node poll loop for the federated topology.
bytesandbrains/examples/multi_target_network.rs: two-Node in-process cohort driven by the example Bus.
bytesandbrains/examples/common/mod.rs: the example Bus helper plus drive_poll.
bytesandbrains/tests/install_passport.rs: passport-failure and prost round-trip pinning.
bytesandbrains/tests/node_lifecycle.rs: snapshot + restore round-trip pinning.
bytesandbrains/docs/DEPLOYMENT.md: the bb-private architecture spec.