Types and Storage

Chapter 7 walked through the dependency mechanism that lets one bound concrete reach another at runtime. This chapter turns to the data those Contracts exchange. The framework runs on a hierarchical type lattice resolved at compile time. Authors never spell out a tensor’s element type in a Module body. The compiler walks each port’s connected ops, collects the declared type relations, and resolves every value name to a concrete leaf in the lattice. The runtime never sees an abstract type. By the time an op dispatches, the type work is done and the engine reads a single stamped reference.

The whole surface is one trait, one type tree, one solver pass, and a short list of metadata stamps the compiler writes back onto the IR. The sections below walk those pieces in order.

The `Storage` trait

Storage is the static link between a Rust storage type and its position in the lattice. Library makers declare where a backend, index, model, or any other Component’s native storage sits by picking the Storage impl that matches the associated type on their Contract.

// from bytesandbrains/bb-ir/src/types/storage.rs:21-26
pub trait Storage: Send + Sync + 'static {
    /// Position-in-tree declaration. The `TypeNode` static this
    /// constant points at decides what other storage types unify with
    /// this one during the type-solver walk.
    const TYPE: &'static TypeNode;
}

Every Contract that produces or consumes tensors carries a Tensor associated type bound by Storage. A backend that runs ndarray’s f32 dense layout writes type Tensor = CpuTensor; and inherits the framework’s blanket impls that map a Rust storage type to one of the built-in TypeNode statics.

The framework ships blanket impls for every primitive storage shape it recognizes. Slices land at the corresponding Tensor<T> leaf:

// from bytesandbrains/bb-ir/src/types/storage.rs:30-73
impl Storage for [f32] {
    const TYPE: &'static TypeNode = &TYPE_TENSOR_F32;
}
impl Storage for [f64] {
    const TYPE: &'static TypeNode = &TYPE_TENSOR_F64;
}
impl Storage for [half::f16] {
    const TYPE: &'static TypeNode = &TYPE_TENSOR_F16;
}
impl Storage for [half::bf16] {
    const TYPE: &'static TypeNode = &TYPE_TENSOR_BF16;
}
impl Storage for [u8] {
    const TYPE: &'static TypeNode = &TYPE_TENSOR_U8;
}
impl Storage for [u16] {
    const TYPE: &'static TypeNode = &TYPE_TENSOR_U16;
}
impl Storage for [u32] {
    const TYPE: &'static TypeNode = &TYPE_TENSOR_U32;
}
impl Storage for [u64] {
    const TYPE: &'static TypeNode = &TYPE_TENSOR_U64;
}
impl Storage for [i8] {
    const TYPE: &'static TypeNode = &TYPE_TENSOR_I8;
}
impl Storage for [i16] {
    const TYPE: &'static TypeNode = &TYPE_TENSOR_I16;
}
impl Storage for [i32] {
    const TYPE: &'static TypeNode = &TYPE_TENSOR_I32;
}
impl Storage for [i64] {
    const TYPE: &'static TypeNode = &TYPE_TENSOR_I64;
}
impl Storage for [bool] {
    const TYPE: &'static TypeNode = &TYPE_TENSOR_BOOL;
}

That covers thirteen tensor leaves. The same file also lists five scalar leaves (f32, f64, u16 standing in for half-precision f16, u8, i32) and one generic-position storage type covered in the next section.

The trait is Send + Sync + 'static because Storage values cross threads at dispatch time and must outlive any borrow the engine holds. The associated type on each Contract is ?Sized + Storage, so unsized slice types like [f32] work directly. Owned variants like Vec<f32> and Box<[f32]> plug into the same leaves through deref-coercion at the call site.
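The `?Sized` bound and the deref route can be sketched with a stripped-down stand-in for the trait. This is a minimal illustration, not the framework's code: the `TypeNode` here carries only an `id`, the nodes are declared as `const` rather than `static` to keep the sketch portable across compiler versions, the `"tensor.u8"` id is assumed by analogy with `"tensor.f32"`, and `leaf_id` and `f32_leaf_of` are hypothetical helpers.

```rust
// Simplified stand-in for the framework's TypeNode; only `id` is modeled.
pub struct TypeNode {
    pub id: &'static str,
}

// Assumption: declared as consts here; the framework uses statics.
pub const TYPE_TENSOR_F32: TypeNode = TypeNode { id: "tensor.f32" };
pub const TYPE_TENSOR_U8: TypeNode = TypeNode { id: "tensor.u8" };

pub trait Storage: Send + Sync + 'static {
    const TYPE: &'static TypeNode;
}

impl Storage for [f32] {
    const TYPE: &'static TypeNode = &TYPE_TENSOR_F32;
}
impl Storage for [u8] {
    const TYPE: &'static TypeNode = &TYPE_TENSOR_U8;
}

/// Hypothetical helper. The `?Sized` bound is what allows `T = [f32]`;
/// without it the unsized slice type would be rejected.
pub fn leaf_id<T: ?Sized + Storage>() -> &'static str {
    T::TYPE.id
}

/// Hypothetical helper with a concrete `&[f32]` parameter. Because the
/// parameter type is not generic, `&Vec<f32>` and `&Box<[f32]>` both
/// deref-coerce to `&[f32]` at the call site.
pub fn f32_leaf_of(_values: &[f32]) -> &'static str {
    <[f32] as Storage>::TYPE.id
}
```

Note that the coercion only applies when the expected parameter type is already known to be a slice. A fully generic `fn f<T: ?Sized + Storage>(x: &T)` called with `&Vec<f32>` would infer `T = Vec<f32>` and fail to compile unless `Vec<f32>` had its own impl.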

`AnyTensor` and the generic position

A backend that wants to dispatch over multiple concrete element types inside one impl, or an Index that delegates distance math to whichever Backend the slot binds, needs a storage type that sits ABOVE the concrete leaves. The framework’s answer is AnyTensor, a type-erased tensor that holds raw bytes plus a runtime-known Dtype and shape.

// from bytesandbrains/bb-ir/src/types/storage.rs:104-117
#[derive(Clone, Debug)]
pub struct AnyTensor {
    /// Raw little-endian bytes of the tensor payload, packed per `dtype`.
    pub bytes: Vec<u8>,
    /// Runtime dtype. Pair with `Self::shape` for full interpretation.
    pub dtype: Dtype,
    /// Per-axis shape. `shape.iter().product::<usize>()` × dtype-size
    /// is expected to equal `bytes.len()`.
    pub shape: Vec<usize>,
}

impl Storage for AnyTensor {
    const TYPE: &'static TypeNode = &TYPE_TENSOR;
}

AnyTensor::TYPE points at TYPE_TENSOR, the abstract interior bound that matches every Tensor<T> leaf. A Component declaring type Tensor = AnyTensor accepts any concrete tensor element type at the type-solver level, because the lattice walk treats the leaf as a subtype of the abstract parent.

The runtime tag for switching on the element type is Dtype:

// from bytesandbrains/bb-ir/src/types/storage.rs:122-149
pub enum Dtype {
    F32,
    F64,
    F16,
    BF16,
    U8,
    U16,
    U32,
    U64,
    I8,
    I16,
    I32,
    I64,
    Bool,
}

Dtype::type_node() maps each variant to the matching TypeNode static so a Component can dispatch on the runtime dtype without hard-coding string identifiers.
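The byte-length expectation documented on `AnyTensor::shape` can be checked with a few lines of code. The sketch below is illustrative only: `size_in_bytes`, `from_f32`, `is_consistent`, and `to_f32_vec` are hypothetical helpers, not framework API, and the one-byte-per-`Bool` packing is an assumption.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Dtype {
    F32, F64, F16, BF16, U8, U16, U32, U64, I8, I16, I32, I64, Bool,
}

impl Dtype {
    /// Hypothetical helper: bytes per element.
    /// Assumption: `Bool` is stored as one byte per element.
    pub fn size_in_bytes(self) -> usize {
        match self {
            Dtype::U8 | Dtype::I8 | Dtype::Bool => 1,
            Dtype::F16 | Dtype::BF16 | Dtype::U16 | Dtype::I16 => 2,
            Dtype::F32 | Dtype::U32 | Dtype::I32 => 4,
            Dtype::F64 | Dtype::U64 | Dtype::I64 => 8,
        }
    }
}

#[derive(Clone, Debug)]
pub struct AnyTensor {
    pub bytes: Vec<u8>,
    pub dtype: Dtype,
    pub shape: Vec<usize>,
}

impl AnyTensor {
    /// Hypothetical constructor: packs f32 values as little-endian bytes.
    pub fn from_f32(values: &[f32], shape: Vec<usize>) -> Self {
        let bytes = values.iter().flat_map(|v| v.to_le_bytes()).collect();
        AnyTensor { bytes, dtype: Dtype::F32, shape }
    }

    /// Checks that element count times element size equals `bytes.len()`.
    pub fn is_consistent(&self) -> bool {
        let elements: usize = self.shape.iter().product();
        elements * self.dtype.size_in_bytes() == self.bytes.len()
    }

    /// Switches on the runtime dtype; only the F32 arm is sketched.
    pub fn to_f32_vec(&self) -> Option<Vec<f32>> {
        if self.dtype != Dtype::F32 || !self.is_consistent() {
            return None;
        }
        Some(
            self.bytes
                .chunks_exact(4)
                .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
                .collect(),
        )
    }
}
```

A Component holding an `AnyTensor` would typically match on `dtype` and route each arm to a typed kernel; the single F32 arm here stands in for that dispatch.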

See Chapter 6 (Roles and Contracts) for the concrete shape of impl Backend for CpuBackend { type Tensor = CpuTensor; ... } and how an associated-type binding wires a Backend Contract to its lattice slot.

The type tree

Every type the framework speaks has a &'static TypeNode. The nodes form an open tree rooted at TYPE_ANY. The interior nodes are abstract; the leaves are concrete. Built-in nodes live in bb_ir::types::builtins and submit themselves to the inventory at startup:

Any
├── Tensor                     (abstract)
│   ├── Tensor<F32>             (concrete, 13 leaves total)
│   ├── Tensor<F64>
│   ├── Tensor<F16>
│   ├── Tensor<BF16>
│   ├── Tensor<U8>
│   ├── Tensor<U16>
│   ├── Tensor<U32>
│   ├── Tensor<U64>
│   ├── Tensor<I8>
│   ├── Tensor<I16>
│   ├── Tensor<I32>
│   ├── Tensor<I64>
│   └── Tensor<Bool>
├── Scalar                     (abstract)
│   ├── Scalar<F32>             (5 leaves)
│   ├── Scalar<F64>
│   ├── Scalar<F16>
│   ├── Scalar<U8>
│   └── Scalar<I32>
├── PeerId                     (concrete)
├── PeerIdVec                  (concrete)
├── Trigger                    (concrete)
├── Bytes                      (concrete)
├── WireReqId                  (concrete)
├── Multiaddress               (concrete)
├── AddressVec                 (concrete)
└── Composite                  (concrete)

Each node carries enough metadata for the lattice walk, FFI bindings, the wire-envelope discriminator, and the canonical denotation the recorder stamps onto ValueInfoProto.type.denotation:

// from bytesandbrains/bb-ir/src/types/mod.rs:36-52
pub struct TypeNode {
    /// Dotted-namespace identifier (`"any"`, `"tensor.f32"`).
    /// Must be unique across submissions.
    pub id: &'static str,
    /// Parent's id, or `None` for the root.
    pub parent: Option<&'static str>,
    /// Dispatchable leaf vs. abstract bound.
    pub kind: TypeKind,
    /// C-FFI struct name (cbindgen). Empty for abstract nodes.
    pub ffi_name: &'static str,
    /// Wire-envelope discriminator. Concrete leaves non-zero;
    /// abstract nodes 0.
    pub wire_hash: u64,
    /// ONNX denotation stamped on `ValueInfoProto.type.denotation`.
    /// Empty when no canonical denotation exists.
    pub denotation: &'static str,
}

TypeKind::Concrete marks dispatchable leaves. TypeKind::Abstract marks interior bounds. A value’s runtime type always points at a concrete leaf; abstract nodes are valid only as port bounds and relation participants.

The tree is open through inventory. A downstream backend can register a new leaf without touching the framework crates:

inventory::submit! {
    TypeNodeReg(&TypeNode {
        id: "tensor.bf16",
        parent: Some("tensor"),
        kind: TypeKind::Concrete,
        ffi_name: "bb_tensor_bf16_t",
        wire_hash: 0x0000_0000_0000_0107,
        denotation: "ai.bytesandbrains.tensor.bf16",
    })
}

The lattice builds itself once at process startup from every TypeNodeReg submission. Subtype queries (child.is_subtype_of(parent)) walk the parent chain through a memoized cache:

// from bytesandbrains/bb-ir/src/types/mod.rs:68-84
impl TypeNode {
    /// `true` if `self` is `other` or a descendant. Cached on
    /// the global [`Lattice`].
    pub fn is_subtype_of(&'static self, other: &'static TypeNode) -> bool {
        Lattice::get().is_subtype(self, other)
    }

    /// `true` iff this node is a concrete (dispatchable) leaf.
    pub fn is_concrete(&self) -> bool {
        matches!(self.kind, TypeKind::Concrete)
    }

    /// `true` iff this node is an abstract interior bound.
    pub fn is_abstract(&self) -> bool {
        matches!(self.kind, TypeKind::Abstract)
    }
}
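The parent-chain walk behind `is_subtype_of` can be sketched without the inventory machinery. This is a toy model under stated assumptions: nodes are reduced to an `id` and a `parent` id, the `Lattice` here is an ordinary struct built from a slice rather than a process-wide singleton, and the cache is a plain `HashMap` keyed on id pairs. It also assumes the registrations form a tree, so the walk always terminates at the root.

```rust
use std::collections::HashMap;

/// Simplified node: just the id and the parent's id.
pub struct Node {
    pub id: &'static str,
    pub parent: Option<&'static str>,
}

pub struct Lattice {
    parents: HashMap<&'static str, Option<&'static str>>,
    cache: HashMap<(&'static str, &'static str), bool>,
}

impl Lattice {
    /// Builds the parent table once from every registered node.
    pub fn build(nodes: &[Node]) -> Self {
        let parents = nodes.iter().map(|n| (n.id, n.parent)).collect();
        Lattice { parents, cache: HashMap::new() }
    }

    /// `true` if `child` is `ancestor` or a descendant of it.
    pub fn is_subtype(&mut self, child: &'static str, ancestor: &'static str) -> bool {
        if let Some(&hit) = self.cache.get(&(child, ancestor)) {
            return hit;
        }
        // Walk the parent chain from `child` up to the root.
        let mut cursor = Some(child);
        let mut found = false;
        while let Some(id) = cursor {
            if id == ancestor {
                found = true;
                break;
            }
            cursor = self.parents.get(id).copied().flatten();
        }
        self.cache.insert((child, ancestor), found);
        found
    }
}

/// A four-node slice of the tree, using ids that appear in this chapter.
pub fn sample_nodes() -> Vec<Node> {
    vec![
        Node { id: "any", parent: None },
        Node { id: "tensor", parent: Some("any") },
        Node { id: "tensor.f32", parent: Some("tensor") },
        Node { id: "tensor.bf16", parent: Some("tensor") },
    ]
}
```

With these nodes, `tensor.f32` is a subtype of both `tensor` and `any`, while two sibling leaves are unrelated to each other. That asymmetry is what lets an `AnyTensor` port accept any leaf without letting an F32 port accept a BF16 value.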

How the type solver works

Module ports declare names and direction only. The author never writes Tensor<F32> on an input. The compiler’s TypeSolver infers each port’s effective type by walking the body’s ops, collecting their declared TypeRelations, and propagating constraints to fixpoint.

Each op declaration carries a type_relations slice of constraint predicates. The full enum:

// from bytesandbrains/bb-ir/src/types/relations.rs:50-110
pub enum TypeRelation {
    /// All listed ports share the SAME concrete TypeNode.
    SameType(&'static [PortRef]),

    /// All listed Tensor-typed ports share the same ELEMENT type.
    /// Shapes may differ (broadcasting is a separate concern).
    /// `Add(x: Tensor, y: Tensor) -> Tensor` uses this.
    SameElementType(&'static [PortRef]),

    /// The output is the broadcast of two tensor inputs.
    BroadcastShape {
        in0: PortRef,
        in1: PortRef,
        out: PortRef,
    },

    /// Output preserves the input's TypeNode entirely. Used by
    /// element-wise unary ops (`Sqrt`, `Neg`, `Abs`, `Relu`, etc.).
    Elementwise {
        input: PortRef,
        output: PortRef,
    },

    /// Output is a reduction over the input: same element type,
    /// reduced shape.
    ReduceOver {
        input: PortRef,
        output: PortRef,
    },

    /// Escape hatch for ops that don't fit a predicate. Used for
    /// `Reshape`, `Gather`, `Concat`, `Cast`, and any op with
    /// attribute-driven type changes.
    Custom {
        name: &'static str,
        run: fn(&CustomRelationCtx<'_>) -> RelationResult,
    },
}

The solver runs a bipartite worklist patterned on TVM Relay’s type_solver.h. The algorithm has four phases:

Allocate a type-variable slot per value position in the graph (function inputs, op outputs, op inputs), seeded with the bound denotation (TYPE_ANY if unbound).
Instantiate a relation node per declared TypeRelation on each NodeProto, cross-linked with rel_set back-edges to its participating slots.
Drain the worklist. Pop a relation, run it. RelationResult::Refined requeues every dependent of the participating slots. Satisfied removes the relation. Defer parks it until something else refines a slot it touches. Failed aborts with a typed error.
Post-condition. In strict mode every slot must resolve to a concrete leaf. Any unresolved slot surfaces as TypeError::UnresolvedType. Permissive mode lets unresolved slots pass through as TYPE_ANY.
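The four phases above can be sketched as a toy worklist. This is a deliberately reduced model, not the framework's `TypeSolver`: slots hold an optional leaf id (with `None` standing in for an unresolved `TYPE_ANY`), the only relation kind is a same-type constraint, lattice subtyping is ignored, and every name in the block (`solve`, `SameType`, `SolveError`) is hypothetical.

```rust
use std::collections::VecDeque;

#[derive(Debug, PartialEq)]
pub enum SolveError {
    ConstraintFailed { relation: usize },
    Unresolved { slot: usize },
}

/// The only relation kind in this sketch: all listed slots share one leaf.
pub struct SameType(pub Vec<usize>);

pub fn solve(
    mut slots: Vec<Option<&'static str>>,
    relations: &[SameType],
    strict: bool,
) -> Result<Vec<Option<&'static str>>, SolveError> {
    // Phases 1 and 2 are the caller's inputs: seeded slots and relations.
    let mut worklist: VecDeque<usize> = (0..relations.len()).collect();
    let mut done = vec![false; relations.len()];

    // Phase 3: drain the worklist.
    while let Some(r) = worklist.pop_front() {
        if done[r] {
            continue;
        }
        let members = &relations[r].0;
        let mut known: Option<&'static str> = None;
        for &s in members {
            match (known, slots[s]) {
                // Failed: two participants disagree.
                (Some(a), Some(b)) if a != b => {
                    return Err(SolveError::ConstraintFailed { relation: r });
                }
                (None, Some(b)) => known = Some(b),
                _ => {}
            }
        }
        // Defer: nothing known yet; park until another relation refines a slot.
        let Some(leaf) = known else { continue };

        let mut refined = Vec::new();
        for &s in members {
            if slots[s].is_none() {
                slots[s] = Some(leaf);
                refined.push(s);
            }
        }
        // Satisfied: every participant now agrees.
        done[r] = true;
        // Refined: requeue every unfinished relation touching a refined slot.
        for (other, rel) in relations.iter().enumerate() {
            let touches = rel.0.iter().any(|s| refined.contains(s));
            if !done[other] && touches && !worklist.contains(&other) {
                worklist.push_back(other);
            }
        }
    }

    // Phase 4: post-condition. Strict mode rejects any unresolved slot.
    if strict {
        if let Some(slot) = slots.iter().position(|s| s.is_none()) {
            return Err(SolveError::Unresolved { slot });
        }
    }
    Ok(slots)
}
```

Listing a relation whose slots are all unseeded ahead of the relation that carries the seed exercises the defer path: the first relation parks, the second refines its slots, and the first is requeued and resolves on its second visit.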

The polymorphic_types example walks the solver end to end. The key snippet seeds the input x with TYPE_TENSOR_F32 and watches the constraint network propagate it through Add and Relu:

// from bytesandbrains/examples/polymorphic_types.rs:108-146
let graph = GraphProto {
    input: vec![value_info("x"), value_info("y")],
    node: vec![
        // z = Add(x, y)    - all share an element type
        NodeProto {
            op_type: "Add".into(),
            domain: "ai.onnx".into(),
            input: vec!["x".into(), "y".into()],
            output: vec!["z".into()],
            ..Default::default()
        },
        // w = Relu(z)       - elementwise; output mirrors input
        NodeProto {
            op_type: "Relu".into(),
            domain: "ai.onnx".into(),
            input: vec!["z".into()],
            output: vec!["w".into()],
            ..Default::default()
        },
    ],
    ..Default::default()
};
let decl_for_op = |domain: &str, op_type: &str| -> Option<&'static AtomicOpDecl> {
    match (domain, op_type) {
        ("ai.onnx", "Add") => Some(&ADD_DECL),
        ("ai.onnx", "Relu") => Some(&RELU_DECL),
        _ => None,
    }
};
let mut solver = TypeSolver::from_graph(&graph, decl_for_op)?;
solver.seed("x", &TYPE_TENSOR_F32);
let solution = solver.solve()?;
println!("  seed: x = tensor.f32");
for value in ["x", "y", "z", "w"] {
    let t = solution.type_of(value).expect("solver resolved");
    println!("  resolved: {value} → {}", t.id);
}

Add declares SameElementType([in0, in1, out0]). That single relation forces y and z to the same leaf as x. Relu declares Elementwise { input: in0, output: out0 }. That relation copies z’s resolution onto w. The solver reports x, y, z, and w all resolved to tensor.f32 from one seeded input.

Conflicting seeds raise a typed error. Seeding x to TYPE_TENSOR_F32 and y to TYPE_TENSOR_F64 violates the SameElementType relation that Add declares. The solver returns TypeError::ConstraintFailed:

// from bytesandbrains/examples/polymorphic_types.rs:149-156
let mut solver = TypeSolver::from_graph(&graph, decl_for_op)?;
solver.seed("x", &TYPE_TENSOR_F32);
solver.seed("y", &TYPE_TENSOR_F64);
match solver.solve() {
    Err(e) => println!("  ✓ solver rejected mixed F32/F64: {e}"),
    Ok(_) => println!("  ✗ solver should have rejected"),
}

The solver runs in strict mode by default. The Compiler::compile() path invokes solve_strict() so every value site must resolve to a concrete leaf. Bespoke pipelines that hand-author NodeProtos without declaring value_info for every value can drop into permissive mode through the builder:

// builder usage; `with_permissive_types` is defined at
// bytesandbrains/bb-compiler/src/driver.rs:154
let compiled = bb::Compiler::new()
    .with_permissive_types()
    .bind_backend::<CpuBackend>("compute")
    .compile(model)?;

Permissive mode lets unresolved slots pass through as TYPE_ANY. The ai.bytesandbrains.opaque denotation maps to the concrete bytes leaf through lookup_denotation, so opaque-payload graphs typecheck even in strict mode. Only genuinely under-declared values surface as UnresolvedType.

What gets stamped where

The TypeSolver does not run in isolation. The compiler routes information into and out of the IR through three stamps that ride on the ValueInfoProto.type.denotation field and on per-NodeProto metadata.

The first stamp is the recorder’s. Graph::input(name) writes a ValueInfoProto carrying the TYPE_BYTES placeholder, whose canonical denotation is "ai.bytesandbrains.opaque". Every syscall and DSL helper that produces a typed output stamps its Storage::TYPE denotation onto the function’s value_info list. The canonical pipeline reads those denotations to seed the solver before the worklist runs:

// from bytesandbrains/bb-compiler/src/type_solver.rs:284-297
pub fn seed_from_value_info(&mut self, graph: &GraphProto) {
    for vi in graph.input.iter().chain(graph.value_info.iter()) {
        let Some(type_proto) = vi.r#type.as_ref() else {
            continue;
        };
        let denotation = type_proto.denotation.as_str();
        if denotation.is_empty() {
            continue;
        }
        if let Some(node) = bb_ir::types::builtins::lookup_denotation(denotation) {
            self.seed(&vi.name, node);
        }
    }
}

The second stamp comes from refine_polymorphic_value_info, the pass that runs BEFORE the canonical pipeline. It walks every Contract-method NodeProto carrying both ai.bytesandbrains.required_trait and ai.bytesandbrains.slot_id metadata, looks up the bound concrete’s Storage::TYPE through the binding spec, and rewrites the polymorphic TYPE_TENSOR placeholder denotation to the resolved leaf. By the time the solver runs, every Contract-method output already carries its concrete denotation.

The third stamp comes from the solver itself. apply_solution_to_value_info walks the resolved TypeSolution and writes each narrowed leaf’s denotation back onto the matching ValueInfoProto.type.denotation. Downstream passes and the runtime read the narrowed denotations instead of the recorder’s placeholders:

// from bytesandbrains/bb-compiler/src/type_solver.rs:360-376
pub fn apply_solution_to_value_info(graph: &mut GraphProto, solution: &TypeSolution) {
    for vi in graph.input.iter_mut().chain(graph.value_info.iter_mut()) {
        let Some(node) = solution.type_of(&vi.name) else {
            continue;
        };
        if node.is_abstract() {
            continue;
        }
        let denotation = type_node_to_denotation(node);
        if denotation.is_empty() {
            continue;
        }
        if let Some(type_proto) = vi.r#type.as_mut() {
            type_proto.denotation = denotation.to_string();
        }
    }
}

The result is one IR where every value name carries the same canonical denotation that Dtype::type_node() returns. The runtime never re-runs the solver. The engine reads the stamped denotation at op invocation time and dispatches against the bound concrete’s Storage::TYPE.
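The write-back rule of the third stamp can be exercised on a toy model. The sketch below is an approximation under stated assumptions: `ValueInfo` flattens `ValueInfoProto.type.denotation` into a single string field, the solution is a plain `HashMap` from value name to node, and `apply_solution` and its return value (a count of rewritten entries) are hypothetical.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Kind {
    Concrete,
    Abstract,
}

/// Simplified node: only the fields the write-back rule reads.
pub struct Node {
    pub kind: Kind,
    pub denotation: &'static str,
}

/// Flattened stand-in for a ValueInfoProto and its type denotation.
pub struct ValueInfo {
    pub name: String,
    pub denotation: String,
}

/// Overwrites a placeholder only when the solver resolved the value to a
/// concrete leaf that has a non-empty denotation. Returns how many
/// entries were rewritten.
pub fn apply_solution(values: &mut [ValueInfo], solution: &HashMap<&str, Node>) -> usize {
    let mut rewritten = 0;
    for vi in values.iter_mut() {
        // Values the solver never saw keep the recorder's placeholder.
        let Some(node) = solution.get(vi.name.as_str()) else {
            continue;
        };
        // Abstract bounds are never stamped onto a value.
        if node.kind == Kind::Abstract || node.denotation.is_empty() {
            continue;
        }
        vi.denotation = node.denotation.to_string();
        rewritten += 1;
    }
    rewritten
}
```

Feeding this three values that all start at the `ai.bytesandbrains.opaque` placeholder, with one resolved to a concrete leaf, one left at an abstract bound, and one absent from the solution, rewrites only the first. The other two keep the recorder's stamp.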

A worked walk of the lattice, the runtime-type tags, the AtomicOpDecl relations, the solver propagation, and the compiler’s stamp metadata lives in the polymorphic_types example. Run it to see every surface print to the console:

cargo run --example polymorphic_types

The bind chain at a glance

Every Storage-bound concrete reaches the compiler through one of seven Compiler::bind_<role>::<T>(slot) builder methods. Each method records a binding whose T must implement both ConcreteComponent and the role’s Contract trait:

Role	Builder method	What `T` must implement
Backend	`bind_backend::<T>`	`contracts::Backend`
Index	`bind_index::<T>`	`contracts::Index`
Model	`bind_model::<T>`	`contracts::Model`
Aggregator	`bind_aggregator::<T>`	`contracts::Aggregator`
Codec	`bind_codec::<T>`	`contracts::Codec`
DataSource	`bind_data_source::<T>`	`contracts::DataSource`
PeerSelector	`bind_peer_selector::<T>`	`contracts::PeerSelector`

Protocol slots are bound through register_protocol! at compile time rather than through a bind_protocol builder method; see Chapter 6 (Roles and Contracts) and Chapter 9 (The Compiler) for the bind-chain semantics.
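The double bound on each builder method can be sketched as a pair of trait constraints on a generic parameter. Everything below is a hypothetical stand-in: the real `ConcreteComponent` and Contract traits carry methods and associated types that are not modeled, the `type_name` function and the `Binding` record are inventions of this sketch, and only two of the seven roles are shown.

```rust
/// Hypothetical stand-in; the real trait's shape is not shown in this chapter.
pub trait ConcreteComponent {
    fn type_name() -> &'static str;
}

/// Empty stand-ins for two of the role Contracts.
pub trait Backend {}
pub trait Index {}

#[derive(Debug, PartialEq)]
pub struct Binding {
    pub role: &'static str,
    pub slot: String,
    pub concrete: &'static str,
}

#[derive(Default)]
pub struct Compiler {
    pub bindings: Vec<Binding>,
}

impl Compiler {
    pub fn new() -> Self {
        Self::default()
    }

    /// `T` must satisfy both bounds, so a type that implements only one
    /// of the two traits is rejected at compile time.
    pub fn bind_backend<T: ConcreteComponent + Backend>(self, slot: &str) -> Self {
        self.record::<T>("Backend", slot)
    }

    pub fn bind_index<T: ConcreteComponent + Index>(self, slot: &str) -> Self {
        self.record::<T>("Index", slot)
    }

    fn record<T: ConcreteComponent>(mut self, role: &'static str, slot: &str) -> Self {
        self.bindings.push(Binding {
            role,
            slot: slot.to_string(),
            concrete: T::type_name(),
        });
        self
    }
}

/// A toy concrete that qualifies for the Backend role only.
pub struct CpuBackend;

impl ConcreteComponent for CpuBackend {
    fn type_name() -> &'static str {
        "CpuBackend"
    }
}
impl Backend for CpuBackend {}
```

In this model, `Compiler::new().bind_backend::<CpuBackend>("compute")` records one binding, while `bind_index::<CpuBackend>("lookup")` fails to compile because `CpuBackend` does not implement `Index`.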

Where this lives

The Storage trait, blanket impls, AnyTensor, and Dtype: bytesandbrains/bb-ir/src/types/storage.rs.
The TypeNode struct, TypeKind, and the inventory carrier TypeNodeReg: bytesandbrains/bb-ir/src/types/mod.rs.
The thirteen tensor leaves, five scalar leaves, and the framework primitives (PeerId, PeerIdVec, Trigger, Bytes, WireReqId, Multiaddress, AddressVec, Composite): bytesandbrains/bb-ir/src/types/builtins.rs.
The Lattice and its memoized subtype queries: bytesandbrains/bb-ir/src/types/lattice.rs.
The TypeRelation enum and its solver hooks: bytesandbrains/bb-ir/src/types/relations.rs.
The TypeSolver, TypeSolution, and TypeError: bytesandbrains/bb-compiler/src/type_solver.rs.
The placeholder-refinement pass that runs before the solver: bytesandbrains/bb-compiler/src/refine_polymorphic_value_info.rs.
The compiler’s permissive-mode entry point Compiler::with_permissive_types: bytesandbrains/bb-compiler/src/driver.rs.
The end-to-end demo that exercises every surface in this chapter: bytesandbrains/examples/polymorphic_types.rs.
