Project Overview

Aura is a fully peer-to-peer, private communication system that operates without dedicated servers. It uses a web-of-trust architecture to provide discovery, data availability, account recovery, and graceful async protocol evolution.

Threshold signatures are used to abstract keys from devices. The network topology between logical users reflects social relationships, forming an encrypted mesh that provides discovery, availability, and recovery. State converges through CRDT journals without central coordination, while session-typed choreographic protocols ensure safe multi-party execution.

How Aura Works

In Aura, all actors are authorities. An authority is an opaque cryptographic actor that may represent a person, a device group, or a shared context. External observers see only public keys and signed operations. This enables unlinkable participation across contexts.

State is append-only facts in journals. Each authority maintains its own journal. Shared contexts have journals written by multiple participants. Facts accumulate through CRDT merge and views are derived by reduction.

Side effects flow through explicit traits. Cryptography, storage, networking, and time are accessed only through effect handlers. This enables deterministic simulation and cross-platform portability.

Multi-party coordination uses session-typed choreographies. A global protocol specifies message flow. Each party's local behavior is projected from the global view.

Authorization passes through a layered guard chain. Before any message leaves, capabilities are verified, flow budgets are charged, and facts are committed atomically.

Aura separates key generation from agreement. Fast paths provide immediate usability while durable shared state is always consensus-finalized.

For the complete architecture, see System Architecture.

Documentation Index

The documents below cover theory, technical components, implementation guidance, and project organization.

1. Foundation

Theoretical Model establishes the formal calculus, algebraic types, and semilattice semantics underlying the system.

System Architecture describes the 8-layer architecture, effect patterns, and choreographic protocol structure.

Privacy and Information Flow Contract specifies consent-based privacy with trust boundaries, flow budgets, and leakage tracking.

Distributed Systems Contract defines safety and liveness guarantees, synchrony assumptions, and adversarial tolerance.

2. Core Systems

Cryptography documents primitives, key derivation, threshold signatures, and VSS schemes.

Identifiers and Boundaries defines the identifier types and their privacy-preserving properties.

Authority and Identity describes opaque authorities, commitment trees, and relational context structure.

Journal specifies fact-based journals, validation rules, and deterministic reduction flows.

Authorization covers capability semantics, Biscuit token integration, and guard chain authorization.

Effect System documents effect traits, handler design, and context propagation.

Runtime describes lifecycle management, guard chain execution, and service composition.

User Flow Harness defines the shared UX contract, authoritative observation rules, and browser freshness model for parity-critical harness flows.

Ownership Model defines the four ownership categories, capability-gated authority, terminality rules, and the workspace crate inventory.

Consensus specifies single-shot agreement for non-monotone operations with witness attestation.

Operation Categories defines A/B/C operation tiers, K1/K2/K3 key generation, and agreement levels.

MPST and Choreography covers multi-party session types and choreographic protocol projection.

Transport and Information Flow specifies guard chain enforcement, secure channels, and flow receipts.

Aura Messaging Protocol (AMP) documents reliable async messaging with acknowledgment and ordering patterns.

Network Anonymity documents anonymous path routing, bootstrap re-entry records, and reply-block semantics.

Rendezvous Architecture covers context-scoped peer discovery and encrypted envelope exchange.

Relational Contexts specifies guardian bindings, recovery grants, and cross-authority journals.

Database Architecture defines the query layer using journals, Biscuit predicates, and CRDT views.

Social Architecture describes the three-tier model of messages, homes, and neighborhoods.

Distributed Maintenance Architecture covers snapshots, garbage collection, and system evolution.

CLI and Terminal User Interface specifies command-line and TUI interfaces for Aura operations.

Test Infrastructure Reference documents test fixtures, mock handlers, and scenario builders.

Simulation Infrastructure Reference covers deterministic simulation with virtual time and fault injection.

Formal Verification Reference describes Quint model checking and Lean theorem proving integration.

3. Developer Guides

Getting Started Guide provides a starting point for developers new to the codebase.

Effects and Handlers Guide covers the algebraic effect system, handler implementation, and platform support.

Choreography Development Guide explains choreographic protocol design, CRDTs, and distributed coordination.

Testing Guide covers test patterns, fixtures, conformance testing, and runtime harness.

User Flow Harness explains the determinism model for shared-flow harness execution, revisioned observation, and trace conformance.

Simulation Guide explains deterministic simulation for debugging and property verification.

Verification Guide documents Quint model checking and Lean proof workflows.

System Internals Guide covers guard chain internals, service patterns, and reactive scheduling.

Distributed Maintenance Guide covers operational concerns including snapshots and system upgrades.

Capability Vocabulary Inventory inventories authorization capability strings and their migration targets.

4. Project Meta

UX Flow Coverage Report tracks harness and scenario coverage for user-visible flows.

Verification Coverage Report tracks formal verification status across Quint specs and Lean proofs.

Project Structure documents the 8-layer crate architecture and dependency relationships.

System Architecture

This document gives an intuitive overview of Aura's architecture, covering core abstractions, information flow, and component interactions. Formal definitions live in Theoretical Model. Crate organization is documented in Project Structure.

Overview

Aura distributes identity and trust across devices and social relationships to enable private peer-to-peer communication.

The system is designed to operate without dedicated servers. Discovery, availability, and recovery are provided by the web of trust. Peers relay messages for one another based on social proximity. Without centralized routing, no single party can observe all traffic or deny service.

flowchart TB
    subgraph Authorities
        A1[Authority A]
        A2[Authority B]
    end

    subgraph State
        J1[Journal A]
        J2[Journal B]
        JC[Context Journal]
    end

    subgraph Enforcement
        GC[Guard Chain]
        FB[Flow Budget]
    end

    subgraph Effects
        EF[Effect System]
        TR[Transport]
    end

    A1 --> J1
    A2 --> J2
    A1 & A2 --> JC
    J1 & J2 & JC --> GC
    GC --> FB
    FB --> EF
    EF --> TR

This diagram shows the primary data flow. Authorities own journals that store facts. The guard chain enforces authorization before any transport effect. The effect system provides the abstraction layer for all operations.

Every operation flows through the effect system. Every state change is replicated through journals. Every external action is authorized through guards. These three invariants define the architectural contract.

1. State Model

1.1 Dual semilattice

Aura state consists of two complementary semilattices. Facts form a join-semilattice where information accumulates through the join operation. Capabilities form a meet-semilattice where authority restricts through the meet operation.

struct Journal {
    facts: FactSet,        // join-semilattice (⊔)
    frontier: CapFrontier, // meet-semilattice (⊓)
}

The Journal type keeps these dimensions separate. Facts can only grow. Capabilities can only shrink. This dual monotonicity provides convergence guarantees for replicated state.

Facts represent evidence that accumulates over time. Examples include signed operations, attestations, flow budget charges, and consensus commits. Once a fact is added, it cannot be removed. Garbage collection uses tombstones and reduction rather than deletion.

Capabilities represent authority that restricts over time. First-party capability vocabulary is declared in typed families owned by the crates that define the behavior. The system evaluates Biscuit tokens against policy to derive the current capability frontier. Delegation can only attenuate. No operation can widen capability scope. Token issuance is explicit, and guard snapshots carry evaluated frontiers rather than declared capability families. See Theoretical Model for formal definitions of these lattices.
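As a minimal sketch of this dual monotonicity, assuming set-based carriers far simpler than the real FactSet and CapFrontier, join and meet can be modeled as set union and intersection:

```rust
use std::collections::BTreeSet;

// Hypothetical set-based carriers; the real FactSet and CapFrontier are richer.
#[derive(Clone, Debug, PartialEq)]
struct FactSet(BTreeSet<String>);

#[derive(Clone, Debug, PartialEq)]
struct CapFrontier(BTreeSet<String>);

impl FactSet {
    // Join: merging journals can only add information.
    fn join(&self, other: &FactSet) -> FactSet {
        FactSet(self.0.union(&other.0).cloned().collect())
    }
}

impl CapFrontier {
    // Meet: combining frontiers can only attenuate authority.
    fn meet(&self, other: &CapFrontier) -> CapFrontier {
        CapFrontier(self.0.intersection(&other.0).cloned().collect())
    }
}
```

Under this model, a join result is never smaller than either input and a meet result is never larger, which is exactly the convergence property the replicated state relies on.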

1.2 Journals and namespaces

The journal is the canonical state mechanism. All durable state is represented as facts in journals. Views are derived by reducing accumulated facts.

Journals are partitioned into namespaces. Authority namespaces store facts owned by a single authority. Context namespaces store facts shared across authorities participating in a relational context. Facts in one namespace cannot reference or affect facts in another namespace. Cross-namespace coordination requires explicit protocols.

Facts are content-addressed immutable records. Each fact includes a type identifier, payload, attestation, and metadata. Facts accumulate through CRDT merge. Duplicate facts are deduplicated by content hash. Conflicting facts are resolved by type-specific merge rules.
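Deduplication by content hash can be sketched as follows; the digest function here is a non-cryptographic stand-in for the real DAG-CBOR content hash, and merge_fact is a hypothetical helper:

```rust
use std::collections::BTreeMap;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Non-cryptographic stand-in for the DAG-CBOR content hash.
fn content_id(payload: &str) -> u64 {
    let mut h = DefaultHasher::new();
    payload.hash(&mut h);
    h.finish()
}

// Hypothetical merge helper: a duplicate fact maps to the same key,
// so inserting it again is a no-op.
fn merge_fact(journal: &mut BTreeMap<u64, String>, payload: &str) {
    journal
        .entry(content_id(payload))
        .or_insert_with(|| payload.to_string());
}
```

Because the key is derived from the content itself, merging the same fact from two peers converges to a single entry without coordination.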

Attestations prove that an authority endorsed the fact. Threshold signatures require multiple devices to attest. Single-device signatures are used for local facts. See Journal for the complete specification.

1.3 State reduction

State reduction computes views from accumulated facts. Reducers are pure functions that transform fact sets into derived state. Reduction is deterministic and reproducible.

Reduction runs on demand or is cached for performance. Cached views are invalidated when new facts arrive. The reduction pipeline supports incremental updates for large fact sets. See Journal for the reduction architecture.
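A reducer in this style might look like the sketch below; MembershipFact and reduce_devices are illustrative names, with removal expressed as a tombstone-style fact rather than deletion of history:

```rust
use std::collections::BTreeSet;

// Illustrative fact type; removal is a tombstone-style fact, not a deletion.
enum MembershipFact {
    DeviceAdded(String),
    DeviceRemoved(String),
}

// Pure, deterministic reduction from accumulated facts to a derived view.
fn reduce_devices(facts: &[MembershipFact]) -> BTreeSet<String> {
    let mut devices = BTreeSet::new();
    for fact in facts {
        match fact {
            MembershipFact::DeviceAdded(d) => {
                devices.insert(d.clone());
            }
            MembershipFact::DeviceRemoved(d) => {
                devices.remove(d);
            }
        }
    }
    devices
}
```

The fact log keeps growing, but replaying it always yields the same derived set, which is what makes cached views safe to invalidate and recompute.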

1.4 Content addressing

All Aura artifacts are identified by the hash of their DAG-CBOR canonical encoding. Published digests are immutable. Journal merges and payload downloads verify digests before accepting data. See Theoretical Model for the content addressing contract.

2. Identity and Trust

2.1 Authorities

An authority is an opaque cryptographic actor. External parties see only public keys and signed facts. Internal device structure is hidden. This abstraction provides unlinkability across contexts.

flowchart LR
    subgraph External View
        PK[Threshold Public Key]
    end

    subgraph Authority [Internal Structure]
        direction TB
        CT[Commitment Tree]
        subgraph Devices
            direction LR
            D1[Device 1<br/>Share 1]
            D2[Device 2<br/>Share 2]
            D3[Device 3<br/>Share 3]
        end
        CT --> Devices
    end

    Devices -.->|2-of-3 signing| PK

    style Authority fill:transparent,stroke:#888,stroke-dasharray: 5 5

Account authorities maintain device membership using commitment trees. The journal stores signed tree operations as facts. Reduction reconstructs the canonical tree state from accumulated facts. FROST provides the threshold signature scheme. DKG distributes key shares without a trusted dealer. Key rotation and resharing maintain security as devices join or leave. See Cryptography for threshold details.

2.2 Relational contexts

Relational contexts are shared journals for cross-authority state. Each context has its own namespace and does not reveal participants to external observers. Participation is expressed by writing relational facts. Profile data, nicknames, and relationship state live in context journals. See Authority and Identity for commitment tree details and Relational Contexts for context patterns.

2.3 Contextual identity

Identity is scoped to contexts. A device can participate in many contexts without linking them. Each context derives independent keys through deterministic key derivation. This prevents cross-context correlation by external observers. See Identifiers and Boundaries for identifier semantics.

2.4 Social topology

Aura organizes social structure into three tiers. Messages are communication contexts for direct and group conversations. Homes are semi-public communities capped by storage constraints. Neighborhoods are collections of homes connected via 1-hop links.

The social topology shapes routing, relay selection, and governance. Authorities prefer relays operated by trusted peers within their home or neighborhood. Storage allocation is bounded per home, producing natural scarcity that scales with social investment. Local governance is encoded as capability-gated policy facts within each home's journal.

Access levels to a home follow the topology. Full access applies within the home. Partial access applies across 1-hop neighborhood links. Limited access applies at greater distances. See Social Architecture for the complete model.

3. Effects and Time

3.1 Effect system

Effect traits define async capabilities with explicit context. Handlers implement these traits for specific environments. The effect system provides the abstraction layer between application logic and runtime behavior.

flowchart TB
    subgraph L3["Composite"]
        direction LR
        TRE[TreeEffects] ~~~ CHE[ChoreographyExt]
    end

    subgraph L2["Application"]
        direction LR
        JE[JournalEffects] ~~~ AE[AuthorizationEffects] ~~~ FE[FlowBudgetEffects] ~~~ LE[LeakageEffects]
    end

    subgraph L1["Infrastructure"]
        direction LR
        CE[CryptoEffects] ~~~ NE[NetworkEffects] ~~~ SE[StorageEffects] ~~~ TE[TimeEffects] ~~~ RE[RandomEffects]
    end

    L1 --> L2 --> L3

Infrastructure effects wrap OS primitives including cryptography, networking, storage, time, and randomness. Application effects encode domain logic including journal operations, authorization, flow budgets, and leakage tracking. Composite effects combine lower layers for commitment tree operations and choreography execution.

Application code must not call system time, randomness, or IO directly. These operations must flow through effect traits. This constraint enables deterministic testing and simulation. See Effect System for the full specification.
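The handler-swapping pattern can be sketched as below. The trait is shown synchronously for brevity (the real effect traits are async and carry an EffectContext), and OsClock / VirtualTime are hypothetical handler names:

```rust
use std::cell::Cell;

// Hypothetical effect trait, shown synchronously for brevity.
trait TimeEffects {
    fn now_millis(&self) -> u64;
}

// Production-style handler delegating to the OS clock.
struct OsClock;
impl TimeEffects for OsClock {
    fn now_millis(&self) -> u64 {
        std::time::SystemTime::now()
            .duration_since(std::time::UNIX_EPOCH)
            .map(|d| d.as_millis() as u64)
            .unwrap_or(0)
    }
}

// Simulation handler: virtual time makes tests deterministic.
struct VirtualTime(Cell<u64>);
impl TimeEffects for VirtualTime {
    fn now_millis(&self) -> u64 {
        self.0.get()
    }
}
```

Application code written against the trait runs unchanged under either handler, which is the property that enables deterministic simulation.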

3.2 Time domains

The time system provides four domains for different use cases. PhysicalClock uses wall-clock time for cooldowns, receipts, and liveness. LogicalClock uses vector and Lamport clocks for causal ordering. OrderClock uses opaque tokens for deterministic ordering without temporal leakage. Range uses earliest and latest bounds for validity windows.

Time access happens exclusively through effect traits. Application code does not call system time directly. Cross-domain comparisons require explicit policy. See Effect System for time domain details.

3.3 Context propagation

Every async call chain carries an EffectContext that identifies the authority, context, session, and execution mode. Guards access context to make authorization decisions. Handlers access context to route operations to the correct namespace. See Effect System for the context model.

4. Authorization and Enforcement

4.1 Guard chain

All transport sends pass through a guard chain before any network effect. The chain enforces authorization, budget accounting, journal coupling, and leakage tracking in a fixed sequence:

CapabilityGuard → FlowBudgetGuard → JournalCouplingGuard → LeakageTrackingGuard → TransportEffects

Each guard must succeed before the next executes. Failure at any guard blocks the send. This order enforces the charge-before-send invariant. See Authorization for the full guard chain specification.
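A fail-fast sketch of this sequencing follows; the signatures are hypothetical, since the real guards consult Biscuit tokens, journal state, and leakage budgets rather than plain numbers:

```rust
// Hypothetical guard-chain sketch with fail-fast ordering.
#[derive(Debug, PartialEq)]
enum GuardError {
    Capability,
    Budget,
    Leakage,
}

struct Send {
    cost: u64,
}

fn run_guard_chain(
    msg: &Send,
    has_capability: bool,
    budget_remaining: u64,
    leakage_remaining: u64,
) -> Result<(), GuardError> {
    // CapabilityGuard: verify authority to perform the operation.
    if !has_capability {
        return Err(GuardError::Capability);
    }
    // FlowBudgetGuard: charge the budget before any send.
    if msg.cost > budget_remaining {
        return Err(GuardError::Budget);
    }
    // JournalCouplingGuard would commit the charge fact atomically here.
    // LeakageTrackingGuard: check the per-observer leakage budget.
    if msg.cost > leakage_remaining {
        return Err(GuardError::Leakage);
    }
    Ok(()) // only now may TransportEffects transmit
}
```

An early failure short-circuits everything downstream, so no budget is charged and no bytes leave unless every guard has passed.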

4.2 Capability model

Authorization uses Biscuit tokens with cryptographic attenuation. Capabilities can only be restricted, never expanded. Delegation chains are verifiable without contacting the issuer.

CapabilityGuard evaluates Biscuit tokens against required capabilities. It verifies that the sender has authority to perform the requested operation. Biscuit caveats can restrict scope, time, or target. See Authorization for token structure and evaluation.

4.3 Flow budgets and receipts

Flow budgets track message emission per context and peer. Only spent and epoch values are stored as facts. The limit is computed at runtime from capability evaluation through the meet-semilattice. This keeps replicated state minimal while enabling runtime limit computation.

FlowBudgetGuard charges budgets and emits receipts before each send. Budget charges are atomic with receipt generation. If spent + cost > limit, the send is blocked locally with no observable behavior. Epoch rotation resets counters through new epoch facts.

Receipts include context, sender, receiver, epoch, cost, and a hash chain link. The chain provides accountability for multi-hop message forwarding. Relays validate upstream receipts before forwarding and charge their own budgets before emitting. See Transport and Information Flow for receipt verification.
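The charge-before-send check reduces to a small invariant; field names follow the flow budget model above, while the charge function and error value are illustrative:

```rust
// Field names follow the flow budget model; the charge API is illustrative.
struct FlowBudget {
    spent: u64, // replicated fact: monotone counter
    epoch: u64, // replicated fact: current epoch
}

// limit is derived at runtime from capability evaluation; it is not stored.
fn charge(budget: &mut FlowBudget, cost: u64, limit: u64) -> Result<(), &'static str> {
    if budget.spent + cost > limit {
        // Blocked locally: no receipt, no send, no observable behavior.
        return Err("flow budget exceeded");
    }
    // In the real system this increment commits as a fact atomically
    // with receipt generation.
    budget.spent += cost;
    Ok(())
}
```

Note that a blocked charge leaves spent untouched, so a failed send produces no replicated state change at all.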

4.4 Context isolation and leakage tracking

Contexts provide information flow boundaries. Keys are derived per-context. Facts are scoped to namespaces. Cross-context flow requires explicit bridge protocols.

LeakageTrackingGuard records privacy budget usage per observer class. Observer classes include relationship, group, neighbor, and external. Operations that exceed leakage budgets are blocked. See Privacy and Information Flow Contract for the leakage model.

5. Protocols

5.1 Choreographic protocols

Choreographies define global protocols using multi-party session types. A global type describes the entire protocol from an overview perspective. Each message specifies sender, receiver, payload type, and guard annotations.

Annotations compile into guard chain requirements: guard_capability is the canonical namespaced capability string admitted at the DSL boundary, flow_cost specifies budget charges, journal_facts specifies facts to commit, and leak specifies leakage budget allocation.

Projection extracts each role's local view from the global type. The local view specifies what messages the role sends and receives. Execution interprets the local view against the effect system. See MPST and Choreography for projection rules and the global type grammar.

5.2 Telltale Protocol Machine

Production choreography execution uses the Telltale protocol machine with a host bridge. Startup is manifest-driven and admitted by construction. Execution is bounded by deterministic step budgets derived from weighted measures of the local session type. The budget removes wall-clock coupling from safety enforcement and keeps bound checks replay-deterministic across native and WASM conformance lanes.

The protocol machine supports canonical, hardening, and parity profiles. The canonical profile runs at concurrency 1 as the reference behavior. Hardening profiles test edge cases. Parity profiles compare native and WASM execution. See MPST and Choreography for runtime details.

5.3 Consensus and agreement

Aura Consensus provides single-shot agreement for non-monotone operations. Monotone operations use CRDT merge without consensus. Non-monotone operations such as key rotation, membership changes, and authoritative state transitions require consensus.

Operations are classified into categories A, B, and C. Category A uses CRDTs with immediate local effect. Category B shows pending state until agreement. Category C blocks until the consensus ceremony completes. See Operation Categories for classification rules.
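A classifier for these tiers might look like the sketch below; the example operations and their assigned tiers are assumptions chosen to match the descriptions above, not the project's authoritative classification table:

```rust
// Tiers follow the descriptions above; the example operations are assumptions.
enum Category {
    A, // CRDT merge, immediate local effect
    B, // pending state until agreement
    C, // blocks until the consensus ceremony completes
}

enum Op {
    PostMessage,
    AcceptInvitation,
    RotateKey,
}

fn classify(op: &Op) -> Category {
    match op {
        Op::PostMessage => Category::A,
        Op::AcceptInvitation => Category::B,
        Op::RotateKey => Category::C,
    }
}
```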

The fast path completes in one round trip when witnesses agree on prestate. The fallback path activates when witnesses disagree or the initiator stalls. Bounded gossip propagates evidence until a quorum forms. Both paths yield the same CommitFact format.

CommitFact represents a consensus decision. It binds prestate hash, operation hash, participant set, threshold signature, and timestamp. CommitFacts are inserted into the relevant journal namespace. The prestate binding prevents reusing signatures across unrelated operations. Consensus failures fall back to gossip. Network partitions delay but do not corrupt state. See Consensus for protocol details.
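The fields named above suggest a shape like the following; the concrete Rust types are assumptions, since the real CommitFact uses the project's hash, AuthorityId, and signature types:

```rust
// Concrete types are assumptions; field names follow the description above.
struct CommitFact {
    prestate_hash: [u8; 32],   // binds the decision to the observed prestate
    operation_hash: [u8; 32],  // the agreed non-monotone operation
    participants: Vec<String>, // witness set
    threshold_sig: Vec<u8>,    // FROST threshold signature bytes
    timestamp: u64,
}
```

Because prestate_hash is part of the signed payload, a signature produced for one prestate cannot be replayed against another, which is the reuse protection the text describes.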

5.4 Invitation lifecycle

Invitations establish new relationships. Contact invitations create direct messaging contexts. Channel invitations grant access to home channels. Guardian invitations bind recovery relationships.

Invitation creation is authorization-gated. Only the sender can cancel. Only the receiver can accept or decline. No invitation is resolved twice. Terminal states (accepted, declined, cancelled, expired) are immutable. Accepted invitations are backed by journal facts. Ceremony initiation is gated on acceptance. See Relational Contexts for invitation patterns.

5.5 Recovery

Device recovery uses guardian protocols. Guardians hold encrypted recovery shares established through relational contexts. A threshold of guardians can restore account access by contributing their shares.

Recovery operates through the same consensus and session-type infrastructure as other protocols. The recovered authority retains its identity while rotating to new key material. See Relational Contexts for recovery architecture.

6. Communication

6.1 Secure channels

SecureChannel provides encrypted, authenticated communication between two authorities. Channels use context-scoped keys derived through deterministic key derivation. Channel state is not stored in journals. Channels are established through rendezvous or direct connection. See Rendezvous Architecture for channel establishment.

6.2 Rendezvous

Rendezvous enables authorities to find each other without centralized directories. Rendezvous servers are untrusted relays that cannot read message content. Authorities publish encrypted envelopes that peers can retrieve. The social topology provides routing hints based on home and neighborhood membership. See Rendezvous Architecture for the full protocol.

6.3 Asynchronous messaging

AMP provides patterns for reliable asynchronous messaging. Messages may arrive out of order. Delivery may be delayed by offline peers. AMP handles acknowledgment, retry, and ordering. Channels support both synchronous request-response and asynchronous fire-and-forget patterns. See Aura Messaging Protocol for details.

6.4 Anti-entropy

Journal state converges through anti-entropy after network partitions. Each peer periodically exchanges fact digests with its neighbors, identifies gaps, and selectively transfers missing facts. Because journals are CRDTs, merging facts from any peer is safe regardless of ordering. This process runs continuously in the background without coordination or agreement and ensures that connected peers eventually share the same fact set.
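One anti-entropy round over fact digests can be sketched as a symmetric set difference; digests are modeled as plain u64 values for brevity:

```rust
use std::collections::BTreeSet;

// One symmetric anti-entropy round between two peers' digest sets.
fn anti_entropy(a: &mut BTreeSet<u64>, b: &mut BTreeSet<u64>) {
    // Exchange digests and compute what each side is missing.
    let missing_in_b: Vec<u64> = a.difference(b).cloned().collect();
    let missing_in_a: Vec<u64> = b.difference(a).cloned().collect();
    // CRDT merge: transfer order and direction cannot cause conflicts.
    b.extend(missing_in_b);
    a.extend(missing_in_a);
}
```

After one round both peers hold the union of their fact sets, and repeating the round with any other neighbor is idempotent and safe.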

7. Runtime and Ownership

7.1 Ownership model

Aura uses four ownership categories to prevent multiple layers from co-owning the same semantic truth: Pure for reducers and validators, MoveOwned for handle and session transfer, ActorOwned for long-lived mutable runtime state, and Observed for projections and UI reads. Parity-critical mutation must be capability-gated. Parity-critical operations must terminate explicitly with typed success, failure, or cancellation. Errors are classified by recoverability and propagated through Result types. See Ownership Model for the full contract.

7.2 Structured concurrency

Actor-owned state is managed through a hierarchical task supervisor. Each service owns a rooted task group. Child tasks inherit cancellation from parents. Shutdown is hierarchical and parent-driven. All mutation of actor-owned state flows through bounded typed ingress rather than shared mutable access.

Session and endpoint transfer uses move-owned capabilities with monotone generation counters that reject stale access. Delegation atomically transfers the owner record and capability. This separation keeps supervision (who manages the lifecycle) distinct from session ownership (who may act on the state). See Runtime for the structured concurrency model.

7.3 Reactive state

The system uses reactive signals for state propagation. Journal fact changes flow through reducers to signals. Signals expose derived state to UI observers. The flow is unidirectional: facts are the source of truth, views are derived, and subscribers receive the latest state when they poll.

Subscription to an unregistered signal is a typed failure. Lagging subscribers may miss intermediate updates and resume from a newer snapshot. Reactive delivery is a transport for authoritative snapshots, not an alternate owner of semantic truth.
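Poll-latest semantics can be sketched as below; Signal is a deliberately minimal stand-in for the real reactive signal API:

```rust
// Minimal stand-in for the reactive signal API: subscribers poll the
// latest snapshot and may skip intermediate revisions.
struct Signal<T> {
    latest: T,
    revision: u64,
}

impl<T: Clone> Signal<T> {
    // Reducers publish derived snapshots; facts remain the source of truth.
    fn publish(&mut self, value: T) {
        self.latest = value;
        self.revision += 1;
    }

    // Observers read the newest snapshot and its revision.
    fn poll(&self) -> (T, u64) {
        (self.latest.clone(), self.revision)
    }
}
```

A subscriber that polls after two publishes sees only the second value, illustrating why lagging observers resume from a newer snapshot rather than replaying intermediate updates.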

7.4 Workflow ownership

User-facing operations such as sending a message, accepting an invitation, or rotating a key are executed as workflows that progress through typed lifecycle phases to a terminal outcome. Each workflow has one authoritative lifecycle owner. Frontend and harness layers may submit commands and observe results, but they do not publish terminal truth. Ownership transfers through explicit handoff before the workflow begins awaited work. See Ownership Model for the semantic owner protocol.

8. Maintenance and Evolution

8.1 Snapshots and garbage collection

Snapshots bound storage size. A snapshot proposal announces a target epoch and a digest of the journal prefix. Devices verify the digest and contribute threshold signatures to complete the snapshot. Devices then prune facts and blobs whose epochs fall below the snapshot epoch. This pruning does not affect correctness because the snapshot represents a complete prefix.

8.2 OTA upgrades

OTA separates release distribution from activation. Release propagation is multi-directional and eventual. Activation is scope-bound and uses explicit epoch fences. Soft forks preserve compatibility. Hard forks require threshold-signed activation ceremonies scoped to the affected authority or context. See Distributed Maintenance Architecture for the full upgrade model.

8.3 Epoch fencing

Epochs gate budget resets, receipt validity, and upgrade activation. Epoch rotation inserts a new epoch fact into the journal. All replicas treat an epoch change as effective once they observe it in the journal. This avoids hard clock synchronization requirements. Receipts are valid only within their epoch. Old epoch receipts cannot be replayed.
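Epoch fencing of receipts reduces to a simple validity check; the Receipt shape here is truncated to the relevant fields:

```rust
// Receipt truncated to the fields relevant for epoch fencing.
struct Receipt {
    epoch: u64,
    cost: u32,
}

// A receipt is valid only within its epoch, so old receipts cannot replay.
fn receipt_valid(receipt: &Receipt, current_epoch: u64) -> bool {
    receipt.epoch == current_epoch
}
```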

References

Theoretical Model

This document establishes the complete mathematical foundation for Aura's distributed system architecture. It presents the formal calculus, algebraic/session types, and semilattice semantics that underlie all system components.

Overview

Aura's theoretical foundation rests on four mathematical pillars:

  1. Aura Calculus provides the core computational model for communication, state, and trust.
  2. Algebraic Types structure state as semilattices with monotonic properties.
  3. Multi-Party Session Types specify choreographic protocols with safety guarantees.
  4. CRDT Semantics enable conflict-free replication with convergence proofs.

The combination forms a privacy-preserving, spam-resistant, capability-checked distributed λ-calculus that enforces information flow budgets across all operations.

Shared Terms and Notation

This section defines shared terminology and notation used by the core contracts: Theoretical Model, Privacy and Information Flow Contract, and Distributed Systems Contract.

Core Entities

| Symbol / Term | Type | Description |
| --- | --- | --- |
| | AuthorityId | Authority identifiers |
| | ContextId | Context identifier |
| | Epoch | Epoch number |
| authority | | An account authority |
| context | | A relational or authority namespace keyed by ContextId |
| peer authority | | A remote authority in a context |
| member | | An authority in a home's threshold authority set |
| participant | | An authority granted home access but not in the threshold set |
| moderator | | A member with moderation designation for a home |
Access and Topology

| Term | Description |
| --- | --- |
| Full | Same-home access (0 hops) |
| Partial | 1-hop neighborhood access |
| Limited | 2-hop-or-greater and disconnected access |
| 1-hop link | Direct home-to-home neighborhood edge |
| n-hop | Multi-edge path (2-hop, 3-hop, etc.) |
| Shared Storage | Community storage pool |
| pinned | Fact attribute for retained content |

Flow Budget and Receipts

A flow budget is scoped to a context and a peer authority.

| Field | Type / Storage | Description |
| --- | --- | --- |
| FlowBudget | | |
| spent | Replicated fact | Monotone counter of consumed budget |
| epoch | Replicated fact | Current epoch identifier |
| limit | Derived at runtime | From capability evaluation and policy |
| Receipt | | |
| ctx | ContextId | Context scope |
| src | AuthorityId | Sending authority |
| dst | AuthorityId | Receiving authority |
| epoch | Epoch | Validity window |
| cost | u32 | Budget charge |
| nonce | u64 | Replay prevention |
| prev_hash | Hash32 | Links receipts in per-hop chain |
| sig | Signature | Cryptographic proof |

Observers, Time, and Delay

| Symbol | Category | Description |
| --- | --- | --- |
| | Observer | Relationship (direct relationship observer) |
| | Observer | Group (group member observer) |
| | Observer | Neighbor (neighborhood observer) |
| | Observer | External (external/network observer) |
| | CRDT/state | Local state deltas (NOT network delay) |
| | Network | Network delay bounds under partial synchrony |
| GST | Timing | Global Stabilization Time |

Leakage tuple order:

Invariant Naming

Use InvariantXxx names for implementation and proof references. If a prose alias exists, include it once, then reference the invariant name.

Guard Chain Order

| Order | Component | Role |
| --- | --- | --- |
| 1 | CapGuard | Capability verification |
| 2 | FlowGuard | Budget enforcement |
| 3 | JournalCoupler | Fact commitment |
| 4 | TransportEffects | Message transmission |

This order defines ChargeBeforeSend.

1. Aura Calculus

1.1 Syntax

We define programs as effectful, session-typed processes operating over semilattice-structured state.

Terms:

Facts (Join-Semilattice):

Capabilities (Meet-Semilattice):

Contexts:

Contexts are opaque UUIDs representing authority journals or relational contexts. Keys for transport sessions and DKD outputs are scoped to these identifiers. The identifier itself never leaks participants. See Identifiers and Boundaries for canonical definitions.

Messages:

Message extraction functions (used by operational rules):

A process configuration:

This represents a running session with a fact-state, a capability frontier derived from verified Biscuit tokens and local policy, and a privacy context.

1.2 Judgments

Under a typing context, an expression has a type and may perform a set of effects.

Effect set:

1.3 Operational Semantics

State evolution:

Capability-guarded actions:

Each side effect or message action carries a required capability predicate.

The function attn applies the Biscuit token's caveats to the local frontier. Biscuit attenuation never widens authority. The operation remains meet-monotone even though the token data lives outside the journal.

Context isolation:

No reduction may combine messages of distinct contexts:

This abbreviates the budget predicate derived from journal facts and Biscuit-imposed limits for the context.

Implementations realize this by merging a FlowBudget charge fact before send (see §2.3 and §5.3) while evaluating Biscuit caveats inside the guard chain. The side condition is enforced by the same monotone laws as other effects even though capability data itself is not stored in the CRDT.

1.4 Algebraic Laws (Invariants)

  1. Monotonic Growth: facts only accumulate under join.
  2. Monotonic Restriction: capabilities only attenuate under meet.
  3. Safety: every side effect requires its capability predicate to hold.
  4. Context Separation: for any two distinct contexts, no observable trace relates their internal state unless a bridge protocol is typed for the pair.
  5. Compositional Confluence: merge order and grouping do not affect the converged state.

2. Core Algebraic Types

2.1 Foundation Objects

#![allow(unused)]
fn main() {
// Capabilities describe Biscuit caveats. They form a meet-semilattice but are evaluated outside the CRDT.
type Cap        // partially ordered set (≤), with meet ⊓ and top ⊤
type Policy     // same carrier as Cap, representing sovereign policy

// Facts are join-semilattice elements (accumulation only grows them).
type Fact       // partially ordered set (≤), with join ⊔ and bottom ⊥

// Journal state is only a Cv/Δ/CmRDT over facts.
struct Journal {
  facts: Fact,            // Cv/Δ/CmRDT carrier with ⊔
}

// Context identifiers are opaque (authority namespace or relational context).
struct ContextId(Uuid);
struct Epoch(u64);  // monotone, context-scoped
struct FlowBudget { limit: u64, spent: u64, epoch: Epoch }
struct Receipt { ctx: ContextId, src: AuthorityId, dst: AuthorityId, epoch: Epoch, cost: u32, nonce: u64, prev_hash: Hash32, sig: Signature }

// Typed messages carry effects and proofs under a context.
struct Msg<Ctx, Payload, Version> {
  ctx: Ctx,                 // ContextId chosen by relationship
  payload: Payload,         // typed by protocol role/state
  ver: Version,             // semantic version negotiation
  auth: AuthTag,            // signatures/MACs/AEAD tags
  biscuit: Option<Biscuit>, // optional attenuated capability
}
}

These type definitions establish the foundation for Aura's formal model. The Cap and Policy types form meet-semilattices for capability evaluation. The Fact type forms a join-semilattice for accumulating evidence. The Journal contains only facts and inherits join-semilattice properties. Messages are typed and scoped to contexts to ensure isolation.

The type Cap represents the evaluation lattice used by Biscuit. Refinement operations (caveats, delegation) can only reduce authority through the meet operation.

The type Fact represents facts as join-semilattice elements. Accumulation operations can only add information through the join operation.

Journals replicate only facts. Capability evaluations run locally by interpreting Biscuit tokens plus policy. This keeps authorization independent of the replicated CRDT while preserving the same meet monotonicity at runtime.

Contexts (ContextId) define privacy partitions. Messages never cross partition boundaries without explicit protocol support. See Identifiers and Boundaries for precise identifier semantics and Relational Contexts for implementation patterns.

2.2 Content Addressing Contract

All Aura artifacts are identified by the hash of their canonical encoding. This includes facts, snapshot blobs, cache metadata, and upgrade manifests.

Structures are serialized using canonical CBOR with sorted maps and deterministic integer width. The helper function hash_canonical(bytes) computes digests when needed.

Once a digest is published, the bytes for that artifact cannot change. New content requires a new digest and a new fact in the journal.

Snapshots and upgrade bundles stored outside the journal are referenced solely by their digest. Downloaders verify the digest before accepting the payload. Journal merges compare digests and reject mismatches before updating state. See Distributed Maintenance Architecture for the complete fact-to-state pipeline.
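The digest-before-accept rule can be sketched as follows. std's `DefaultHasher` stands in for a real cryptographic hash (the actual system would use a collision-resistant function), and the canonical CBOR encoding is assumed to have produced the input bytes already:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in for a cryptographic hash over canonical bytes.
fn hash_canonical(bytes: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    bytes.hash(&mut h);
    h.finish()
}

/// Accept a downloaded payload only if it matches the published digest.
fn accept_payload(published_digest: u64, payload: &[u8]) -> Option<&[u8]> {
    if hash_canonical(payload) == published_digest { Some(payload) } else { None }
}

fn main() {
    let artifact: &[u8] = b"canonical-cbor-bytes";
    let digest = hash_canonical(artifact);
    assert!(accept_payload(digest, artifact).is_some());
    // Any byte change yields a different digest, so the payload is rejected
    // and new content requires publishing a new digest.
    assert!(accept_payload(digest, b"tampered-bytes").is_none());
}
```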

2.3 Effect Signatures

Core effect families provide the runtime contract:

-- Read/append mergeable state
class JournalEffects (m : Type → Type) where
  read_facts  : m Fact
  merge_facts : Fact → m Unit

-- Biscuit verification + guard evaluation
class AuthorizationEffects (m : Type → Type) where
  evaluate_guard : Biscuit → CapabilityPredicate → m Cap
  derive_cap     : ContextId → m Cap  -- cached policy frontier

-- Cryptography and key mgmt (abstracted to swap FROST, AEAD, DR, etc.)
class CryptoEffects (m : Type → Type) where
  sign_threshold  : Bytes → m SigWitness
  aead_seal       : K_box → Plain → m Cipher
  aead_open       : K_box → Cipher → m (Option Plain)
  commitment_step : ContextId → m ContextId

-- Transport (unified)
class TransportEffects (m : Type → Type) where
  send    : PeerId → Msg Ctx P V → m Unit
  recv    : m (Msg Ctx Any V)
  connect : PeerId → m Channel

These effect signatures define the interface between protocols and the runtime. The JournalEffects family handles state operations. The AuthorizationEffects family verifies Biscuit tokens and fuses them with local policy. The CryptoEffects family handles cryptographic operations. The TransportEffects family handles network communication.

2.4 Guards and Observability Invariants

Every observable side effect is mediated by a guard chain fully described in Authorization:

  1. CapGuard: the effect's capability predicate must hold against the Biscuit-derived frontier
  2. FlowGuard: headroom(ctx, cost) must hold, and charge(ctx, peer, cost, epoch) succeeds and yields a Receipt
  3. JournalCoupler: commit of attested facts is atomic with the send

Named invariants used across documents:

  • Charge-Before-Send: FlowGuard must succeed before any transport send.
  • No-Observable-Without-Charge: there is no observable transport event without a preceding successful charge.
  • Deterministic-Replenishment: limit(ctx) is computed deterministically from capability evaluation. The value spent (stored as journal facts) is join-monotone. Epochs gate resets.
-- Time & randomness for simulation/proofs
class TimeEffects (m : Type → Type) where
  now   : m Instant
  sleep : Duration → m Unit

class RandEffects (m : Type → Type) where
  sample : Dist → m Val

-- Privacy budgets (relationship, group, neighbor, external observers)
class LeakageEffects (m : Type → Type) where
  record_leakage   : ObserverClass → Number → m Unit
  remaining_budget : ObserverClass → m Number

The TimeEffects and RandEffects families support simulation and testing. The LeakageEffects family enforces privacy budget constraints.

The LeakageEffects implementation is the runtime hook that enforces the annotations introduced in the session grammar. The system wires it through the effect system so choreographies cannot exceed configured budgets.

Information Flow Budgets (Spam + Privacy)

Each context and peer authority pair carries a flow budget to couple spam resistance with privacy guarantees. The pair is keyed as (ContextId, AuthorityId). Keys and receipts are authority-scoped. Devices are internal to an authority and never appear in flow-budget state.

#![allow(unused)]
fn main() {
struct FlowBudget {
    spent: u64,   // monotone counter (join = max)
    limit: u64,   // capability-style guard (meet = min)
}

// Logical key: (ContextId, AuthorityId) -> FlowBudget
// Receipts: (ctx, src_authority, dst_authority, epoch, cost, nonce)
}

The FlowBudget struct tracks message emission through two components:

  • The spent field is stored in the journal as facts and increases through join operations
  • The limit field is derived from capability evaluation (Biscuit tokens + local policy) and decreases through meet operations

Only spent counters live in the journal as facts, inheriting join-semilattice properties. The limit is computed at runtime by evaluating Biscuit tokens and local policy through the capability meet-semilattice, consistent with the principle that capabilities are evaluated outside the CRDT.

Sending a message deducts a fixed flow_cost from the local budget before the effect executes. If the charge would exceed the limit, the effect runtime blocks the send.

Budget charge facts (incrementing spent) are emitted to the journal during send operations. The limit value is deterministically computed from the current capability frontier, ensuring all replicas with the same tokens and policy derive the same limit. Receipts bind to AuthorityId for both src/dst and use nonce chains per (ctx, src, epoch) to maintain charge-before-send proofs without leaking device structure.

Multi-hop forwarding charges budgets hop-by-hop. Relays attach a signed Receipt that proves the previous hop still had headroom. Receipts are scoped to the same context so they never leak to unrelated observers.
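The two-lattice budget can be sketched directly from the struct above. The method names here are illustrative, not the runtime's API: `spent` replicates as a join (max) over journal facts, while `limit` is re-derived locally as a meet (min) over capability evaluations:

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct FlowBudget { spent: u64, limit: u64 }

impl FlowBudget {
    fn join_spent(a: u64, b: u64) -> u64 { a.max(b) } // facts only grow
    fn meet_limit(a: u64, b: u64) -> u64 { a.min(b) } // caveats only shrink
    fn headroom(&self, cost: u64) -> bool { self.spent + cost <= self.limit }
    /// Charge-before-send: succeed only with headroom, recording the charge.
    fn charge(&mut self, cost: u64) -> bool {
        if self.headroom(cost) { self.spent += cost; true } else { false }
    }
}

fn main() {
    let mut b = FlowBudget { spent: 0, limit: 5 };
    assert!(b.charge(3));
    assert!(!b.charge(3)); // would exceed limit: the send blocks locally
    // Replicas converge on spent via max and on limit via min.
    assert_eq!(FlowBudget::join_spent(3, 2), 3);
    assert_eq!(FlowBudget::meet_limit(5, 4), 4);
}
```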

2.5 Semantic Laws

Join laws apply to facts. These operations are associative, commutative, and idempotent. Merging never loses information: if the state grows from F to F ⊔ Δ, then F ≤ F ⊔ Δ with respect to the facts partial order.

Meet laws apply to capabilities. These operations are associative, commutative, and idempotent. Applying Biscuit caveats corresponds to taking the meet (⊓) with the caveat element, which never increases authority.

Cap-guarded effects enforce non-interference. An effect guarded by a capability predicate may execute only if the current capability frontier satisfies that predicate.

Context isolation prevents cross-context flow. If two contexts are not explicitly bridged by a typed protocol, no information flows from one into the other.
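These laws can be checked on a concrete lattice. Sets of facts under union form a join-semilattice, and capability sets under intersection form a meet-semilattice; the assertions below exercise the required algebraic properties:

```rust
use std::collections::BTreeSet;

// Join for fact sets: union (accumulation only grows state).
fn join(a: &BTreeSet<u32>, b: &BTreeSet<u32>) -> BTreeSet<u32> {
    a.union(b).cloned().collect()
}

// Meet for capability sets: intersection (caveats only shrink authority).
fn meet(a: &BTreeSet<u32>, b: &BTreeSet<u32>) -> BTreeSet<u32> {
    a.intersection(b).cloned().collect()
}

fn main() {
    let x: BTreeSet<u32> = [1, 2].into();
    let y: BTreeSet<u32> = [2, 3].into();
    let z: BTreeSet<u32> = [3, 4].into();
    // Associativity, commutativity, idempotency of join.
    assert_eq!(join(&join(&x, &y), &z), join(&x, &join(&y, &z)));
    assert_eq!(join(&x, &y), join(&y, &x));
    assert_eq!(join(&x, &x), x);
    // Applying a caveat (meet) never increases authority.
    assert!(meet(&x, &y).is_subset(&x));
}
```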

3. Multi-Party Session Type Algebra

3.1 Global Type Grammar (G)

The global choreography type describes the entire protocol from a bird's-eye view. Aura extends vanilla MPST with capability guards, journal coupling, and leakage budgets:

Conventions:

  • A guarded send means "the sending role checks its capability guard, applies its journal delta, records leakage, sends to the receiving role, then continues with the rest of the protocol."
  • A guarded broadcast performs the same sequence for broadcasts.
  • Parallel composition means "execute both branches concurrently."
  • Choice means "one role decides which branch to take, affecting all participants."
  • Recursion binds a recursion variable in its body.

Note on journal deltas: the journal delta may include budget-charge updates (incrementing spent for the active epoch) and receipt acknowledgments. Projection ensures these updates occur before any transport effect so "no observable without charge" holds operationally.

Note on capability actions: check_caps and refine_caps are implemented via AuthorizationEffects. Sends verify Biscuit chains (with optional cached evaluations) before touching transport. Receives cache new tokens by refining the local capability frontier. Neither operation mutates the journal. All capability semantics stay outside the CRDT.

3.2 Local Type Grammar (L)

After projection, each role executes a local session type (binary protocol) augmented with effect sequencing:

3.3 Projection Function

The projection function extracts role 's local view from global choreography :

By convention, an annotation at a global step induces per-side journal deltas for the sender and the receiver. Unless otherwise specified by a protocol, we take the two deltas to be equal (symmetric journal updates applied at both endpoints).

Point-to-point projection:

  • If the projected role is the sender: the projection yields a send prefix carrying the step's guard, journal, and leakage actions.
  • If the projected role is the receiver: the projection yields the matching receive prefix.
  • Otherwise: the step is skipped and projection continues with the protocol's continuation.

Broadcast projection:

  • If the projected role is the sender: the projection yields a broadcast-send prefix.
  • Otherwise (receiver): the projection yields the matching receive prefix.

Parallel composition:

where the merge operator combines the two projected local types (sequential interleaving if no conflicts)

Choice projection:

  • If the projected role is the decider: the projection is an internal choice (select).
  • If the projected role is an observer: the projection is an external choice (branch offer).

Recursion projection:

  • If the role occurs in the recursion body: the recursion is preserved in the local type.
  • If not: the projection is the terminated session.

Base cases: the terminated global protocol projects to the terminated local session for every role.

3.4 Duality and Safety

For binary session types, duality ensures complementary behavior:

Property: For communication to be type-safe, Bob's local type must be the dual of Alice's: every send in one matches a receive in the other.
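Duality can be sketched on a toy local-type AST (a hypothetical simplification that omits payload types and choice): `dual` swaps sends and receives, and is its own inverse.

```rust
// Minimal binary local types (illustrative, not the runtime's AST).
#[derive(Clone, Debug, PartialEq)]
enum Local {
    Send(Box<Local>),
    Recv(Box<Local>),
    End,
}

// Duality swaps send/receive at every step; dual(dual(l)) == l.
fn dual(l: &Local) -> Local {
    match l {
        Local::Send(k) => Local::Recv(Box::new(dual(k))),
        Local::Recv(k) => Local::Send(Box::new(dual(k))),
        Local::End => Local::End,
    }
}

fn main() {
    // Alice: send, then receive, then end.
    let alice = Local::Send(Box::new(Local::Recv(Box::new(Local::End))));
    // Bob must run the dual: receive, then send, then end.
    let bob = dual(&alice);
    assert_eq!(bob, Local::Recv(Box::new(Local::Send(Box::new(Local::End)))));
    assert_eq!(dual(&bob), alice); // duality is an involution
}
```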

3.5 Session Type Safety Guarantees

The projection process ensures:

  1. Deadlock Freedom: No circular dependencies in communication
  2. Type Safety: Messages have correct types at send/receive
  3. Communication Safety: Every send matches a receive
  4. Progress: Protocols always advance (no livelocks)
  5. Agreement: All participants agree on the chosen branch and protocol state (modulo permitted interleavings of independent actions)

3.6 Turing Completeness vs Safety Restrictions

The MPST algebra is Turing complete when recursion is unrestricted. However, well-typed programs intentionally restrict expressivity to ensure critical safety properties:

  • Termination: Protocols that always complete (no infinite loops)
  • Deadlock Freedom: No circular waiting on communication
  • Progress: Protocols always advance to next state

Telltale-backed session runtimes balance expressivity and safety through guarded recursion constructs.

3.7 Free Algebra View (Choreography as Initial Object)

You can think of the choreography language as a small set of protocol-building moves:

Generators:

  • the protocol-building steps of §3.1: guarded send/receive, broadcast, parallel composition, choice, and recursion

Taken together, these moves form a free algebra: the language carries just enough structure to compose protocols, but no extra operational behavior. The effect runtime is the target algebra that gives these moves concrete meaning.

Projection (from a global protocol to each role) followed by interpretation (running it against the effect runtime) yields one canonical way to execute any choreography.

The "free" (initial) property is what keeps this modular. Because the choreographic layer only expresses structure, any effect runtime that respects those composition laws admits exactly one interpretation of a given protocol. This allows swapping or layering handlers without changing choreographies.

The system treats computation and communication symmetrically. A step is the same transform whether it happens locally or across the network. If the sender and receiver are the same role, the projection collapses the step into a local effect call. If they differ, it becomes a message exchange with the same surrounding journal/guard/leak actions. Protocol authors write global transforms; the interpreter decides local versus remote at projection time.

3.8 Algebraic Effects and the Interpreter

Aura treats protocol execution as interpretation over an algebraic effect interface. After projecting a global choreography to each role, a polymorphic interpreter walks the role's AST and dispatches each operation to AuraEffectSystem via explicit effect handlers. The core actions are exactly the ones defined by the calculus and effect signatures in this document: merge (facts grow by ), refine (caps shrink by ), send/recv (context-scoped communication), and leakage/budget metering. The interpreter enforces the lattice laws and guard predicates while executing these actions in the order dictated by the session type.

Because the interface is algebraic, there is a single semantics regardless of execution strategy. This enables two interchangeable modes:

  • Static compilation: choreographies lower to direct effect calls with zero runtime overhead.
  • Dynamic interpretation: choreographies execute through the runtime interpreter for flexibility and tooling.

Both preserve the same program structure and checks. The choice becomes an implementation detail. This also captures the computation/communication symmetry. A choreographic step describes a typed transform. If the sender and receiver are the same role, projection collapses the step to a local effect invocation. If they differ, the interpreter performs a network send/receive with the same surrounding merge/check_caps/refine/record_leak sequence. Protocol authors reason about transforms. The interpreter decides locality at projection time.

4. CRDT Semantic Foundations

4.1 CRDT Type System

Aura implements four CRDT variants to handle different consistency requirements.

#![allow(unused)]
fn main() {
// State-based (CvRDT) - Full state synchronization
pub trait JoinSemilattice: Clone { fn join(&self, other: &Self) -> Self; }
pub trait Bottom { fn bottom() -> Self; }
pub trait CvState: JoinSemilattice + Bottom {}

// Delta CRDTs - Incremental state synchronization
pub trait Delta: Clone { fn join_delta(&self, other: &Self) -> Self; }
pub trait DeltaProduce<S> { fn delta_from(old: &S, new: &S) -> Self; }

// Operation-based (CmRDT) - Causal operation broadcast
pub trait CausalOp { type Id: Clone; type Ctx: Clone; fn id(&self) -> Self::Id; fn ctx(&self) -> &Self::Ctx; }
pub trait CmApply<Op> { fn apply(&mut self, op: Op); }
pub trait Dedup<I> { fn seen(&self, id: &I) -> bool; fn mark_seen(&mut self, id: I); }

// Meet-based CRDTs - Constraint propagation
pub trait MeetSemilattice: Clone { fn meet(&self, other: &Self) -> Self; }
pub trait Top { fn top() -> Self; }
pub trait MvState: MeetSemilattice + Top {}
}

The type system enforces mathematical properties. CvState types must satisfy associativity, commutativity, and idempotency for join operations. MvState types satisfy the same laws for meet operations.
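A max-register is about the smallest possible CvRDT and makes these laws concrete. The sketch below is a minimal instance of the JoinSemilattice and Bottom shape above, with the required properties checked on sample values:

```rust
// A max-register CvRDT: join is max, bottom is 0.
#[derive(Clone, Copy, Debug, PartialEq)]
struct MaxReg(u64);

impl MaxReg {
    fn join(&self, other: &Self) -> Self { MaxReg(self.0.max(other.0)) }
    fn bottom() -> Self { MaxReg(0) }
}

fn main() {
    let (a, b, c) = (MaxReg(3), MaxReg(7), MaxReg(5));
    assert_eq!(a.join(&b).join(&c), a.join(&b.join(&c))); // associative
    assert_eq!(a.join(&b), b.join(&a));                   // commutative
    assert_eq!(a.join(&a), a);                            // idempotent
    assert_eq!(a.join(&MaxReg::bottom()), a);             // bottom is identity
}
```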

4.2 Convergence Proofs

Each CRDT type provides specific convergence guarantees:

CvRDT Convergence: Under eventual message delivery, all replicas converge to the least upper bound of their updates.

Proof sketch: Let states be replica states after all updates. The final state is . By associativity and commutativity of join, this value is unique regardless of message ordering.

Delta-CRDT Convergence: Equivalent to CvRDT but with bandwidth optimization through incremental updates.

CmRDT Convergence: Under causal message delivery and deduplication, all replicas apply the same set of operations.

Proof sketch: Causal delivery ensures operations are applied in a consistent order respecting dependencies. Deduplication prevents double-application. Commutativity ensures that concurrent operations can be applied in any order with the same result.

Meet-CRDT Convergence: All replicas converge to the greatest lower bound of their constraints.
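The CvRDT convergence argument can be exercised directly: two replicas that receive the same updates in different orders reach the same least upper bound. Here set union stands in for the join:

```rust
use std::collections::BTreeSet;

// Fold a sequence of updates into a replica state via join (union).
fn apply_all(order: &[BTreeSet<u32>]) -> BTreeSet<u32> {
    let mut state = BTreeSet::new();
    for u in order {
        state = state.union(u).cloned().collect();
    }
    state
}

fn main() {
    let u1: BTreeSet<u32> = [1].into();
    let u2: BTreeSet<u32> = [2].into();
    let u3: BTreeSet<u32> = [3].into();
    // Two replicas, two delivery orders, one final state.
    let replica_a = apply_all(&[u1.clone(), u2.clone(), u3.clone()]);
    let replica_b = apply_all(&[u3, u1, u2]);
    assert_eq!(replica_a, replica_b);
}
```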

4.3 Authority-Specific Applications

Authority journals use join-semilattice facts for evidence accumulation. Facts include AttestedOp records proving commitment tree operations occurred. Multiple attestations of the same operation join to the same fact.

Relational context journals use both join operations for shared facts and meet operations for consensus constraints. The prestate model ensures all authorities agree on initial conditions before applying operations.

Biscuit capability evaluation uses meet operations outside the CRDT. Token caveats restrict authority through intersection without affecting replicated state.

4.4 Message Schemas for Authority Synchronization

#![allow(unused)]
fn main() {
// Tagged message types for different synchronization patterns
#[derive(Clone)] pub enum SyncMsgKind { FullState, Delta, Op, Constraint }

pub type AuthorityStateMsg = (AuthorityFacts, SyncMsgKind);
pub type ContextStateMsg = (ContextFacts, SyncMsgKind);
pub type FactDelta = (Vec<Fact>, SyncMsgKind);

#[derive(Clone)] pub struct AuthenticatedOp { 
    pub op: TreeOp, 
    pub attestation: ThresholdSig, 
    pub ctx: ContextId 
}

// Anti-entropy protocol support
pub type FactDigest = Vec<Hash32>;
pub type MissingFacts = Vec<Fact>;
}

These message types support different synchronization strategies. Authority namespaces primarily use delta synchronization for efficiency. Relational context namespaces use operation-based synchronization to preserve consensus ordering.

4.5 Authority Synchronization Protocols

Authority Fact Synchronization: Authorities exchange facts using delta-based gossip for efficiency.

Relational context consensus: Contexts use operation-based synchronization to preserve consensus ordering.

Anti-Entropy Recovery: Peers detect missing facts through digest comparison and request specific items.

These protocols ensure eventual consistency across authority boundaries while maintaining context isolation. Facts are scoped to their originating authority or relational context.

4.6 Implementation Verification

Aura provides property verification for CRDT implementations through several mechanisms.

Property-Based Testing: Implementations include QuickCheck properties verifying semilattice laws. Tests generate random sequences of operations and verify associativity, commutativity, and idempotency.

Convergence Testing: Integration tests simulate network partitions and verify that replicas converge after partition healing. Tests measure convergence time and verify final state consistency.

Formal Verification: Critical CRDT implementations include formal proofs using the Quint specification language. Proofs verify that implementation behavior matches mathematical specifications.

Safety Guarantees: Session type projection ensures communication safety and deadlock freedom. Semilattice laws guarantee convergence under eventual message delivery. Meet operations ensure that capability restrictions are monotonic and cannot be bypassed.

5. Information Flow Contract (Privacy + Spam)

5.1 Privacy Layers

For any trace of observable messages:

  1. Unlinkability: observers cannot link an authority's context-scoped identities across contexts from the trace alone.
  2. Non-amplification: information visible to an observer class is monotone in authorized capabilities.
  3. Leakage Bound: for each observer class, recorded leakage never exceeds its configured budget.
  4. Flow Budget Soundness (Named):
    • Charge-Before-Send
    • No-Observable-Without-Charge
    • Deterministic-Replenishment
    • Convergence: within a fixed epoch and after convergence, spent counters agree across replicas.

5.2 Web-of-Trust Model

Let G = (V, E) be the web-of-trust graph, where vertices are accounts. Edges carry relationship contexts and delegation fragments.

  • Each edge defines a pairwise context with derived keys
  • Delegations are meet-closed capability elements, scoped to contexts
  • The effective capability at a vertex is the meet of the local sovereign policy with all applicable delegations

WoT invariants:

  • Compositionality: combining multiple delegations uses the meet (never widens)
  • Local sovereignty: the sovereign policy is always a term in the meet; delegations can only reduce authority further
  • Projection: for any protocol projection to a role, guard checks refer to that role's effective capability
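The effective-capability meet can be sketched with bitmask capability sets, an illustrative stand-in for the real capability lattice (meet = bitwise AND, top = all bits set). Because the local policy is always a term of the fold, delegations can only narrow what it already allows:

```rust
// Capability sets as bitmasks: each bit grants one right.
type Cap = u32;
const TOP: Cap = u32::MAX; // top of the meet-semilattice

// Effective capability: policy ⊓ d1 ⊓ d2 ⊓ … (meet never widens).
fn effective(policy: Cap, delegations: &[Cap]) -> Cap {
    delegations.iter().fold(policy, |acc, d| acc & d)
}

fn main() {
    let policy: Cap = 0b1011;
    let d1: Cap = 0b1110;
    let d2: Cap = 0b1010;
    let eff = effective(policy, &[d1, d2]);
    assert_eq!(eff, 0b1010);
    // Local sovereignty: the result is always bounded by policy.
    assert_eq!(eff & policy, eff);
    // A delegation of everything (top) changes nothing.
    assert_eq!(effective(policy, &[TOP]), policy);
}
```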

5.3 Flow Budget Contract

The unified information-flow budget regulates emission rate/volume and observable leakage. The budget system combines join-semilattice facts (for spent counters) with meet-semilattice capability evaluation (for limit computation). For any context and peer:

  1. Charge-Before-Send: A send or forward is permitted only if a budget charge succeeds first. If charging fails, the step blocks locally and emits no network observable.
  2. No-Observable-Without-Charge: For any trace, there is no transport send event without a preceding successful charge for that context and peer in the same epoch.
  3. Receipt soundness: A relay accepts a packet only with a valid per-hop Receipt (context-scoped, epoch-bound, signed) and sufficient local headroom. Otherwise it drops locally.
  4. Deterministic limit computation: limit is computed deterministically from Biscuit tokens and local policy via meet operations. spent is stored as journal facts and is join-monotone. Upon epoch rotation, spent resets through new epoch facts.
  5. Context scope: Budget facts and receipts are scoped to their context. They neither leak nor apply across distinct contexts (non-interference).
  6. Composition with caps: A transport effect requires both the capability guard and the budget guard to hold (see §1.3). Either guard failing blocks the effect.
  7. Convergence bound: Within a fixed epoch and after convergence, spent and limit agree across replicas, where each replica's limit is computed from its local capability evaluation.

6. Application Model

Every distributed protocol is defined as a multi-party session type with role projections.

When executed, each role instantiates a handler for its projected local type.

Handlers compose algebraically over the effect set by distributing operations over semilattice state transitions. This yields an effect runtime capable of:

  • key-ceremony coordination (threshold signatures)
  • gossip and rendezvous (context-isolated send/recv)
  • distributed indexing (merge facts, meet constraints)
  • garbage collection (join-preserving retractions)

7. Interpretation

Under this calculus, we can make the following interpretation:

The Semilattice Layer

The join-semilattice (Facts) captures evidence and observations (trust and information flow). Examples: delegations/attestations, quorum proofs, ceremony transcripts, flow receipts, and monotone spent counters.

The meet-semilattice (Capabilities) captures enforcement limits and constraints (trust and information flow). Examples: the sovereign policy lattice, Biscuit token caveats, leak bounds, and consent gates. See Authorization for implementation details. Flow budget limits are derived from capability evaluation, not stored as facts. This lattice is evaluated locally rather than stored in the journal, but it obeys the same algebra.

Effective authority and headroom are computed from both lattices.

The Session-Typed Process Layer

This layer guarantees communication safety and progress. It projects global types with annotations into local programs, ensuring deadlock freedom, communication safety, branch agreement, and aligning capability checks, journal updates, and leakage accounting with each send/recv.

The Effect Handler Layer

The Effect Handler system provides operational semantics and composability. It realizes merge/refine/send/recv as algebraic effects, enforces lattice monotonicity (join-growing facts, meet-shrinking capabilities), guard predicates, and budget/leakage metering, and composes via explicit dependency injection across crypto, storage, and transport layers.

The Privacy Contract

The privacy contract defines which transitions are observationally equivalent. Under context isolation and budgeted leakage, traces that differ only by in-context reorderings or by merges/refinements preserving observer-class budgets and effective capabilities are indistinguishable. No cross-context flow occurs without a typed bridge.

Together, these form a privacy-preserving, capability-checked distributed λ-calculus.

System Contracts


Privacy and Information Flow Contract

This contract specifies Aura's privacy and information-flow model. It defines privacy boundaries, leakage budgets, and required privacy properties. Privacy boundaries align with social relationships rather than technical perimeters.

Violations occur when information crosses trust boundaries without consent. Acceptable flows consume explicitly budgeted headroom.

This document complements Distributed Systems Contract, which covers safety, liveness, and consistency. Together these contracts define the full set of invariants protocol authors must respect.

Verification of these properties uses Quint model checking (verification/quint/) and Lean 4 theorem proofs (verification/lean/). See Verification Coverage Report for current status.

1. Scope

The contract applies to information flows across privacy boundaries:

  • Flow budgets: Per-context per-peer spending limits enforced before transport
  • Leakage tracking: Metadata exposure accounting by observer class
  • Context isolation: Separation of identities and journals across contexts
  • Receipt chains: Multi-hop forwarding accountability
  • Epoch boundaries: Temporal isolation of budget and receipt state
  • Service families: Establish, Move, and Hold as the privacy-relevant service surfaces
  • Selector retrieval: Capability-derived retrieval without identity-addressed mailbox polling
  • Hold custody: Neighborhood-scoped opaque retention with bounded retrieval authority

Related specifications: Authorization, Transport and Information Flow, and Theoretical Model. Shared notation appears in Theoretical Model.

1.1 Terminology Alignment

This contract uses shared terminology from Theoretical Model.

  • Home role terms: Member, Participant, Moderator (only members can be moderators)
  • Access-level terms: Full, Partial, Limited
  • Storage terms: Shared Storage and allocation
  • Pinning term: pinned as a fact attribute

1.2 Contract Vocabulary

  • observer: any party that can learn information from traffic, custody, or local state exposure
  • authoritative: part of replicated truth rather than local runtime interpretation
  • retrieval: recovery of an object through bounded authority rather than identity-addressed delivery
  • custody: opaque best-effort retention of a non-authoritative object
  • accountability evidence: bounded evidence used to verify that a service action occurred

1.3 Assumptions

  • Cryptographic primitives are secure at configured key sizes.
  • Local runtimes enforce guard-chain ordering before transport sends.
  • Epoch updates and budget facts eventually propagate through anti-entropy.
  • The service-family model is part of the active privacy contract.
  • Privacy-mode deployments use encrypted envelopes and the fixed adaptive policy.
  • Debug and simulation modes are excluded from production privacy claims.

1.4 Non-goals

  • This contract does not guarantee traffic-analysis resistance against global passive adversaries without encrypted envelopes and sufficiently regular cover behavior.
  • This contract does not define social policy decisions such as who should trust whom.
  • This contract does not treat Hold custody as authoritative durable storage.
  • This contract does not guarantee durable delivery from custody services.
  • This contract does not guarantee that debug or simulation modes preserve production privacy properties.

2. Privacy Philosophy

Traditional privacy systems offer only complete isolation or complete exposure. Aura treats privacy as relational. Sharing information with trusted parties is a consented disclosure, not a privacy violation.

2.1 Core Principles

  • Consensual disclosure: Joining a group or establishing a relationship implies consent to share coordination information
  • Contextual identity: Deterministic Key Derivation presents different identities in different contexts, and only relationship parties can link them
  • Neighborhood visibility: Gossip neighbors observe encrypted envelope metadata, bounded by flow budgets and context isolation
  • Service trust is social: social planes may admit or weight providers, but provider trust must not become visible service shape
  • Communication privacy is envelope-level: descriptors, routes, retrieval, and retention behavior must remain socially neutral at the network boundary

2.2 Privacy Layers

| Boundary | Required Property | Forbidden Outcome |
|----------|-------------------|-------------------|
| Identity | Contexts use distinct identity material | Cross-context identity reuse |
| Relationship | Relationship discovery stays decentralized | Global directory disclosure |
| Group | Group participation remains group-scoped | Cross-group membership disclosure |
| Content | Unauthorized observers cannot learn plaintext | Plaintext disclosure outside consented boundaries |
| Metadata | Exposure stays budgeted by observer class | Unbounded metadata leakage |
| Retrieval | Parity-critical retrieval is not identity-addressed | Mailbox-identity disclosure on retrieval paths |
| Custody | Custody remains opaque and non-authoritative | Treating custody as replicated truth |

2.3 Service-Family Boundary

Establish, Move, and Hold are the privacy-relevant service families. They describe service behavior, not social role. A provider may be admitted because of neighborhood membership, direct friendship, bounded introduction evidence, or descriptor fallback, but the service interface must not reveal which reason dominated.

Trust evidence may affect Permit and runtime-local weighting. It must not appear as route shape, descriptor kind, retrieval shape, retention tier, or wire-visible policy class. Coarse selection tiers are local runtime derivations and are not canonical shared state.

3. Budgeted Send Invariant

Transport observables require prior local authorization, accounting, and fact-coupling.

Budget state is monotone within its active epoch. Over-budget sends must remain local. Receipt validity is epoch-scoped and old receipts must not be replayable in new epochs.

Forwarding is hop-local. Each hop must validate the required upstream accountability state before emitting downstream transport.

4. Leakage Tracking

4.1 Observer Classes

Information leakage is tracked per observer class:

| Observer | May Observe | Must Not Learn |
|----------|-------------|----------------|
| Relationship | Full context content by consent | Undisclosed contexts |
| Group | Group-scoped content | Other group memberships by default |
| Neighbor | Encrypted envelope metadata | Plaintext content and endpoint identity |
| Custody | Opaque held objects, selectors, and retention behavior | Depositor identity and mailbox identity |
| External | Network-level patterns | Protected content and protected endpoint identity |

4.2 Leakage Budget

Each observer class has a leakage budget separate from flow budgets.

Leakage is charged before any operation that exposes information to the observer class.

Custody observers are special. They may see that an opaque object is deposited, retained, expired, or retrieved. They must not learn mailbox identity or depositor identity from that flow.
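The record_leakage/remaining_budget contract from the LeakageEffects family can be sketched with a hypothetical per-observer-class meter: leakage is charged before exposure, and an exhausted (or absent) budget denies the operation:

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
enum ObserverClass { Relationship, Group, Neighbor, Custody, External }

// Hypothetical meter: remaining leakage units per observer class.
struct LeakageMeter { budgets: HashMap<ObserverClass, u32> }

impl LeakageMeter {
    // Charge before exposure; deny if the budget is missing or exhausted.
    fn record_leakage(&mut self, class: ObserverClass, units: u32) -> bool {
        match self.budgets.get_mut(&class) {
            Some(rem) if *rem >= units => { *rem -= units; true }
            _ => false, // Deny: reject unbudgeted exposure
        }
    }
    fn remaining_budget(&self, class: ObserverClass) -> u32 {
        *self.budgets.get(&class).unwrap_or(&0)
    }
}

fn main() {
    let mut meter = LeakageMeter {
        budgets: HashMap::from([(ObserverClass::Neighbor, 3)]),
    };
    assert!(meter.record_leakage(ObserverClass::Neighbor, 2));
    assert_eq!(meter.remaining_budget(ObserverClass::Neighbor), 1);
    assert!(!meter.record_leakage(ObserverClass::Neighbor, 2)); // over budget
    // No configured budget behaves like the Deny policy.
    assert!(!meter.record_leakage(ObserverClass::External, 1));
}
```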

4.3 Policy Modes

| Policy | Required Behavior | Allowed Exposure |
|--------|-------------------|------------------|
| Deny | Reject unbudgeted exposure | None |
| DefaultBudget(n) | Apply bounded default headroom | Up to n units |
| LegacyPermissive | Allow unbounded exposure | Migration-only exception |

5. Privacy Boundaries

5.1 Relationship Boundary

Within a direct relationship, both parties have consented to share coordination information:

  • Visible: Context-specific identity, online status, message content
  • Hidden: Activity in other contexts, identity linkage across contexts
  • Violation: cross-context identity reuse or disclosure of undisclosed contexts

5.2 Neighborhood Boundary

Gossip neighbors forward encrypted traffic:

  • Visible: Envelope size (fixed), rotating rtags, timing patterns
  • Hidden: Content, ultimate sender/receiver, rtag-to-identity mapping
  • Violation: disclosure of plaintext content or protected endpoint identity

5.3 Group Boundary

Group participants share group-scoped information:

  • Visible: Member identities (within group), group content, threshold operations
  • Hidden: Member identities outside group, other group memberships
  • Violation: disclosure of unrelated group membership through group participation

5.4 External Boundary

External observers have no relationship with you:

  • In privacy mode: protected Aura traffic patterns and timing are visible
  • In passthrough mode: direct Aura connectivity is visible
  • Violation: treating basic availability deployment as a stronger privacy claim

5.5 Retrieval Boundary

Retrieval is not identity-addressed at the network boundary.

  • Visible to intermediaries: selector traffic shape and reply-path usage
  • Hidden from intermediaries: mailbox identity, semantic object meaning, and direct reverse identity
  • Violation: identity-addressed retrieval on parity-critical paths

5.6 Custody Boundary

Hold providers operate on opaque custody objects rather than authoritative truth.

  • Visible to the holder: bounded retention requests, opaque held objects, selector usage, and storage pressure
  • Hidden from the holder under onion routing: specific depositor identity and mailbox identity
  • Violation: treating custody state as authoritative truth or varying retention by social distance

6. Time Domain Semantics

Time handling affects privacy through leakage:

Variant | Purpose | Privacy Impact
PhysicalClock | Guard charging, receipts, cooldowns | Leaks wall-clock via receipts
LogicalClock | CRDT causality, journal ordering | No wall-clock leakage
OrderClock | Privacy-preserving total order | Opaque tokens (no temporal meaning)
Range | Validity windows, disputes | Bounded uncertainty from physical

  • Cross-domain time comparison must be explicit.
  • Privacy-preserving flows must not expose physical time unless that exposure is part of the contract.

7. Adversarial Model

7.1 Direct Relationship

A party in a direct relationship may observe the full contents of that relationship context by consent.

  • Must not: learn undisclosed contexts or link identity across contexts
  • Contract boundary: relationship consent does not widen to other contexts

7.2 Group Insider

A group member may observe group-scoped activity by consent.

  • Must not: learn other group memberships or unrelated identities
  • Contract boundary: group visibility remains group-scoped

7.3 Gossip Neighbor

Devices forwarding traffic may observe protected metadata.

  • Must not: decrypt content, identify protected endpoints, or map routing tags to identity
  • Contract boundary: neighbor visibility remains budgeted metadata only

7.4 Network Observer

A network observer may observe connectivity and timing patterns.

  • Must not: gain stronger privacy guarantees than the active deployment mode provides
  • Contract boundary: privacy claims vary with the active protection mode

7.5 Compromised Device

A compromised device may reveal its local share and replicated state.

  • Must not: unilaterally satisfy threshold requirements or derive protected root material
  • Contract boundary: single-device compromise does not imply full authority compromise

8. External Observer Limits

Stronger privacy claims against external observers require sufficiently regular protected network behavior.

Basic availability deployments do not claim those stronger bounds. Routing and budgeting must also limit metadata concentration at any single relay or hub.

9. Required Properties

9.1 Identity and Key Separation

  • Contexts must not share reusable identity material.
  • Key derivation must remain domain-separated.
  • Keys must not be reused across contexts.

9.2 Transport Privacy

  • Protected transport must use authenticated encrypted envelopes.
  • Transport observables must remain budgeted by observer class.
  • Accountability return paths must not require direct reverse identity.

9.3 Send Authorization

  • No transport observable may occur without prior local authorization and accounting.
  • Failed authorization or charging must remain local.
  • Forwarding must validate the required receipt or accountability state before onward transport.

9.4 Retrieval and Custody

  • Parity-critical retrieval must use bounded retrieval authority.
  • Parity-critical retrieval must not use identity-addressed mailbox polling.
  • Hold custody must remain opaque and non-authoritative.
  • Uniform retention policy must not vary by social distance.
  • Applications that require guaranteed durability must use authoritative replicated state rather than Hold.

9.5 Accountability and Local Consequences

  • Accountability evidence must be verified by the relevant local verifier before any local consequence is applied.
  • Local scoring, reciprocal budget, and admission effects apply only after successful verification.
  • Accountability traffic must not become a new global visibility layer.

9.6 Secret Material and Error Channels

  • Secret material must not be stored in plaintext.
  • Guard failures must return bounded, typed errors only.
  • Error payloads must not include raw context payload, peer identity material, or decrypted content.
  • Remote peers must not infer internal failure causes beyond allowed protocol-level status codes.
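The error-channel requirements can be sketched as a closed error enum plus a projection to the status a remote peer is allowed to see. The names `GuardError` and `remote_status` are illustrative:

```rust
/// Illustrative bounded error enum: a closed set of variants, none of
/// which carries raw payload bytes, peer identity material, or
/// decrypted content.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum GuardError {
    Unauthorized,
    BudgetExhausted,
    InvalidReceipt,
}

/// What a remote peer may observe: a coarse protocol-level status,
/// never the internal guard-stage diagnostic.
fn remote_status(err: GuardError) -> &'static str {
    match err {
        // All guard failures collapse to one opaque remote outcome.
        GuardError::Unauthorized
        | GuardError::BudgetExhausted
        | GuardError::InvalidReceipt => "rejected",
    }
}
```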

10. References

Distributed Systems Contract covers safety, liveness, and consistency.

Theoretical Model covers the formal calculus and semilattice laws.

System Architecture describes runtime layering and the guard chain.

Authorization covers authorization, budgeting, and Biscuit integration.

Transport and Information Flow documents transport semantics and receipts.

Relational Contexts documents cross-authority state and context isolation.

Verification Coverage Report tracks formal verification status.

Distributed Systems Contract

This contract specifies Aura's distributed systems model. It defines the safety, liveness, and consistency guarantees provided by the architecture. It also documents the synchrony assumptions and adversarial capabilities the system tolerates.

This contract complements Privacy and Information Flow Contract, which focuses on metadata and privacy budgets. Together these contracts define the full set of invariants protocol authors must respect.

Formal verification of these properties uses Quint model checking (verification/quint/) and Lean 4 theorem proofs (verification/lean/). See Verification Coverage Report for current status.

1. Scope

The contract applies to the following aspects of the system.

Effect handlers and protocols operate within the 8-layer architecture described in System Architecture. Journals and reducers are covered by this contract. The journal specification appears in Authority and Identity and Journal. Aura Consensus is documented in Consensus.

Relational contexts and rendezvous flows fall under this contract. Relational contexts are specified in Relational Contexts. Transport semantics appear in Transport and Information Flow. Rendezvous flows are detailed in Rendezvous Architecture. Shared notation appears in Theoretical Model.

1.1 Terminology Alignment

This contract uses shared terminology from Theoretical Model.

  • Consensus role terms: witness for consensus attestation, signer for FROST share holders
  • Social-role terms: Member, Participant, Moderator
  • Access terms: AccessLevel (Full, Partial, Limited)
  • Topology terms: 1-hop and n-hop paths

1.2 Contract Vocabulary

  • authoritative: part of replicated truth and subject to convergence requirements
  • protocol object: an execution object that supports transport or coordination without becoming replicated truth
  • runtime-local: state owned by a local runtime and not treated as authoritative
  • accountability witness: bounded evidence that a service action occurred
  • custody failure: loss, eviction, or retrieval miss for a non-authoritative held object

1.3 Assumptions

  • Cryptographic primitives remain secure at configured parameters.
  • Partial synchrony eventually holds after GST, with bounded Δ_net.
  • Honest participants execute the guard chain in the required order.
  • Anti-entropy exchange eventually delivers missing facts to connected peers.

1.4 Non-goals

  • This contract does not provide global linearizability across all operations.
  • This contract does not guarantee progress during permanent partitions.
  • This contract does not guarantee metadata secrecy without privacy controls defined in Privacy and Information Flow Contract.
  • This contract does not guarantee that runtime-local caches reflect authoritative truth at all times.
  • This contract does not guarantee durable custody from Hold services.

1.5 Service Object Classes

Distributed behavior depends on three object classes with different contracts:

  • authoritative shared objects
  • transport and protocol objects
  • runtime-derived local state

Only authoritative shared objects participate in replicated truth. Transport and protocol objects support execution. Runtime-derived local state remains non-authoritative and local to the runtime that owns it.

2. Safety Guarantees

2.1 Journal CRDT Properties

Facts merge via set union and never retract. Journals satisfy the semilattice laws:

  • Commutativity: merge(j1, j2) ≡ merge(j2, j1)
  • Associativity: merge(merge(j1, j2), j3) ≡ merge(j1, merge(j2, j3))
  • Idempotence: merge(j, j) ≡ j

Reduction is deterministic. Identical fact sets produce identical states. No two facts share the same nonce within a namespace (InvariantNonceUnique).
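The semilattice laws can be checked directly on a toy journal modeled as a fact set merged by union. This is a simplification for illustration; real facts are structured, not strings:

```rust
use std::collections::BTreeSet;

/// Illustrative journal: an append-only set of facts merged by set union.
/// Facts never retract, so merge can only grow the set.
type Journal = BTreeSet<&'static str>;

fn merge(a: &Journal, b: &Journal) -> Journal {
    a.union(b).copied().collect()
}
```

Because merge is set union, commutativity, associativity, and idempotence hold by construction, and any replicas that observe the same fact set reduce to the same state.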

2.2 Charge-Before-Send

Every transport observable is preceded by local authorization, accounting, and fact-coupling checks. No packet is emitted without a successful local decision.

Flow budgets satisfy monotonicity: charging never increases available budget (monotonic_decrease). Charging the exact remaining amount results in zero budget (exact_charge).
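A minimal budget sketch showing both properties. `FlowBudget` is illustrative, not the runtime's actual type:

```rust
/// Illustrative flow budget within one active epoch.
struct FlowBudget {
    limit: u64,
    spent: u64,
}

impl FlowBudget {
    /// Charging never increases headroom (monotonic_decrease), and
    /// charging exactly the remainder leaves zero (exact_charge).
    /// An over-budget charge fails locally: nothing is emitted.
    fn charge(&mut self, amount: u64) -> bool {
        if self.spent + amount <= self.limit {
            self.spent += amount;
            true
        } else {
            false
        }
    }

    fn remaining(&self) -> u64 {
        self.limit - self.spent
    }
}
```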

2.3 Consensus Agreement

For any pair (cid, prestate_hash) there is at most one commit fact (InvariantUniqueCommitPerInstance). Fallback gossip plus FROST signatures prevent divergent commits. Byzantine witnesses cannot force multiple commits for the same instance.

Commits require threshold participation (InvariantCommitRequiresThreshold). Equivocating witnesses are excluded from threshold calculations (InvariantEquivocatorsExcluded).

2.3.1 Fault Assumptions

Consensus safety depends on declared threshold and fault bounds.

Threshold signatures require the configured threshold of distinct valid shares. Safety requires fault assumptions that remain within the bound declared for the active ceremony. Different ceremonies may declare different admissible fault bounds.

2.4 Evidence CRDT

The evidence system tracks votes and equivocations as a grow-only CRDT:

  • Monotonicity: Votes and equivocator sets only grow under merge
  • Commit preservation: merge preserves existing commit facts
  • Semilattice laws: Evidence merge is commutative, associative, and idempotent

2.5 Equivocation Detection

The system detects witnesses who vote for conflicting results:

  • Soundness: Detection only reports actual equivocation (no false positives)
  • Completeness: All equivocations are detectable given sufficient evidence
  • Honest safety: Honest witnesses are never falsely accused

Types like HasEquivocated and HasEquivocatedInSet exclude conflicting shares from consensus. See Consensus.

2.6 FROST Threshold Signatures

Threshold signatures satisfy binding and consistency properties:

  • Share binding: Shares are cryptographically bound to (consensus_id, result_id, prestate_hash)
  • Threshold requirement: Aggregation requires at least k shares from distinct signers
  • Session consistency: All shares in an aggregation have the same session
  • Determinism: Same shares always produce the same signature
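The threshold and session-consistency requirements amount to a pre-aggregation check along these lines. `Share` and `may_aggregate` are illustrative names, and the real binding covers `(consensus_id, result_id, prestate_hash)` rather than a bare session number:

```rust
use std::collections::BTreeSet;

/// Illustrative signature share: signer index plus the session it is
/// bound to (a stand-in for the full cryptographic binding).
struct Share {
    signer: u16,
    session: u64,
}

/// Aggregation requires at least `k` shares from distinct signers,
/// all belonging to the same session.
fn may_aggregate(shares: &[Share], k: usize) -> bool {
    let Some(first) = shares.first() else { return false };
    let same_session = shares.iter().all(|s| s.session == first.session);
    let distinct: BTreeSet<u16> = shares.iter().map(|s| s.signer).collect();
    same_session && distinct.len() >= k
}
```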

2.7 Context Isolation

Messages scoped to ContextId never leak into other contexts. Contexts may be explicitly bridged through typed protocols only. See Theoretical Model. Each authority maintains separate journals per context to enforce this isolation.

2.8 Transport Layer

Beyond context isolation, transport satisfies:

  • Flow budget non-negativity: Spent never exceeds limit (InvariantFlowBudgetNonNegative)
  • Sequence monotonicity: Message sequence numbers strictly increase (InvariantSequenceMonotonic)
  • Fact backing: Every sent message has a corresponding journal fact (InvariantSentMessagesHaveFacts)

2.9 Deterministic Reduction Order

Commitment tree operations resolve conflicts using the stable ordering described in Authority and Identity. This ordering is derived from the cryptographic identifiers and facts stored in the journal. Conflicts are always resolved in the same way across all replicas.

2.10 Receipt Chain

Multi-hop forwarding requires signed receipts. Downstream peers reject messages lacking a chain rooted in their relational context. See Transport and Information Flow. This prevents unauthorized message propagation.

2.11 Onion Accountability Verification

Onion-routed accountability must preserve anonymous reverse delivery of bounded witnesses.

Verifier roles are explicit and local. Local runtime consequences such as scoring, reciprocal budget, and admission preference apply only after verification succeeds.

Accountability return paths use typed single-use reply blocks. MoveReceiptReplyBlock, HoldDepositReplyBlock, HoldRetrievalReplyBlock, and HoldAuditReplyBlock are scoped proof-return capabilities. They are not generic reverse channels.

Witness traffic must return through the shared movement substrate when onion routing is active. Direct callback assumptions are not part of the privacy-mode contract.

2.12 Hold Service Profiles

Hold is a shared custody service surface. Profile-specific retrieval or retention semantics are allowed.

All Hold services must preserve the common custody invariants. Custody remains opaque, non-authoritative, selector-driven, and best-effort.

DeferredDeliveryHold and CacheReplicaHold are named profiles over one custody substrate. DeferredDeliveryHold uses retrieve-once semantics and re-deposit on miss. CacheReplicaHold uses retention-window-governed replica semantics.

Applications that need durable truth must use journals, consensus, or another authoritative replicated state path. Hold does not provide durable custody.
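The two profiles over one custody substrate can be sketched as a single retrievability rule. `HoldProfile` and `retrievable` are illustrative, and epochs stand in for whatever retention units the runtime actually uses:

```rust
/// Illustrative custody profiles over one substrate.
enum HoldProfile {
    /// Retrieve-once: the object is consumed on successful retrieval;
    /// a miss triggers re-deposit upstream.
    DeferredDelivery,
    /// Replica governed by a retention window (in epochs, here).
    CacheReplica { retention_epochs: u64 },
}

/// Whether a held object is still retrievable. Custody is best-effort:
/// `false` here is an expected miss, not a contract violation.
fn retrievable(profile: &HoldProfile, already_retrieved: bool, age_epochs: u64) -> bool {
    match profile {
        HoldProfile::DeferredDelivery => !already_retrieved,
        HoldProfile::CacheReplica { retention_epochs } => age_epochs < *retention_epochs,
    }
}
```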

3. Protocol-Specific Guarantees

3.1 DKG and Resharing

Distributed key generation and resharing satisfy:

  • Threshold bounds: 1 ≤ t ≤ n where t is threshold and n is participant count
  • Phase consistency: Commitment counts match protocol phase
  • Share timing: Shares distributed only after commitment verification

3.2 Invitation Flows

Invitation lifecycle satisfies authorization invariants:

  • Sender authority: Only sender can cancel an invitation
  • Receiver authority: Only receiver can accept or decline
  • Single resolution: No invitation resolved twice
  • Terminal immutability: Terminal status (accepted/declined/cancelled/expired) is permanent
  • Fact backing: Accepted invitations have corresponding journal facts
  • Ceremony gating: Ceremonies only initiated for accepted invitations
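Single resolution and terminal immutability together mean the lifecycle is a one-step state machine from Pending. A minimal sketch, with `InvitationStatus` and `transition` as illustrative names:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum InvitationStatus {
    Pending,
    Accepted,
    Declined,
    Cancelled,
    Expired,
}

/// Only a pending invitation may be resolved; every terminal status
/// (accepted/declined/cancelled/expired) is permanent.
fn transition(
    from: InvitationStatus,
    to: InvitationStatus,
) -> Result<InvitationStatus, &'static str> {
    match from {
        InvitationStatus::Pending => Ok(to),
        _ => Err("terminal status is immutable"),
    }
}
```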

3.3 Epoch Validity

Epochs enforce temporal boundaries:

  • Receipt validity window: Receipts only valid within their epoch
  • Replay prevention: Old epoch receipts cannot be replayed in new epochs

3.4 Cross-Protocol Safety

Concurrent protocol execution (e.g., Recovery∥Consensus) satisfies:

  • No deadlock: Interleaved execution always makes progress
  • Revocation enforcement: Revoked devices excluded from all protocols

4. Liveness Guarantees

4.1 Fast-Path Consensus

Fast-path consensus completes within bounded delay when the fast-path witness and network assumptions hold.

4.2 Fallback Consensus

Fallback consensus eventually completes under partial synchrony when the fallback quorum and delivery assumptions hold.

4.3 Anti-Entropy

Journals converge under eventual delivery. CRDT merges reconcile fact sets even after partitions.

4.4 Rendezvous

Offer and answer envelopes flood gossip neighborhoods. Secure channels can be established as long as at least one bidirectional path remains between parties. See Rendezvous Architecture.

4.5 Flow Budgets

Flow-budget progress is conditional: if future epoch state restores positive headroom and epoch updates converge, local budget enforcement eventually grants headroom again, so budget exhaustion remains temporary only under these assumptions.

Liveness requires that each authority eventually receives messages from its immediate neighbors (the eventual delivery assumption) and that clocks do not drift unboundedly, which epoch rotation and receipt expiry depend on.

4.6 Hold Availability

Hold availability is neighborhood-scoped and selector-driven. It is not a guarantee that any specific holder remains available.

Liveness for Hold requires that the runtime can find some admissible holder within the neighborhood-scoped provider set. Retrieval miss and re-deposit are expected recovery behaviors. They are not contract violations by themselves.

The runtime may choose a bounded rotating subset of holders inside the wider neighborhood-scoped provider set. This bounded operational set does not narrow the interface scope. Retention treatment must remain uniform across neighborhood deposits.

5. Time System

Aura uses a unified TimeStamp with domain-specific comparison:

  • Reflexivity: compare(policy, t, t) = eq
  • Transitivity: compare(policy, a, b) = lt ∧ compare(policy, b, c) = lt → compare(policy, a, c) = lt
  • Privacy: Physical time hidden when ignorePhysical = true

Time variants include PhysicalClock (wall time), LogicalClock (vector/Lamport), OrderClock (opaque ordering tokens), and Range (validity windows).
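A simplified sketch of domain-aware comparison, covering two of the variants. The shape is illustrative and not the actual `TimeStamp` API; the point is that cross-domain comparison is refused rather than guessed, and that `ignore_physical` suppresses wall-clock ordering entirely:

```rust
use std::cmp::Ordering;

/// Illustrative timestamp variants (a simplification of the unified TimeStamp).
enum TimeStamp {
    Physical(u64), // wall-clock milliseconds
    Logical(u64),  // Lamport counter
}

/// Domain-aware comparison. Returns None when no ordering may be exposed:
/// either the domains differ, or physical time is being ignored by policy.
fn compare(ignore_physical: bool, a: &TimeStamp, b: &TimeStamp) -> Option<Ordering> {
    match (a, b) {
        (TimeStamp::Physical(x), TimeStamp::Physical(y)) => {
            if ignore_physical { None } else { Some(x.cmp(y)) }
        }
        (TimeStamp::Logical(x), TimeStamp::Logical(y)) => Some(x.cmp(y)),
        // Cross-domain comparison must be explicit; it is refused here.
        _ => None,
    }
}
```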

6. Synchrony and Timing Model

Aura assumes partial synchrony. There exists a bound Δ_net on message delay and processing time once the network stabilizes. This bound may be unknown before stabilization occurs.

Before stabilization, progress may stall. After stabilization, protocols that depend on eventual delivery and bounded delay may resume progress.

Epoch rotation relies on loosely synchronized clocks. The journal remains the source of truth for observed epoch state.

7. Adversarial Model

7.1 Network Adversary

A network adversary may delay or drop traffic and may control a subset of links.

  • Must not: break cryptography or forge valid accountability material without the required local authorization and signatures
  • Contract boundary: safety holds only within the declared fault bounds for the active ceremony

7.2 Byzantine Witness

A Byzantine witness may equivocate or refuse to participate.

  • Must not: cause multiple commits for the same consensus instance while the declared fault assumptions hold
  • Contract boundary: equivocation is detectable and excluded from valid threshold outcomes

7.3 Malicious Relay

A malicious relay may drop, delay, or refuse to forward envelopes.

  • Must not: read protected payload content or forge valid forwarding accountability
  • Contract boundary: relay failure may reduce liveness but must not violate transport safety or accountability rules

7.4 Malicious Hold Provider

A malicious hold provider may evict, refuse retrieval, or under-serve after accepting custody.

  • Must not: turn custody objects into authoritative state or obtain legitimate service credit without successful witness verification
  • Contract boundary: custody failure affects availability, not authoritative truth

7.5 Device Compromise

A compromised device may reveal its local share and journal copy.

  • Must not: satisfy threshold requirements on its own or rewrite authoritative history unilaterally
  • Contract boundary: device compromise is recoverable if the remaining threshold and recovery assumptions still hold

8. Consistency Model

Surface | Required Guarantee | Explicit Non-guarantee
Journal | Replicas that observe the same fact set converge to the same state | Immediate global agreement
Consensus-scoped operation | Honest replicas agree on the committed result for that operation | A single global linearizable log
Transport | Transport safety does not depend on causal delivery ordering | Global transport-level causal order
Local replica view | Locally observed facts remain monotone | Instant reflection of all remote facts

9. Failure Handling

Failure classes are distinct:

  • authoritative state failure
  • custody failure
  • runtime-local cache or selection failure

Allowed failure:

  • temporary loss of progress during partition or instability
  • custody miss, eviction, or re-deposit
  • runtime-local cache invalidation or reselection

Forbidden failure:

  • divergent authoritative truth for the same committed operation
  • treating custody state as authoritative replicated truth
  • exposing internal failure causes through remote error detail

Local-only failure:

  • authorization failure
  • budget exhaustion
  • runtime-local cache or selection failure

9.1 Error-Channel Privacy Requirements

  • Runtime errors must use bounded enums and redacted payloads.
  • Error paths must not include plaintext message content, raw capability tokens, or cross-context identifiers.
  • Remote peers may observe protocol-level status outcomes only, not internal guard-stage diagnostics.

10. References

System Architecture describes runtime layering.

Authorization describes authorization and budgeting ordering.

Theoretical Model covers the formal calculus and semilattice laws.

Authority and Identity documents reduction ordering.

Journal and Distributed Maintenance Architecture cover fact storage and convergence.

Relational Contexts documents cross-authority state.

Consensus describes fast path and fallback consensus.

Transport and Information Flow documents transport semantics.

Verification Coverage Report tracks formal verification status.

Cryptography

This document describes the cryptographic architecture in Aura. It defines layer responsibilities, code organization patterns, security invariants, and compliance requirements for cryptographic operations.

1. Overview

Aura's cryptographic architecture follows the 8-layer system design with strict separation of concerns.

  • Layer 1 (aura-core): Type wrappers, trait definitions, pure functions
  • Layer 3 (aura-effects): Production implementations with real crypto libraries
  • Layer 8 (aura-testkit): Mock implementations for deterministic testing

This separation ensures that cryptographic operations are auditable, testable, and maintainable. Security review focuses on a small number of files rather than scattered usage throughout the codebase.

2. Layer Responsibilities

2.1 Layer 1: aura-core

The aura-core crate provides cryptographic foundations without direct side effects.

Type Wrappers

Type wrappers live in crates/aura-core/src/crypto/ed25519.rs.

#![allow(unused)]
fn main() {
pub struct Ed25519SigningKey(pub [u8; 32]);
pub struct Ed25519VerifyingKey(pub [u8; 32]);
pub struct Ed25519Signature(pub [u8; 64]);
}

These wrappers use fixed-size arrays and delegate to ed25519_dalek internally. They expose a stable API independent of the underlying library, enable future algorithm migration without changing application code, and provide type safety across crate boundaries.

Effect Traits

Effect trait definitions live in crates/aura-core/src/effects/. The CryptoCoreEffects trait inherits from RandomCoreEffects and provides core cryptographic operations.

#![allow(unused)]
fn main() {
#[async_trait]
pub trait CryptoCoreEffects: RandomCoreEffects + Send + Sync {
    // Key derivation
    async fn kdf_derive(&self, ikm: &[u8], salt: &[u8], info: &[u8], output_len: u32) -> Result<Vec<u8>, CryptoError>;
    async fn derive_key(&self, master_key: &[u8], context: &KeyDerivationContext) -> Result<Vec<u8>, CryptoError>;

    // Ed25519 signatures
    async fn ed25519_generate_keypair(&self) -> Result<(Vec<u8>, Vec<u8>), CryptoError>;
    async fn ed25519_sign(&self, message: &[u8], private_key: &[u8]) -> Result<Vec<u8>, CryptoError>;
    async fn ed25519_verify(&self, message: &[u8], signature: &[u8], public_key: &[u8]) -> Result<bool, CryptoError>;

    // Utility methods
    fn is_simulated(&self) -> bool;
    fn crypto_capabilities(&self) -> Vec<String>;
    fn constant_time_eq(&self, a: &[u8], b: &[u8]) -> bool;
    fn secure_zero(&self, data: &mut [u8]);
}
}

The CryptoExtendedEffects trait provides additional operations with default implementations that return errors:

#![allow(unused)]
fn main() {
#[async_trait]
pub trait CryptoExtendedEffects: CryptoCoreEffects + Send + Sync {
    // Unified signing API
    async fn generate_signing_keys(&self, threshold: u16, max_signers: u16) -> Result<SigningKeyGenResult, CryptoError>;
    async fn generate_signing_keys_with(&self, method: KeyGenerationMethod, threshold: u16, max_signers: u16) -> Result<SigningKeyGenResult, CryptoError>;
    async fn sign_with_key(&self, message: &[u8], key_package: &[u8], mode: SigningMode) -> Result<Vec<u8>, CryptoError>;
    async fn verify_signature(&self, message: &[u8], signature: &[u8], public_key_package: &[u8], mode: SigningMode) -> Result<bool, CryptoError>;

    // FROST threshold signatures
    async fn frost_generate_keys(&self, threshold: u16, max_signers: u16) -> Result<FrostKeyGenResult, CryptoError>;
    async fn frost_generate_nonces(&self, key_package: &[u8]) -> Result<Vec<u8>, CryptoError>;
    async fn frost_create_signing_package(&self, message: &[u8], nonces: &[Vec<u8>], participants: &[u16], public_key_package: &[u8]) -> Result<FrostSigningPackage, CryptoError>;
    async fn frost_sign_share(&self, signing_package: &FrostSigningPackage, key_share: &[u8], nonces: &[u8]) -> Result<Vec<u8>, CryptoError>;
    async fn frost_aggregate_signatures(&self, signing_package: &FrostSigningPackage, signature_shares: &[Vec<u8>]) -> Result<Vec<u8>, CryptoError>;
    async fn frost_verify(&self, message: &[u8], signature: &[u8], group_public_key: &[u8]) -> Result<bool, CryptoError>;
    async fn ed25519_public_key(&self, private_key: &[u8]) -> Result<Vec<u8>, CryptoError>;

    // Symmetric encryption
    async fn chacha20_encrypt(&self, plaintext: &[u8], key: &[u8; 32], nonce: &[u8; 12]) -> Result<Vec<u8>, CryptoError>;
    async fn chacha20_decrypt(&self, ciphertext: &[u8], key: &[u8; 32], nonce: &[u8; 12]) -> Result<Vec<u8>, CryptoError>;
    async fn aes_gcm_encrypt(&self, plaintext: &[u8], key: &[u8; 32], nonce: &[u8; 12]) -> Result<Vec<u8>, CryptoError>;
    async fn aes_gcm_decrypt(&self, ciphertext: &[u8], key: &[u8; 32], nonce: &[u8; 12]) -> Result<Vec<u8>, CryptoError>;

    // Key rotation and conversion
    async fn frost_rotate_keys(&self, old_shares: &[Vec<u8>], old_threshold: u16, new_threshold: u16, new_max_signers: u16) -> Result<FrostKeyGenResult, CryptoError>;
    async fn convert_ed25519_to_x25519_public(&self, ed25519_public_key: &[u8]) -> Result<[u8; 32], CryptoError>;
    async fn convert_ed25519_to_x25519_private(&self, ed25519_private_key: &[u8]) -> Result<[u8; 32], CryptoError>;
}

pub trait CryptoEffects: CryptoCoreEffects + CryptoExtendedEffects {}
}

The core trait provides key derivation and Ed25519 signatures. The extended trait provides unified signing that routes between single-signer and threshold modes, FROST threshold operations, symmetric encryption, and key conversion. Hashing is not included because it is a pure operation. Use aura_core::hash::hash() for synchronous hashing instead.

The RandomCoreEffects trait provides cryptographically secure random number generation.

#![allow(unused)]
fn main() {
#[async_trait]
pub trait RandomCoreEffects: Send + Sync {
    async fn random_bytes(&self, len: usize) -> Vec<u8>;
    async fn random_bytes_32(&self) -> [u8; 32];
    async fn random_u64(&self) -> u64;
}

#[async_trait]
pub trait RandomExtendedEffects: RandomCoreEffects + Send + Sync {
    async fn random_range(&self, min: u64, max: u64) -> u64;
    async fn random_uuid(&self) -> Uuid;
}
}

The core trait provides basic random generation. The extended trait adds range and UUID generation with default implementations. All randomness flows through these traits for testability and simulation.

Pure functions in crates/aura-core/src/crypto/ implement hash functions, signature verification, and other deterministic operations. These require no side effects and can be called directly.

2.2 Layer 3: aura-effects

The aura-effects crate contains the only production implementations that directly use cryptographic libraries.

The production handler lives in crates/aura-effects/src/crypto.rs. RealCryptoHandler can operate with OS entropy (production) or with a seed (deterministic testing). It implements all methods from CryptoCoreEffects and RandomCoreEffects.

The following direct imports are allowed in Layer 3:

  • ed25519_dalek
  • frost_ed25519
  • chacha20poly1305
  • aes_gcm
  • getrandom
  • rand_core::OsRng
  • rand_chacha
  • blake3

2.3 Threshold Lifecycle (K1/K2/K3) and Transcript Binding

Aura separates key generation from agreement/finality:

  • K1: Single-signer (Ed25519). No DKG required.
  • K2: Dealer-based DKG. A trusted coordinator produces dealer packages.
  • K3: Consensus-finalized DKG. The BFT-DKG transcript is finalized by consensus.

Transcript hashing uses the following rules:

  • All DKG transcripts are hashed using canonical DAG‑CBOR encoding.
  • DkgTranscriptCommit binds transcript_hash, prestate_hash, and operation_hash.

Dealer packages (K2) follow these rules:

  • Deterministic dealer packages are acceptable in trusted settings.
  • Dealer packages must include encrypted shares for every participant.

BFT‑DKG (K3) follows these rules:

  • A transcript is only usable once consensus finalizes the commit fact.
  • All K3 ceremonies must reference the finalized transcript (hash or blob ref).

2.4 Layer 8: aura-testkit

The aura-testkit crate provides mock implementations for deterministic testing.

The mock handler lives in crates/aura-testkit/src/stateful_effects/crypto.rs. MockCryptoHandler uses a seed and counter for deterministic behavior, enabling reproducible test results, simulation of edge cases, and faster test execution.

3. Usage Boundary

Application code accesses cryptographic operations exclusively through effect traits. Direct imports of cryptographic libraries are forbidden outside Layer 3 handlers. Randomness flows through RandomCoreEffects for testability and simulation. See Effects and Handlers Guide for usage patterns and anti-patterns.

4. Allowed Locations

Direct cryptographic library usage is restricted to the following locations.

Location | Allowed Libraries | Purpose
aura-core/src/crypto/* | ed25519_dalek, frost_ed25519 | Type wrappers
aura-core/src/types/authority.rs | ed25519_dalek | Authority trait types
aura-effects/src/* | All crypto libs | Production handlers
aura-effects/src/noise.rs | snow | Noise Protocol implementation
aura-testkit/* | All crypto libs | Test infrastructure
**/tests/*, *_test.rs | OsRng | Test-only randomness
#[cfg(test)] modules | OsRng | Test-only randomness

5. Security Invariants

The cryptographic architecture maintains these invariants.

  1. All production crypto operations flow through RealCryptoHandler
  2. Security review focuses on Layer 3 handlers, not scattered usage
  3. All crypto is controllable via mock handlers for testing
  4. Private keys remain in wrapper types, not exposed as raw bytes
  5. Production randomness comes from OS entropy via OsRng

6. Signing Modes

Aura supports two signing modes to handle different account configurations.

6.1 SigningMode Enum

pub enum SigningMode {
    SingleSigner,  // Standard Ed25519 for 1-of-1
    Threshold,     // FROST for m-of-n where m >= 2
}

The SingleSigner mode is used for new user onboarding with single device accounts. It is also used for bootstrap scenarios before multi-device setup and for simple personal accounts that do not need threshold security.

The Threshold mode is used for multi-device accounts such as 2-of-3 or 3-of-5 configurations. It is also used for guardian-protected accounts and group decisions requiring multiple approvals.

6.2 Why Two Modes?

FROST requires at least 2 signers. For 1-of-1 configurations, we use standard Ed25519.

Ed25519 uses the same curve as FROST and produces compatible signatures for verification. Single signatures do not require nonce coordination or aggregation.

6.3 API Usage

The unified signing API selects between SingleSigner and Threshold modes based on the threshold parameter. For implementation patterns, see Effects and Handlers Guide.
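The mode selection rule described above can be sketched as follows. The function name `select_signing_mode` and its error shape are hypothetical; the actual API surface lives behind the unified signing API.

```rust
/// Mirrors the SigningMode enum from this section.
#[derive(Debug, PartialEq)]
pub enum SigningMode {
    SingleSigner, // Standard Ed25519 for 1-of-1
    Threshold,    // FROST for m-of-n where m >= 2
}

/// Hypothetical selector: FROST requires at least 2 signers, so a
/// threshold of 1 routes to standard Ed25519.
pub fn select_signing_mode(threshold: u16, total: u16) -> Result<SigningMode, String> {
    if threshold == 0 || threshold > total {
        return Err(format!("invalid configuration: {threshold}-of-{total}"));
    }
    if threshold >= 2 {
        Ok(SigningMode::Threshold)
    } else {
        // threshold == 1: a single Ed25519 signature suffices
        Ok(SigningMode::SingleSigner)
    }
}
```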

6.4 Storage Separation

Single-signer and threshold keys use separate storage paths managed by SecureStorageEffects. Path conventions are documented in Effects and Handlers Guide.

7. FROST and Threshold Signatures

Aura provides a unified threshold signing architecture for all scenarios requiring m-of-n signatures where m >= 2.

7.1 Architecture Layers

The trait definition lives in aura-core/src/effects/threshold.rs.

#[async_trait]
pub trait ThresholdSigningEffects: Send + Sync {
    async fn bootstrap_authority(&self, authority: &AuthorityId) -> Result<PublicKeyPackage, ThresholdSigningError>;
    async fn sign(&self, context: SigningContext) -> Result<ThresholdSignature, ThresholdSigningError>;
    async fn threshold_config(&self, authority: &AuthorityId) -> Option<ThresholdConfig>;
    async fn threshold_state(&self, authority: &AuthorityId) -> Option<ThresholdState>;
    async fn has_signing_capability(&self, authority: &AuthorityId) -> bool;
    async fn public_key_package(&self, authority: &AuthorityId) -> Option<PublicKeyPackage>;
    async fn rotate_keys(
        &self,
        authority: &AuthorityId,
        new_threshold: u16,
        new_total_participants: u16,
        participants: &[ParticipantIdentity],
    ) -> Result<(u64, Vec<Vec<u8>>, PublicKeyPackage), ThresholdSigningError>;
    async fn commit_key_rotation(&self, authority: &AuthorityId, new_epoch: u64) -> Result<(), ThresholdSigningError>;
    async fn rollback_key_rotation(&self, authority: &AuthorityId, failed_epoch: u64) -> Result<(), ThresholdSigningError>;
}

The trait provides methods for bootstrapping authorities, signing operations, querying configurations and state, checking capabilities, and key rotation lifecycle management.

Context types live in aura-core/src/threshold/context.rs.

pub struct SigningContext {
    pub authority: AuthorityId,
    pub operation: SignableOperation,
    pub approval_context: ApprovalContext,
}

pub enum SignableOperation {
    TreeOp(TreeOp),
    RecoveryApproval { target: AuthorityId, new_root: TreeCommitment },
    GroupProposal { group: AuthorityId, action: GroupAction },
    Message { domain: String, payload: Vec<u8> },
    OTAActivation {
        ceremony_id: [u8; 32],
        upgrade_hash: [u8; 32],
        prestate_hash: [u8; 32],
        activation_epoch: Epoch,
        ready: bool,
    },
}

pub enum ApprovalContext {
    SelfOperation,
    RecoveryAssistance { recovering: AuthorityId, session_id: String },
    GroupDecision { group: AuthorityId, proposal_id: String },
    ElevatedOperation { operation_type: String, value_context: Option<String> },
}

The SignableOperation enum defines what is being signed. The OTAActivation variant represents scoped activation approval evidence; its activation_epoch field is meaningful only when the chosen scope actually owns an epoch fence. The ApprovalContext enum provides context for audit and display purposes.

The service implementation lives in aura-agent/src/runtime/services/threshold_signing.rs. ThresholdSigningService manages per-authority signing state and key storage using SecureStorageEffects for key material persistence.

Low-level primitives live in aura-core/src/crypto/tree_signing.rs. This module defines FROST types and pure coordination logic. It re-exports frost_ed25519 types for type safety.

The handler in aura-effects/src/crypto.rs implements FROST key generation and signing. This is the only location with direct frost_ed25519 library calls.

7.2 Serialized Size Invariants (FROST)

Aura treats the postcard serialization of FROST round-one data as canonical and fixed-size. This prevents malleability and makes invalid encodings unrepresentable at the boundary.

  • SigningNonces (secret) must serialize to exactly 138 bytes
  • SigningCommitments (public) must serialize to exactly 69 bytes

These sizes are enforced in aura-core/src/crypto/tree_signing.rs and mirrored in aura-core/src/constants.rs. See Distributed Maintenance Guide for update procedures when upstream encodings change.
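A boundary check over these invariants can be sketched as below. The constant names and the function are illustrative; the real enforcement lives in aura-core/src/crypto/tree_signing.rs and aura-core/src/constants.rs.

```rust
/// Canonical postcard sizes for FROST round-one data, per this section.
pub const SIGNING_NONCES_LEN: usize = 138;
pub const SIGNING_COMMITMENTS_LEN: usize = 69;

/// Hypothetical boundary check: reject any encoding that is not exactly
/// the canonical size before attempting deserialization.
pub fn check_round1_sizes(nonces: &[u8], commitments: &[u8]) -> Result<(), String> {
    if nonces.len() != SIGNING_NONCES_LEN {
        return Err(format!("SigningNonces: expected {SIGNING_NONCES_LEN} bytes, got {}", nonces.len()));
    }
    if commitments.len() != SIGNING_COMMITMENTS_LEN {
        return Err(format!("SigningCommitments: expected {SIGNING_COMMITMENTS_LEN} bytes, got {}", commitments.len()));
    }
    Ok(())
}
```

Because any non-canonical length is rejected before parsing, malformed encodings never reach the FROST deserializer.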

7.3 Lifecycle Taxonomy (Key Generation vs Agreement)

Aura separates key generation from agreement/finality:

  • K1: Local/Single-Signer (no DKG)
  • K2: Dealer-Based DKG (trusted coordinator)
  • K3: Quorum/BFT-DKG (consensus-finalized transcript)

Agreement modes are orthogonal:

  • A1: Provisional (usable immediately, not final)
  • A2: Coordinator Soft-Safe (bounded divergence + convergence cert)
  • A3: Consensus-Finalized (unique, durable, non-forkable)

Leader selection (lottery/round seed/fixed coordinator) and pipelining are orthogonal optimizations, not agreement modes.
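The two orthogonal axes can be modeled as independent enums. The type and function names here are illustrative; the policy encoded (durable shared state requires A3 finality) follows the taxonomy above.

```rust
/// Key generation methods and agreement modes as two independent axes.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum KeyGen { K1Local, K2Dealer, K3Bft }

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Agreement { A1Provisional, A2SoftSafe, A3Finalized }

/// Hypothetical policy check: durable shared state must be consensus-
/// finalized (A3), regardless of which key generation method was used.
pub fn durable_state_allowed(_keygen: KeyGen, agreement: Agreement) -> bool {
    agreement == Agreement::A3Finalized
}
```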

7.4 Usage Pattern

High-level signing operations use AppCore or direct ThresholdSigningEffects trait calls. See Effects and Handlers Guide for recommended patterns.

7.5 FROST Minimum Threshold

FROST requires threshold >= 2. Calling frost_generate_keys(1, 1) returns an error. For single-signer scenarios, use generate_signing_keys(1, 1) which routes to Ed25519 automatically.

8. Extensibility

The wrapper and trait abstraction enables algorithm migration and HSM integration without changing application code. Migration procedures are documented in Distributed Maintenance Guide.

See Also

Identifiers and Boundaries

This reference defines the identifiers that appear in Aura documents. Every other document should reuse these definitions instead of restating partial variants. Each identifier preserves structural privacy by design.

1. Authority Identifiers

| Identifier | Type | Purpose |
|---|---|---|
| AuthorityId | Uuid | Journal namespace for an authority. Does not leak operator or membership metadata. All public keys, commitment trees, and attested operations reduce under this namespace. Guardians are identified by their own AuthorityId. |
| DeviceId | Uuid | Device within a threshold account. Each device holds shares of the root key. Visible only inside the authority namespace. |
| AccountId | Uuid | Legacy identifier being replaced by AuthorityId. Exists for backward compatibility. |

2. Context Identifiers

| Identifier | Type | Purpose |
|---|---|---|
| ContextId | Uuid | Relational context or derived subcontext. Opaque on the wire, appears only inside encrypted envelopes and receipts. Never encodes participant lists or roles. Flow budgets and leakage metrics scope to (ContextId, peer) pairs. |
| SessionId | Uuid | Choreographic protocol execution instance. Pairs a ContextId with a nonce. Not long-lived. Expires when protocol completes or times out. |
| DkdContextId | { app_label: String, fingerprint: [u8; 32] } | Deterministic Key Derivation context. Combines application label with fingerprint to scope derived keys across application boundaries. |

3. Communication Identifiers

| Identifier | Type | Purpose |
|---|---|---|
| ChannelId | Hash32 | AMP messaging substream scoped under a relational context. Opaque. Does not reveal membership or topology. |
| MessageId | Uuid | Individual message within a channel. Scoped under a ChannelId. |
| RelayId | [u8; 32] | Pairwise communication context derived from X25519 keys. Foundation for RID message contexts. |
| GroupId | [u8; 32] | Threshold group communication context derived from group membership. Foundation for GID message contexts. |
| MessageContext | enum { Relay, Group, DkdContext } | Unifies the three privacy context types. Enforces mutual exclusivity. Cross-partition routing requires explicit bridge operations. |
| ConnectionId | Uuid | Network connection identifier with privacy-preserving properties. Does not encode endpoint information. |

4. Content Identifiers

| Identifier | Type | Purpose |
|---|---|---|
| ContentId | { hash: Hash32, size: Option<u64> } | Hash of canonical content bytes (files, documents, encrypted payloads, CRDT state). Any party can verify integrity by hashing and comparing. |
| ChunkId | { hash: Hash32, sequence: Option<u32> } | Storage-layer chunk identifier. Multiple chunks may comprise a single ContentId. Enables content-addressable storage with deduplication. |
| Hash32 | [u8; 32] | Raw 32-byte BLAKE3 cryptographic hash. Foundation for all content addressing. Provides collision and preimage resistance. |
| DataId | String | Stored data chunk identifier with type prefixes (data:uuid, encrypted:uuid). Enables heterogeneous storage addressing. |

5. Journal Identifiers

| Identifier | Type | Purpose |
|---|---|---|
| FactId | [u8; 32] | Content-addressed fact identifier derived from fact hash. Enables deduplication and integrity verification. |
| FactTypeId | String | Registered fact type discriminator. Each domain crate declares its own type IDs for schema-aware encoding and reduction. |
| FactOpId | [u8; 32] | Attested operation identifier within the journal. |
| EventId | Uuid | Event identifier within the effect API system. Used in audit logs and debugging. |
| OperationId | Uuid | Operation tracking identifier. |

6. Consensus Identifiers

| Identifier | Type | Purpose |
|---|---|---|
| ConsensusId | Hash32 | Consensus instance identifier derived from prestate hash, operation hash, and nonce. Binds operations to prestates through hash commitment. See Consensus. |
| FrostParticipantId | NonZeroU16 | Threshold signing participant. Must be non-zero for FROST protocol compatibility. Scoped to signing sessions. |

7. Social Topology Identifiers

| Identifier | Type | Purpose |
|---|---|---|
| HomeId | Hash32 | Home in the social topology. Each user resides in exactly one home. See Social Architecture. |
| NeighborhoodId | Hash32 | Neighborhood (collection of homes with 1-hop link relationships). Enables governance and traversal policies. |

8. Tree Identifiers

| Identifier | Type | Purpose |
|---|---|---|
| LeafId | u32 | Authority-internal leaf node in the commitment tree. Stable across tree modifications and epoch rotations. Used for internal topology, proofs, and key-rotation bookkeeping. See Authority and Identity. |
| ProposalId | Hash32 | Snapshot proposal identifier. Enables proposal deduplication and verification during tree operations. |

9. Ceremony and Recovery Identifiers

| Identifier | Type | Purpose |
|---|---|---|
| CeremonyId | String | Guardian ceremony identifier for key-rotation and enrollment flows. |
| RecoveryId | String | Recovery ceremony identifier for guardian-based recovery flows. |
| InvitationId | String | Invitation identifier for relationship and enrollment invites. |

10. Membership Identifiers

| Identifier | Type | Purpose |
|---|---|---|
| MemberId | String | Member identifier within groups or organizational structures. |
| IndividualId | String | Individual person or entity within the identity system. Can be derived from a DeviceId or DKD context. |
| RelationshipId | [u8; 32] | Identifies a specific relationship between two authorities. |

11. Accountability Structures

Receipt

Receipt is the accountability record emitted by FlowGuard. It contains context, source authority, destination authority, epoch, cost, nonce, chained hash, and signature. Receipts prove that upstream participants charged their budget before forwarding. No receipt includes device identifiers or user handles.

Fields: ctx: ContextId, src: AuthorityId, dst: AuthorityId, epoch: Epoch, cost: FlowCost, nonce: FlowNonce, prev: Hash32, sig: ReceiptSig.

See Transport and Information Flow for receipt propagation.

12. Derived Keys

Aura derives per-context cryptographic keys from reduced account state and ContextId. Derived keys never surface on the wire. They only exist inside effect handlers to encrypt payloads, generate commitment tree secrets, or run DKD.

The derivation inputs never include device identifiers. Derived keys inherit the privacy guarantees of AuthorityId and ContextId. The derivation function uses derive(account_root, app_id, context_label) and is deterministic but irreversible.
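The derivation shape described above can be sketched as follows. This is a minimal illustration of determinism and domain separation only: std's DefaultHasher stands in for the real cryptographic KDF, and the domain separator string is an assumption.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Sketch of derive(account_root, app_id, context_label). DefaultHasher is
/// NOT a cryptographic KDF; it only illustrates that the same inputs always
/// yield the same key and that inputs never include device identifiers.
pub fn derive(account_root: &[u8; 32], app_id: &str, context_label: &str) -> u64 {
    let mut h = DefaultHasher::new();
    b"aura/derived-key/v1".hash(&mut h); // domain separator (illustrative)
    account_root.hash(&mut h);
    app_id.hash(&mut h);
    context_label.hash(&mut h);
    h.finish()
}
```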

See Also

Authority and Identity describes the authority model in detail. Relational Contexts covers cross-authority relationships. Transport and Information Flow documents receipt chains and flow budgets. Social Architecture defines homes and neighborhoods.

Authority and Identity

This document describes the architecture of authorities and identity in Aura. It defines the authority model, account state machine, commitment tree structure, and relational identity model. Identity emerges through shared contexts rather than as a global property of keys.

1. Authority Model

An authority is a cryptographic actor represented by a public key. An authority hides its internal structure. An authority may contain one or more devices. An authority is the smallest unit that can sign facts or capabilities.

An authority has an internal journal namespace. The journal namespace stores facts relevant to that authority. The authority derives its state from deterministic reduction of that fact set.

Devices are not exclusive to a single authority. A single device may hold threshold shares for multiple authorities at the same time. Joining a new authority adds signing capability for that authority without removing any existing authority memberships.

1.1 Authority Types and Membership Terms

Aura uses three authority types:

  • User: individual authority
  • Home: group authority that accepts User authorities as members
  • Neighborhood: group authority that accepts Home authorities as members

Home membership terminology is:

  • Member: authority in the home's threshold authority set
  • Participant: authority granted access to the home without threshold membership
  • Moderator: optional designation granted to a member for moderation operations

AuthorityId (see Identifiers and Boundaries) selects the journal namespace associated with the authority. The identifier does not encode structure or membership. The authority publishes its current public key and root commitment inside its own journal.

Authorities can interact with other authorities through Relational Contexts. These interactions do not change the authority's internal structure. The authority remains isolated except where relational state is explicitly shared.

2. Account Authorities

An account authority is an authority with long-term state. An account maintains device membership through its commitment tree. An account contains its own journal namespace. An account evolves through attested operations stored as facts.

The commitment tree defines device membership and threshold policies. The journal stores facts that represent signed tree operations. The reduction function reconstructs the canonical tree state from the accumulated fact set.

An account authority exposes a single public key derived from the commitment tree root. The authority never exposes device structure. The account state changes only when an attested operation appears in the journal.

Aura supports multiple key generation methods for account authorities. K1 uses single-signer local key generation. K2 uses dealer-based DKG with a trusted coordinator. K3 uses quorum/BFT DKG with consensus-finalized transcripts. These are orthogonal to agreement modes (A1 provisional, A2 coordinator soft-safe, A3 consensus-finalized). Durable shared authority state must be finalized via A3.

pub trait Authority: Send + Sync {
    fn authority_id(&self) -> AuthorityId;
    fn public_key(&self) -> Ed25519VerifyingKey;
    fn root_commitment(&self) -> Hash32;
    async fn sign_operation(&self, operation: &[u8]) -> Result<Signature>;
    fn get_threshold(&self) -> u16;
    fn active_device_count(&self) -> usize;
}

The Authority trait provides the external interface for authority operations. The trait exposes only the public key, root commitment, and signing capabilities. The internal device structure remains hidden from external consumers.

An account authority derives context specific keys using deterministic key derivation. These derived authorities represent application scoped identities. See Effects and Handlers Guide for implementation examples.

3. Account State Machine

The TreeState structure represents the materialized state of an account at a specific epoch.

pub struct TreeState {
    pub epoch: Epoch,
    pub root_commitment: TreeHash32,
    pub branches: BTreeMap<NodeIndex, BranchNode>,
    pub leaves: BTreeMap<LeafId, LeafNode>,
    leaf_commitments: BTreeMap<LeafId, TreeHash32>,
    tree_topology: TreeTopology,
    branch_signing_keys: BTreeMap<NodeIndex, BranchSigningKey>,
}

The epoch field is monotonically increasing. The root_commitment is the hash of the entire tree structure. The branches map stores branch nodes by index. The leaves map stores leaf nodes by ID.

The leaf_commitments map caches leaf commitment hashes. The tree_topology tracks parent-child relationships. The branch_signing_keys map stores FROST group public keys for verification.

For the authority-internal reducer implementation (AuthorityTreeState), topology is explicitly materialized as ordered binary edges with parent pointers. Active leaves are canonically sorted by LeafId, paired in stable order, and assigned deterministic branch indices (NodeIndex(0) root, then breadth-first). This allows path-local commitment recomputation for non-structural updates while preserving deterministic replay semantics.
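The canonical sort-and-pair step described above can be sketched in isolation. The function name is hypothetical and branch index assignment (NodeIndex(0) root, breadth-first) is elided; only the stable pairing of sorted leaves is shown.

```rust
/// Sketch of deterministic leaf pairing for the authority-internal reducer:
/// active leaves are canonically sorted by LeafId, then paired in stable
/// order. An odd leaf count leaves the last leaf unpaired (None).
pub fn pair_leaves(mut leaf_ids: Vec<u32>) -> Vec<(u32, Option<u32>)> {
    leaf_ids.sort(); // canonical order by LeafId
    leaf_ids
        .chunks(2)
        .map(|pair| (pair[0], pair.get(1).copied()))
        .collect()
}
```

Because the input is sorted before pairing, every replica produces the same pairing regardless of the order in which leaves were observed.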

TreeState is derived state and is never stored directly in the journal. It is computed on-demand from the OpLog via the reduction function.

pub struct TreeStateSummary {
    epoch: Epoch,
    commitment: Hash32,
    threshold: u16,
    device_count: u32,
}

The TreeStateSummary provides a public view with only epoch, commitment, threshold, and device count. This summary hides internal device structure for external consumers while the full TreeState is used for internal operations.

4. Commitment Tree Structure

A commitment tree contains branch nodes and leaf nodes. A leaf node represents a device or guardian inside the account. A branch node represents a subpolicy with threshold requirements. The root node defines the account-level threshold policy.

Node Kinds

pub enum NodeKind {
    Leaf(LeafNode),
    Branch,
}

This type defines leaf and branch variants. The Leaf variant contains a LeafNode with device information. The Branch variant is a marker indicating an internal node.

Leaf and Branch Nodes

pub struct LeafNode {
    pub leaf_id: LeafId,
    pub device_id: DeviceId,
    pub role: LeafRole,
    pub public_key: LeafPublicKey,
    pub meta: LeafMetadata,
}

pub enum LeafRole {
    Device,
    Guardian,
}

The LeafNode structure stores device information required for threshold signing. The leaf_id is a stable identifier across tree modifications. The device_id identifies the device within the authority. The role distinguishes devices from guardians.

pub struct BranchNode {
    pub node: NodeIndex,
    pub policy: Policy,
    pub commitment: TreeHash32,
}

The BranchNode structure stores policy data for internal nodes. The node field is the branch index. The policy defines the threshold requirement. The commitment is the cryptographic hash of the branch structure.

pub struct BranchSigningKey {
    pub group_public_key: [u8; 32],
    pub key_epoch: Epoch,
}

A BranchSigningKey stores the FROST group public key for threshold signing at a branch node. The key_epoch tracks when the key was established via DKG. Signing keys are updated when membership changes under the branch or when policy changes affect the signing group.

5. Tree Topology

The TreeTopology structure tracks parent-child relationships for efficient navigation.

pub struct TreeTopology {
    parent_pointers: BTreeMap<NodeIndex, NodeIndex>,
    children_pointers: BTreeMap<NodeIndex, BTreeSet<NodeIndex>>,
    leaf_parents: BTreeMap<LeafId, NodeIndex>,
    root_node: Option<NodeIndex>,
}

The parent_pointers map links nodes to their parents. The children_pointers map links parents to their children. The leaf_parents map links leaves to their parent branches. The root_node tracks the root of the tree.

This structure enables efficient path-to-root traversal and affected node computation during commitment updates.
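The path-to-root traversal enabled by parent_pointers can be sketched as follows, assuming the topology is acyclic (as a tree is by construction). The standalone function and the `NodeIndex` type alias are illustrative.

```rust
use std::collections::BTreeMap;

/// Stand-in for the NodeIndex newtype.
pub type NodeIndex = u32;

/// Walk parent pointers from a node up to the root, returning the path
/// inclusive of the start node. Assumes the pointer map is acyclic;
/// the walk ends at the first node with no parent entry (the root).
pub fn path_to_root(parent_pointers: &BTreeMap<NodeIndex, NodeIndex>, start: NodeIndex) -> Vec<NodeIndex> {
    let mut path = vec![start];
    let mut current = start;
    while let Some(&parent) = parent_pointers.get(&current) {
        path.push(parent);
        current = parent;
    }
    path
}
```

This is the traversal used to compute the set of commitments affected by a leaf update: every node on the path to the root must be recomputed.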

graph TD
    R["Root Commitment<br/>(policy)"]
    L1["Branch<br/>(threshold)"]
    L2["Branch<br/>(threshold)"]
    A["Leaf A<br/>(device)"]
    B["Leaf B<br/>(device)"]
    C["Leaf C<br/>(guardian)"]
    D["Leaf D<br/>(device)"]

    R --> L1
    R --> L2
    L1 --> A
    L1 --> B
    L2 --> C
    L2 --> D

Each node commitment is computed over its ordered children plus its policy metadata. The root commitment is used in key derivation and verification. Leaves represent device shares. Branches represent threshold policies.

6. Policies

A branch node contains a threshold policy. A policy describes the number of required signatures for authorization.

pub enum Policy {
    Any,
    Threshold { m: u16, n: u16 },
    All,
}

The Any policy accepts one signature from any device under that branch. The Threshold policy requires m signatures out of n devices. The All policy requires all devices under the branch.

Policies form a meet semilattice where the meet operation selects the stricter of two policies.

impl Policy {
    pub fn required_signers(&self, child_count: usize) -> Result<u16, PolicyError> {
        match self {
            Policy::Any => Ok(1),
            Policy::All => Ok(child_count as u16),
            Policy::Threshold { m, n } => {
                if child_count as u16 != *n {
                    return Err(PolicyError::ChildCountMismatch {
                        expected: *n,
                        actual: child_count as u16,
                    });
                }
                Ok(*m)
            }
        }
    }
}

The required_signers method derives the concrete threshold from the policy given the current child count. It returns an error if the child count does not match the policy's expected total. This is used during signature verification to determine how many signers must have participated.
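The meet operation of the semilattice can be sketched as below. This is an assumption-laden illustration: it compares strictness by required signer count for a fixed child count, and presumes both policies apply to the same branch.

```rust
/// Mirrors the Policy enum from this section.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Policy {
    Any,
    Threshold { m: u16, n: u16 },
    All,
}

/// Sketch of the meet: for a branch with `child_count` children, the
/// stricter policy is the one requiring more signers. Ties keep `a`.
pub fn meet(a: Policy, b: Policy, child_count: u16) -> Policy {
    let required = |p: Policy| match p {
        Policy::Any => 1,
        Policy::All => child_count,
        Policy::Threshold { m, .. } => m,
    };
    if required(a) >= required(b) { a } else { b }
}
```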

7. Tree Operations

Tree operations modify the commitment tree. Each operation references a parent epoch and parent commitment. Each operation is signed through threshold signing.

pub enum TreeOpKind {
    AddLeaf { leaf: LeafNode, under: NodeIndex },
    RemoveLeaf { leaf: LeafId, reason: u8 },
    ChangePolicy { node: NodeIndex, new_policy: Policy },
    RotateEpoch { affected: Vec<NodeIndex> },
}

The AddLeaf operation inserts a new leaf under a branch. The RemoveLeaf operation removes an existing leaf with a reason code. The ChangePolicy operation updates the policy of a branch. The RotateEpoch operation increments the epoch for affected nodes and invalidates derived context keys.

pub struct TreeOp {
    pub parent_epoch: Epoch,
    pub parent_commitment: TreeHash32,
    pub op: TreeOpKind,
    pub version: u16,
}

The TreeOp structure binds an operation to its parent state. The parent_epoch and parent_commitment prevent replay attacks. The version field enables protocol upgrades.
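The parent binding can be checked with a few comparisons. The helper below is a hypothetical sketch of the refusal rule stated later in this document (devices refuse to sign under mismatched parent state); the operation payload is elided.

```rust
pub type Epoch = u64;
pub type TreeHash32 = [u8; 32];

/// Minimal TreeOp shape; the op payload and version handling are elided.
pub struct TreeOp {
    pub parent_epoch: Epoch,
    pub parent_commitment: TreeHash32,
}

/// Hypothetical pre-signing check: a device only signs an operation whose
/// parent binding matches its own local tree state, preventing replay of
/// an operation against a different prestate.
pub fn parent_binding_matches(op: &TreeOp, local_epoch: Epoch, local_commitment: &TreeHash32) -> bool {
    op.parent_epoch == local_epoch && &op.parent_commitment == local_commitment
}
```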

pub struct AttestedOp {
    pub op: TreeOp,
    pub agg_sig: Vec<u8>,
    pub signer_count: u16,
}

The agg_sig field stores the FROST aggregate signature. The signer_count records how many devices contributed. The signature validates under the parent root commitment. Devices refuse to sign if the local tree state does not match.

8. Tree Operation Verification

Tree operations use a two-phase verification model that separates cryptographic verification from state consistency checking.

8.1 Cryptographic Verification

The verify_attested_op function performs cryptographic signature checking only.

pub fn verify_attested_op(
    attested: &AttestedOp,
    signing_key: &BranchSigningKey,
    threshold: u16,
    current_epoch: Epoch,
) -> Result<(), VerificationError>;

Verification checks that the signer count meets the threshold requirement. It computes the binding message including the group public key. It verifies the FROST aggregate signature. Verification is self-contained and can be performed offline.

8.2 State Consistency Check

The check_attested_op function performs full verification plus TreeState consistency.

pub fn check_attested_op<S: TreeStateView>(
    state: &S,
    attested: &AttestedOp,
    target_node: NodeIndex,
) -> Result<(), CheckError>;

Check verifies the operation cryptographically. It ensures the signing key exists for the target node. It validates that the operation epoch and parent commitment match state.

8.3 TreeStateView Trait

pub trait TreeStateView {
    fn get_signing_key(&self, node: NodeIndex) -> Option<&BranchSigningKey>;
    fn get_policy(&self, node: NodeIndex) -> Option<&Policy>;
    fn child_count(&self, node: NodeIndex) -> usize;
    fn current_epoch(&self) -> Epoch;
    fn current_commitment(&self) -> TreeHash32;
}

This trait abstracts over TreeState for verification. It enables verification without a direct dependency on the journal crate.

8.4 Binding Message Security

pub fn compute_binding_message(
    attested: &AttestedOp,
    current_epoch: Epoch,
    group_public_key: &[u8; 32],
) -> Vec<u8>;

The binding message contains a domain separator, parent epoch and commitment for replay prevention, protocol version, current epoch, group public key, and serialized operation content. Including the group public key ensures signatures are bound to a specific signing group.
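The composition above can be sketched as a flat byte concatenation. The domain separator string and the little-endian framing here are illustrative assumptions, not the actual wire format; the field order follows the listing in the text.

```rust
/// Sketch of binding-message construction. Each field is appended in the
/// order named in the text: domain separator, parent binding, version,
/// current epoch, group public key, then the serialized operation.
pub fn compute_binding_message(
    parent_epoch: u64,
    parent_commitment: &[u8; 32],
    version: u16,
    current_epoch: u64,
    group_public_key: &[u8; 32],
    op_bytes: &[u8],
) -> Vec<u8> {
    let mut msg = Vec::new();
    msg.extend_from_slice(b"aura/tree-op/v1"); // domain separator (illustrative)
    msg.extend_from_slice(&parent_epoch.to_le_bytes());
    msg.extend_from_slice(parent_commitment);
    msg.extend_from_slice(&version.to_le_bytes());
    msg.extend_from_slice(&current_epoch.to_le_bytes());
    msg.extend_from_slice(group_public_key);
    msg.extend_from_slice(op_bytes);
    msg
}
```

Because the group public key is part of the message, a signature produced by one signing group cannot be replayed as a signature from another group over the same operation.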

8.5 Error Types

Verification produces two error categories. VerificationError covers cryptographic issues: missing signing keys, insufficient signers, signature failures, epoch mismatches, and parent commitment mismatches. CheckError covers state consistency issues: verification failures, key epoch mismatches, and missing nodes or policies.

9. Reduction and Conflict Resolution

The account journal is a join semilattice. It stores AttestedOp facts. All replicas merge fact sets using set union. The commitment tree state is recovered using deterministic reduction.

Reduction applies the following rules:

  1. Group operations by parent state using ParentKey (an epoch + commitment tuple).
  2. Select a single winner using a deterministic ordering based on operation hash.
  3. Discard superseded operations.
  4. Apply winners in parent epoch order.

Conflicts arise when multiple operations reference the same parent epoch and commitment. The reduction algorithm resolves conflicts using a total order on operations. The winning operation applies. Losing operations are ignored.

The reduction algorithm builds a DAG from operations, performs topological sort with tie-breaking, and applies operations sequentially to build the final TreeState.

Conflict resolution uses operation hash as the tie-breaker. When multiple operations share the same parent, they are sorted by hash and the maximum hash wins. This ensures deterministic winner selection across all replicas.
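A minimal sketch of this tie-breaking rule, with the 32-byte operation hash standing in for the real hash type:

```rust
/// Deterministic conflict resolution: among operations sharing the same
/// parent, sort candidates by operation hash; the maximum hash wins.
/// Returns None when there are no candidates.
pub fn resolve_conflict(mut candidates: Vec<[u8; 32]>) -> Option<[u8; 32]> {
    candidates.sort(); // total order on operation hashes
    candidates.pop()   // maximum hash wins
}
```

Since every replica sorts the same hash set the same way, all replicas select the same winner without coordination.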

graph TD
    OpLog["OpLog<br/>(CRDT OR-set of AttestedOp)"]
    Verify["verify_attested_op()"]
    Check["check_attested_op()"]
    Reduce["reduce()"]
    TreeState["TreeState<br/>(derived on-demand)"]

    OpLog -->|cryptographic| Verify
    Verify -->|state consistency| Check
    Check -->|valid ops| Reduce
    Reduce -->|derives| TreeState

This diagram shows the data flow from OpLog to TreeState. Operations are verified cryptographically first. Then they are checked for state consistency. Valid operations are reduced to produce the materialized TreeState.

10. Epochs and Derived Keys

The epoch is an integer stored in the tree state that scopes deterministic key derivation. Derived keys depend on the current epoch. Rotation invalidates previous derived keys. The RotateEpoch operation updates the epoch for selected subtrees.

Epochs also scope flow budgets and context presence tickets. All context identities must refresh when the epoch changes.

Derived context keys bind relationship data to the account state. The deterministic key derivation function uses the commitment tree root commitment and epoch. This ensures that all devices compute the same context keys. Derived keys do not modify the tree state.

11. Operators and Devices

An operator controls an authority by operating its devices. An operator is not represented in the protocol. Devices are internal to the authority and hold share material required for signing.

Devices produce partial signatures during threshold signing. The operator coordinates these partial signatures to produce the final signature.

The commitment tree manages device membership. The AddLeaf and RemoveLeaf operations modify device presence in the authority. Device identifiers do not appear outside the authority. No external party can link devices to authorities.

When a device is enrolled or replaced, Aura performs session delegation alongside tree updates. Runtime delegation transfers active protocol session ownership to the receiving device authority and records a SessionDelegation fact for auditability. The commitment tree update (AddLeaf/RemoveLeaf) remains the source of truth for membership, while delegation preserves in-flight protocol continuity during migration.

12. Relational Identity Model

Aura defines identity as contextual and relational. Identity exists only inside a specific relationship and does not exist globally. Authorities represent cryptographic actors rather than people. Identity emerges when two authorities form a shared context.

A shared context exists inside a relational context. A relational context stores relational facts that define how two authorities relate. Profile data may appear in a relational context if both authorities choose to share it. This profile data is scoped to that context.

ContextId (see Identifiers and Boundaries) identifies a relational context. It does not encode membership. It does not reveal which authorities participate. The context stores only the relational facts required by the participants.

Identity inside a context may include nickname suggestions or other profile attributes. These values are private to that context. See Identifiers and Boundaries for context isolation mechanisms. Nicknames (local mappings) allow a device to associate multiple authorities with a single local contact.

13. Authority Relationships

Authorities interact through relational contexts to create shared state. Relational contexts do not modify authority structure. Each relational context has its own journal. Facts in the relational context reference commitments of participating authorities.

Authorities may form long-lived or ephemeral relationships. These relationships do not affect global identity. The authority model ensures that each relationship remains isolated.

graph TD
    A[Authority A] --> C(Context)
    B[Authority B] --> C(Context)

This diagram shows two authorities interacting through a relational context. The context holds the relational facts that define the relationship. Neither authority exposes its internal structure to the other.

14. Privacy and Isolation

Authorities reveal no internal structure and contexts do not reveal participants. Identity exists only where authorities choose to share information. Nicknames remain local to devices. There is no global identifier for people or devices.

Every relationship is private to its participants. Each relationship forms its own identity layer. Authorities can operate in many contexts without linking those contexts together.

15. Interaction with Consensus

Consensus is used when a tree operation must have strong agreement across a committee. Consensus produces a commit fact containing a threshold signature. This fact becomes an attested operation in the journal.

Consensus is used when multiple devices must agree on the same prestate. Simple device-initiated changes may use local threshold signing. The account journal treats both cases identically.

Consensus references the root commitment and epoch of the account. This binds the commit fact to the current state.
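The binding described above can be sketched as a commit fact that carries the epoch and root commitment it was produced against, so replicas can reject commits that do not match their current state. The field names below are illustrative assumptions, not the actual aura-journal fact definition.

```rust
// Illustrative shape only: field names are hypothetical stand-ins for the
// real consensus commit fact.
#[derive(Debug, Clone, PartialEq)]
struct ConsensusCommitFact {
    epoch: u64,                   // account epoch the commit applies to
    root_commitment: [u8; 32],    // binds the fact to the current tree state
    threshold_signature: Vec<u8>, // committee threshold signature
}

// A replica accepts the commit only if it binds to the replica's own
// current epoch and root commitment.
fn binds_to_current_state(fact: &ConsensusCommitFact, epoch: u64, root: &[u8; 32]) -> bool {
    fact.epoch == epoch && &fact.root_commitment == root
}
```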

16. Security Properties

The commitment tree provides fork resistance. Devices refuse to sign under mismatched parent commitments. The reduction function ensures that all replicas converge. Structural opacity hides device membership from external parties.

The threshold signature scheme prevents unauthorized updates. All operations must be signed by the required number of devices. An attacker cannot forge signatures or bypass policies.

The tree design ensures that no external party can identify device structure. The only visible values are the epoch and the root commitment.

17. Implementation Architecture

17.1 Critical Invariants

The implementation enforces these rules.

  1. TreeState is never stored in the journal. It is always derived on-demand via reduction.
  2. OpLog is the only persisted tree data. All tree state can be recovered from the operation log.
  3. Reduction is deterministic across all replicas. The same OpLog always produces the same TreeState.
  4. DeviceId is authority-internal only. It is never exposed in public APIs.
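Invariants 2 and 3 can be sketched as a pure reduction over the operation log: the same log always yields the same derived state, and nothing else is persisted. The types below are simplified stand-ins, not the real aura-journal definitions.

```rust
// Minimal sketch: TreeState is never stored, only derived by deterministic
// reduction over the persisted OpLog. Types are illustrative stand-ins.
#[derive(Debug, Clone, PartialEq)]
enum TreeOp {
    AddLeaf(String),
    RemoveLeaf(String),
}

#[derive(Debug, Default, PartialEq)]
struct TreeState {
    leaves: Vec<String>,
}

// Deterministic: the same op log always produces the same TreeState.
fn reduce(op_log: &[TreeOp]) -> TreeState {
    let mut state = TreeState::default();
    for op in op_log {
        match op {
            TreeOp::AddLeaf(id) => {
                if !state.leaves.contains(id) {
                    state.leaves.push(id.clone());
                }
            }
            TreeOp::RemoveLeaf(id) => state.leaves.retain(|l| l != id),
        }
    }
    state
}
```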

17.2 Data Flow

graph TD
    A["Tree Operation Initiated"]
    B["TreeOperationProcessor validates"]
    C["Convert to AttestedOp fact"]
    D["Journal stores fact"]
    E["verify_attested_op (crypto)"]
    F["check_attested_op (state)"]
    G["reduce processes valid facts"]
    H["apply_operation builds TreeState"]
    I["TreeState materialized"]

    A --> B
    B --> C
    C --> D
    D --> E
    E --> F
    F --> G
    G --> H
    H --> I

This diagram shows the complete lifecycle of a tree operation from initiation to materialization.

17.3 Key Module Locations

Implementation files are in aura-journal/src/commitment_tree/.

  • state.rs: TreeState structure and TreeStateView implementation
  • reduction.rs: Deterministic reduction algorithm
  • operations.rs: TreeOperationProcessor and TreeStateQuery
  • application.rs: Operation application to TreeState
  • compaction.rs: Garbage collection of superseded operations
  • attested_ops.rs: AttestedOp fact handling

Verification code is in aura-core/src/tree/verification.rs. Type definitions are in aura-core/src/tree/types.rs and aura-core/src/tree/policy.rs.

See Also

Effect System

Overview

Aura uses algebraic effects to abstract system capabilities. Effect traits define abstract interfaces for cryptography, storage, networking, time, and randomness. Handlers implement these traits with concrete behavior. Context propagation ensures consistent execution across async boundaries.

This document covers effect trait design, handler patterns, and the context model. See Runtime for lifecycle management, service composition, and guard chain execution. See Ownership Model for the repo-wide Pure/MoveOwned/ActorOwned/Observed taxonomy.

The aura-agent runtime uses structured concurrency with explicit session ownership. Session-bound effects execute only under the current owner via canonical ingress. For complete details on async ownership, session ownership, typed runtime errors, and instrumentation policy, see crates/aura-agent/ARCHITECTURE.md.

The runtime contract is intentionally split:

  • actor services supervise long-lived runtime structure
  • move semantics govern session and endpoint ownership

Effect execution that touches session state belongs to the second category, not the first.

Ownership At Effect Boundaries

Effect traits sit at an ownership boundary and should preserve the repo-wide ownership model rather than hide it.

  • Effect trait definitions in aura-core are primarily Pure.
  • Long-lived mutable async ownership belongs to ActorOwned runtime services, not to effect trait definitions or ad hoc handler-local state.
  • Exclusive authority transfer belongs to MoveOwned handles, owner tokens, or transfer records rather than shared mutable rewrites.
  • Effect handlers may implement capabilities, but parity-critical mutation and publication should remain capability-gated in the exposed API shape.

Practical implications:

  • define trait methods so callers can preserve typed ownership and typed failure
  • do not use effect handlers as a loophole for bypassing authority checks
  • do not let observation-facing layers gain semantic mutation power through convenience helpers
  • ensure long-running effect-driven flows report typed terminal outcomes rather than implicit success or silent hangs
  • prefer the canonical aura-core ownership vocabulary at effect-facing boundaries:
    • OperationContext for move-owned workflow ownership
    • exact progress/terminal publication wrappers plus consumed TerminalPublisher for lifecycle
    • OwnedTaskSpawner, OwnedShutdownToken, and BoundedActorIngress for actor-owned task and ingress boundaries

Effect Traits

Aura defines effect traits as abstract interfaces for system capabilities. Core traits expose essential functionality. Extended traits expose optional operations and coordinated behaviors. Each trait is independent and does not assume global state.

Core traits include CryptoCoreEffects, NetworkCoreEffects, StorageCoreEffects, time domain traits, RandomCoreEffects, and JournalEffects. Extended traits include CryptoExtendedEffects, NetworkExtendedEffects, StorageExtendedEffects, RandomExtendedEffects, and system-level traits such as SystemEffects and ChoreographicEffects.

#![allow(unused)]
fn main() {
#[async_trait]
pub trait CryptoCoreEffects: RandomCoreEffects + Send + Sync {
    async fn ed25519_sign(
        &self,
        message: &[u8],
        private_key: &[u8],
    ) -> Result<Vec<u8>, CryptoError>;

    async fn ed25519_verify(
        &self,
        message: &[u8],
        signature: &[u8],
        public_key: &[u8],
    ) -> Result<bool, CryptoError>;

    async fn kdf_derive(
        &self,
        ikm: &[u8],
        salt: &[u8],
        info: &[u8],
        output_len: u32,
    ) -> Result<Vec<u8>, CryptoError>;
}
}

This example shows a core effect trait for cryptographic operations. Traits contain async methods for compatibility with async runtimes. Extension traits add optional capabilities without forcing all handlers to implement them. The hash() function is intentionally pure in aura-core::hash rather than an effect because it is deterministic and side-effect-free.

Time Traits

The legacy monolithic TimeEffects trait is replaced by domain-specific traits. PhysicalTimeEffects returns wall-clock time with uncertainty and provides sleep operations. LogicalClockEffects advances and reads causal vector clocks and Lamport scalars. OrderClockEffects produces opaque total order tokens without temporal meaning.

Callers select the domain appropriate to their semantics. Guards and transport use physical time. CRDT operations use logical clocks. Privacy-preserving ordering uses order tokens.

Cross-domain comparisons are explicit via TimeStamp::compare(policy). Total ordering across domains must use OrderTime or consensus sequencing. Direct SystemTime::now() or chrono usage is forbidden outside effect implementations.

Time Domain System

The unified TimeStamp type provides five domains for different use cases.

Domain           Effect Trait                   Primary Use
PhysicalClock    PhysicalTimeEffects            Wall time, cooldowns, receipts, liveness
LogicalClock     LogicalClockEffects            Causal ordering, CRDT merge, happens-before
OrderClock       OrderClockEffects              Deterministic ordering without timing leakage
Range            PhysicalTimeEffects + policy   Validity windows with bounded skew
ProvenancedTime  TimeComparison                 Attested timestamps for consensus

Application code accesses time exclusively through effect traits. Direct SystemTime::now() calls are forbidden outside effect implementations. Cross-domain comparisons require explicit TimeStamp::compare(policy). OrderClock must not leak timing information.
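The explicit-comparison rule can be modeled as a comparison that is defined only within one domain and returns nothing across domains. This is a simplified standalone sketch; the real TimeStamp::compare(policy) API in aura-core has richer domain and policy shapes.

```rust
// Simplified model of explicit cross-domain comparison. Domain and policy
// shapes here are illustrative stand-ins for the aura-core types.
use std::cmp::Ordering;

#[derive(Debug, Clone, Copy, PartialEq)]
enum TimeStamp {
    Physical(u64), // wall-clock milliseconds
    Logical(u64),  // Lamport scalar
    Order(u64),    // opaque total-order token
}

enum ComparePolicy {
    SameDomainOnly,
}

fn compare(a: TimeStamp, b: TimeStamp, policy: ComparePolicy) -> Option<Ordering> {
    use TimeStamp::*;
    match (policy, a, b) {
        // Comparison is defined only within a single domain. Cross-domain
        // comparison yields None and must route through OrderTime or
        // consensus sequencing instead.
        (ComparePolicy::SameDomainOnly, Physical(x), Physical(y)) => Some(x.cmp(&y)),
        (ComparePolicy::SameDomainOnly, Logical(x), Logical(y)) => Some(x.cmp(&y)),
        (ComparePolicy::SameDomainOnly, Order(x), Order(y)) => Some(x.cmp(&y)),
        _ => None,
    }
}
```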

Timeout And Backoff Contract

Timeout and backoff behavior is part of the ownership model. Each parity-critical async path must identify a deadline owner, retry policy owner, and terminal consequence. See Effects and Handlers Guide for timeout patterns and Ownership Model for timeout ownership rules.

Threshold Signing

Aura provides a unified ThresholdSigningEffects trait in aura-core/src/effects/threshold.rs for all threshold signing scenarios. The trait supports multi-device personal signing, guardian recovery approvals, and group operation approvals.

The trait uses a unified SigningContext that pairs a SignableOperation with an ApprovalContext. This design allows the same FROST signing machinery to handle all scenarios with proper audit context. The ThresholdSigningService in aura-agent provides the production implementation.

Key components include ThresholdSigningEffects for async signing operations, lifecycle traits for provisional and consensus modes, and AppCore.sign_tree_op() for high-level signing. See Cryptography for detailed threshold signature architecture.
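The pairing of operation and approval context can be sketched as follows. The field and variant names are assumptions for illustration, not the actual definitions in aura-core/src/effects/threshold.rs.

```rust
// Illustrative sketch of the unified signing context. Names are
// hypothetical; the real trait lives in aura-core/src/effects/threshold.rs.
#[derive(Debug, Clone, PartialEq)]
enum ApprovalContext {
    PersonalDevice,   // multi-device personal signing
    GuardianRecovery, // guardian recovery approval
    GroupOperation,   // shared-context group approval
}

#[derive(Debug, Clone, PartialEq)]
struct SignableOperation {
    payload: Vec<u8>,
}

// One context shape lets the same FROST machinery serve every scenario
// while recording why the signature was produced.
#[derive(Debug, Clone, PartialEq)]
struct SigningContext {
    operation: SignableOperation,
    approval: ApprovalContext,
}
```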

When to Create Effect Traits

New effect traits are warranted when multiple crates need the same async capability and the operation involves OS integration or external state. Convenience wrappers belong in composite extension traits rather than new base traits. See Effects and Handlers Guide for trait design guidance.

Database Effects

Database operations use existing StorageEffects, JournalEffects, and AuthorizationEffects rather than a dedicated DatabaseEffects trait. See Database Architecture for query patterns.

Handler Design

Handlers are stateless per-request processors. They do not store global state. The handler taxonomy includes stateless infrastructure handlers, application handlers, typed composite handlers, and type-erased composite handlers. See Effects and Handlers Guide for implementation patterns.

Unified Encrypted Storage

Aura uses StorageEffects as the single persistence interface in application code. The production runtime wires StorageEffects through a unified encryption-at-rest wrapper. FilesystemStorageHandler provides raw bytes persistence. RealSecureStorageHandler uses Keychain or TPM for master-key persistence.

EncryptedStorage implements StorageEffects by encrypting and decrypting transparently. It generates or loads the master key on first use. Runtime assembly remains synchronous.

#![allow(unused)]
fn main() {
use aura_effects::{
    EncryptedStorage, EncryptedStorageConfig, FilesystemStorageHandler,
    RealCryptoHandler, RealSecureStorageHandler,
};
use std::sync::Arc;

let secure = Arc::new(RealSecureStorageHandler::with_base_path(base_path.clone()));
let storage = EncryptedStorage::new(
    FilesystemStorageHandler::from_path(base_path.clone()),
    Arc::new(RealCryptoHandler::new()),
    secure,
    EncryptedStorageConfig::default(),
);
}

This example shows the encryption wrapper assembly. RealCryptoHandler lives in aura-effects and implements CryptoCoreEffects. Storage configuration controls encryption enablement and opaque naming. Application code uses StorageEffects without knowledge of encryption details.

Context Model

The effect system propagates an EffectContext through async tasks. The context carries authority identity, context scope, session identification, execution mode, and metadata. No ambient state exists.

#![allow(unused)]
fn main() {
pub struct EffectContext {
    authority_id: AuthorityId,
    context_id: ContextId,
    session_id: SessionId,
    execution_mode: ExecutionMode,
    metadata: HashMap<String, String>,
}
}

This structure defines the operation-scoped effect context. The context flows through all effect calls and identifies which authority, context, and session the operation belongs to. The execution_mode controls handler selection for production versus test environments.

Context propagation uses scoped execution. A task local stores the current context. Nested tasks inherit the context. This ensures consistent behavior across async boundaries.
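The scoped inheritance described above can be sketched with a thread-local stack: entering a scope pushes a context, leaving it restores the previous one, and nested code reads the innermost value. The production runtime uses async task-locals; this std-only version shows only the push/pop discipline.

```rust
// Minimal sketch of scoped context propagation via a thread-local stack.
// EffectContext is reduced to two fields for illustration.
use std::cell::RefCell;

#[derive(Clone, Debug, PartialEq)]
struct EffectContext {
    authority_id: String,
    session_id: String,
}

thread_local! {
    static CURRENT: RefCell<Vec<EffectContext>> = RefCell::new(Vec::new());
}

// Run `f` with `ctx` as the current context; the previous context is
// restored when the scope exits.
fn with_context<R>(ctx: EffectContext, f: impl FnOnce() -> R) -> R {
    CURRENT.with(|c| c.borrow_mut().push(ctx));
    let out = f();
    CURRENT.with(|c| {
        c.borrow_mut().pop();
    });
    out
}

fn current_context() -> Option<EffectContext> {
    CURRENT.with(|c| c.borrow().last().cloned())
}
```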

ReactiveEffects Trait

The ReactiveEffects trait provides type-safe signal-based state management. Signals are phantom-typed identifiers that reference reactive state. The phantom type ensures compile-time type safety.

#![allow(unused)]
fn main() {
pub struct Signal<T> {
    id: SignalId,
    _phantom: PhantomData<T>,
}

#[async_trait]
pub trait ReactiveEffects: Send + Sync {
    async fn read<T>(&self, signal: &Signal<T>) -> Result<T, ReactiveError>
    where T: Clone + Send + Sync + 'static;

    async fn emit<T>(&self, signal: &Signal<T>, value: T) -> Result<(), ReactiveError>
    where T: Clone + Send + Sync + 'static;

    fn subscribe<T>(&self, signal: &Signal<T>) -> Result<SignalStream<T>, ReactiveError>
    where T: Clone + Send + Sync + 'static;

    async fn register<T>(&self, signal: &Signal<T>, initial: T) -> Result<(), ReactiveError>
    where T: Clone + Send + Sync + 'static;
}
}

The trait defines four core operations for reactive state. The read method returns the current value. The emit method updates the value. The subscribe method returns a stream of changes. The register method initializes a signal with a default value. See Runtime for reactive scheduling implementation. Subscribing to an unregistered signal fails fast. Aura no longer permits "dead stream" subscription success for missing registrations.

QueryEffects Trait

The QueryEffects trait provides typed Datalog queries with capability-based authorization. Queries implement the Query trait which defines typed access to journal facts.

#![allow(unused)]
fn main() {
pub trait Query: Send + Sync + Clone + 'static {
    type Result: Clone + Send + Sync + Default + 'static;

    fn to_datalog(&self) -> DatalogProgram;
    fn required_capabilities(&self) -> Vec<QueryCapability>;
    fn dependencies(&self) -> Vec<FactPredicate>;
    fn parse(bindings: DatalogBindings) -> Result<Self::Result, QueryParseError>;
    fn query_id(&self) -> String;
}

#[async_trait]
pub trait QueryEffects: Send + Sync {
    async fn query<Q: Query>(&self, query: &Q) -> Result<Q::Result, QueryError>;
    async fn query_raw(&self, program: &DatalogProgram) -> Result<DatalogBindings, QueryError>;
    fn subscribe<Q: Query>(&self, query: &Q) -> QuerySubscription<Q::Result>;
    async fn check_capabilities(&self, caps: &[QueryCapability]) -> Result<(), QueryError>;
    async fn invalidate(&self, predicate: &FactPredicate);
}
}

The Query trait converts queries to Datalog programs and defines capability requirements. The QueryEffects trait executes queries and manages subscriptions. Query isolation levels control consistency requirements. See Database Architecture for complete query system documentation.

Determinism Rules

Effect boundaries determine native and WASM conformance parity. Protocol code must follow these rules to ensure deterministic execution.

The pure transition core requires identical outputs given the same input stream. No hidden state may affect observable behavior. All state must flow through explicit effect calls. Non-determinism is permitted only through explicit algebraic effects. Time comes from time traits. Randomness comes from RandomEffects. Storage comes from StorageEffects.

Conformance lanes compare logical steps rather than wall-clock timing. Time-dependent behavior uses simulated time through effect handlers. Conformance artifacts use canonical encoding with deterministic field ordering. See Testing Guide for conformance testing patterns.

Session-Local VM Bridge Effects

Production choreography execution uses a narrow synchronous bridge trait at the Aura and Telltale boundary. VmBridgeEffects in aura-core exposes only immediate session-local queue and snapshot operations. It does not expose async transport, storage, or journal methods.

This split exists because Telltale host callbacks are synchronous. The callback path may enqueue outbound payloads, record blocked receive edges, consume branch choices, and snapshot scheduler signals. It must not perform network I/O or journal work directly.

Async host work resumes outside the VM step boundary in Layer 6 runtime services. vm_host_bridge observes VmBridgeEffects state, performs transport and guard-chain work, and injects completed results back into the VM. This preserves deterministic VM progression while keeping Aura's runtime async.

aura-agent runtime code preserves this boundary through canonical ingress and explicit session ownership. Network callbacks, timers, and background tasks route typed session-ingress messages to the current local owner. Each active session has exactly one owner at any time.

That owner may be hosted by an actor, but the effect-routing rule is still ownership-based: session-bound effects execute because the caller is the current owner, not merely because it runs inside a service actor.

The runtime must also distinguish owner identity from owner capability:

  • owner identity identifies the current fragment/session owner
  • owner capability authorizes specific session-bound effects within fragment scope

Both checks matter for effect routing, especially across delegation boundaries.

See System Internals Guide for VM bridge implementation patterns and crates/aura-agent/ARCHITECTURE.md for the complete ownership model.

Layer Placement

The effect system spans several crates with strict dependency boundaries. aura-core defines effect traits, identifiers, and core data structures. It contains no implementations.

aura-effects contains stateless and single-party effect handlers. It provides default implementations for cryptography, storage, networking, and randomness. aura-protocol contains orchestrated and multi-party behavior. It bridges session types to effect calls.

aura-agent assembles handlers into runnable systems. It configures effect pipelines for production environments. aura-simulator provides deterministic execution with simulated time, networking, and controlled failure injection.

Performance

Aura includes several performance optimizations. Parallel initialization reduces startup time. Caching handlers reduce repeated computation. Buffer pools reduce memory allocation. The effect system avoids OS threads for WASM compatibility.

#![allow(unused)]
fn main() {
let builder = EffectSystemBuilder::new()
    .with_handler(Arc::new(RealCryptoHandler))
    .with_parallel_init();
}

This snippet shows parallel initialization of handlers. The builder pattern allows flexible handler composition. Lazy initialization creates handlers on first use. Async tasks and cooperative scheduling provide efficient execution.

Testing Support

The effect system supports deterministic testing through mock handlers. A simulated runtime provides control over time and network behavior. The simulator exposes primitives to inject delays or failures.

#![allow(unused)]
fn main() {
let system = TestRuntime::new()
    .with_mock_crypto()
    .with_deterministic_time()
    .build();
}

This snippet creates a test runtime with mock handlers for all effects. It provides deterministic time and network control. Tests use in-memory storage and mock networking to execute protocols without side effects. See Test Infrastructure Reference for test patterns.

Runtime

Overview

The Aura runtime assembles effect handlers into working systems. It manages lifecycle, executes the guard chain, schedules reactive updates, and exposes services through AuraAgent. The AppCore provides a unified interface for all frontends.

This document covers runtime composition and execution. See Effect System for trait definitions and handler design. See Ownership Model for the repo-wide ownership taxonomy. The aura-agent crate-level runtime contract, including structured concurrency, canonical ingress, ownership, typed errors, and CI policy gates, lives in crates/aura-agent/ARCHITECTURE.md.

That contract is intentionally opinionated about the split of responsibilities:

  • actor services own long-lived runtime supervision, lifecycle, and maintenance
  • move semantics own session and endpoint ownership transfer

Those are related concerns, but they are not the same abstraction boundary.

For shared semantic operations, the split is stricter still:

  • aura-app::workflows owns authoritative semantic lifecycle publication
  • aura-agent owns long-lived runtime actors and readiness/state coordination
  • frontend crates and the harness submit through sanctioned handoff boundaries and observe authoritative publication afterward

No runtime, frontend, or harness path should keep a parallel terminal publication helper once the shared workflow owner has taken over.

The same visibility rule applies to runtime-owned mutation helpers. Raw VM admission helpers, fragment ownership registry mutation, and the mutable reconfiguration controller stay inside aura-agent runtime internals. Shared consumers go through sanctioned ingress, session-owner, or manager surfaces.

Adaptive Privacy Runtime Ownership

Adaptive privacy policy is runtime-owned local state, not shared truth.

  • the Neighborhood Plane and Web of Trust Plane provide permit and candidate inputs
  • rendezvous descriptor views provide service-surface advertisement inputs
  • SelectionManagerService fuses those inputs with local health and budget signals into runtime-local LocalSelectionProfile and SelectionState
  • LocalHealthObserverService owns smoothing, hysteresis, and local health snapshots
  • MoveManager owns bounded movement queues, replay windows, flush scheduling, congestion state, and the current routing profile used for MoveEnvelope delivery
  • HoldManager owns held-object custody, selector rotation, bounded holder residency, local Hold GC, verified witness handling, and neighborhood-scoped provider scoring
  • AnonymousPathManager owns reusable anonymous established-path lifecycle and protected encrypted establish-session state
  • CoverTrafficGeneratorService owns cover-floor planning and reserved cover budget

This split is strict:

  • final route, holder, and path reuse decisions are runtime-local
  • those decisions must not be published as authoritative facts or descriptor fields
  • retrieval, cover, accountability replies, and ordinary movement share one MoveEnvelope family where applicable rather than regaining separate transport families

The production adaptive policy is fixed by deployment rather than user configuration. The current fixed policy uses path-diversity floor 2, cover floor 2 packets per second, delay gain denominator 3, neighborhood hold retention window 120s, and retrieval-capability rotation beginning 10s before expiry. Development and simulation may tune those values, but production nodes do not expose per-user privacy knobs.

LocalRoutingProfile::passthrough() remains the pre-privacy baseline. It uses mixing depth 0, delay 0, cover rate 0, and path diversity 1. Hold remains active under passthrough because custody and selector retrieval are availability services rather than privacy-profile parameters.
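The passthrough baseline can be sketched as a profile constructor. The parameter values (mixing depth 0, delay 0, cover rate 0, path diversity 1) come from the text above; the field names are illustrative assumptions.

```rust
// Sketch of the pre-privacy passthrough baseline. Field names are
// hypothetical; the values match the documented baseline.
#[derive(Debug, Clone, PartialEq)]
struct LocalRoutingProfile {
    mixing_depth: u32,
    delay_ms: u64,
    cover_rate_pps: u32,
    path_diversity: u32,
}

impl LocalRoutingProfile {
    fn passthrough() -> Self {
        Self {
            mixing_depth: 0,   // no mixing
            delay_ms: 0,       // no added delay
            cover_rate_pps: 0, // no cover traffic
            path_diversity: 1, // single path
        }
    }
}
```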

Transparent Onion Quarantine

transparent_onion is a debug, test, and simulation tool only.

  • it may expose transparent anonymous setup and envelope headers for inspection
  • the production runtime path is encrypted; transparent objects exist only on the explicit debug/simulation feature surface
  • it must remain excluded from parity-critical harness and shared-flow lanes
  • production policy ownership does not change when the feature is enabled
  • the feature must not become an implicit dependency of browser, TUI, or conformance behavior

Ownership Categories In The Runtime

The runtime is the main place where Aura's ownership categories become concrete:

  • long-lived runtime services, supervisors, readiness coordinators, and caches are ActorOwned
  • session, endpoint, and delegation transfer surfaces are MoveOwned
  • runtime views, projections, and exported state are Observed
  • reducers, validators, and typed contracts remain Pure

Three runtime rules follow from that split:

  1. Actor mailboxes are for mutation of actor-owned state, not as a substitute for move-style ownership transfer.
  2. Runtime-facing lifecycle and readiness publication should be capability-gated and should terminate explicitly with typed success, failure, or cancellation.
  3. Long-lived mutable async domains should be declared through #[aura_macros::actor_owned(...)], and small parity-critical runtime lifecycle enums should prefer #[aura_macros::ownership_lifecycle(...)] over hand-written transition helpers.

Instrumentation Contract

All long-lived runtime services emit structured events from the following families: runtime startup/shutdown, service lifecycle transition, task spawn/completion/failure/abort, session claim/release/failure, ingress accepted/rejected/dropped, delegation start/commit/rollback/reject, link boundary route/reject, concurrency profile select/fallback, and invariant violation.

Required fields include service, task, session_id, fragment_key, owner, from_owner, to_owner, profile, error_kind, and correlation_id where applicable. These families ensure that runtime behavior is reconstructible from structured logs. Envelope admission, delegation witnesses, and fallback decisions must all be visible in instrumentation output.

Structured Concurrency

aura-agent uses structured concurrency as the only production async model.

Rules:

  • Every long-lived async subsystem has one named owner.
  • Every owner has one rooted task group.
  • Child tasks belong to exactly one task group.
  • Detached fire-and-forget tasks are forbidden in production runtime code.
  • Shutdown is hierarchical and parent-driven.
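The ownership rules above map directly onto scoped spawning: every child belongs to exactly one rooted scope, and the scope cannot exit while children still run. This std-only sketch shows the discipline; the production runtime uses async task groups rather than OS threads.

```rust
// std-only sketch of structured concurrency: children are owned by one
// rooted scope, and scope exit joins them all (no detached tasks).
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

fn run_owner_group(work_items: u32) -> u32 {
    let completed = AtomicU32::new(0);
    thread::scope(|group| {
        for _ in 0..work_items {
            // Each child belongs to exactly this scope.
            group.spawn(|| {
                completed.fetch_add(1, Ordering::SeqCst);
            });
        }
    }); // all children are guaranteed complete here
    completed.load(Ordering::SeqCst)
}
```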

Runtime shutdown ordering remains an orchestration-level invariant, not a compile-time type property. Aura keeps a targeted integration check for the final shutdown sequence in aura-agent::runtime::system:

  1. stop the reactive pipeline
  2. cancel the runtime task tree
  3. tear down services
  4. shut down the lifecycle manager

That check is governance for the final runtime owner graph, not a replacement for the compile-time ownership model.

See System Internals Guide for implementation patterns and preferred primitives.

Lifecycle Management

aura-agent uses an explicit service lifecycle contract with authoritative service states, structured task ownership, and deterministic teardown. The crate-level runtime contract in crates/aura-agent/ARCHITECTURE.md is the source of truth.

All long-lived services implement a shared lifecycle state machine:

  • New: Initial state before startup.
  • Starting: Initialization in progress.
  • Running: Actor alive and command path available.
  • Stopping: Graceful shutdown in progress.
  • Stopped: No live owned tasks and no live command handling.
  • Failed: Observable failure state.

Long-lived runtimes periodically prune caches and stale in-memory state through owned service actors and supervised task groups. Domain crates expose cleanup APIs but do not self-schedule. The agent runtime owns the scheduling model.

See System Internals Guide for service lifecycle implementation.

Runtime Timeout Policy

Runtime timeout behavior must preserve Aura's time-system contract:

  • physical time drives local waiting, retry, and backoff policy
  • logical, order, and provenanced time remain semantic ordering tools
  • runtime owners publish typed timeout failure when local waiting is exhausted
  • harness and simulation may scale timeout policy, but they should not invent a different semantic model

In practice this means:

  • long-lived owners should consume a remaining timeout budget across nested stages instead of resetting fresh wall-clock literals at each call site
  • retry loops should use shared backoff policy rather than duplicated sleeps
  • timeout policy belongs to owner/coordinator code, not UI observation layers
  • reducing timeout duration in tests or harness mode is acceptable. Changing what timeout means is not.
  • runtime-facing workflow/task boundaries should carry OperationTimeoutBudget, OwnedShutdownToken, and OwnedTaskSpawner rather than raw Duration, raw cancellation traits, or ad hoc spawn helpers
  • parity-critical runtime waits should consume strong typed authoritative references once context is known. They must not re-derive ownership or context from weaker ids inside later readiness/wait helpers.
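The remaining-budget pattern in the first bullet can be sketched as a deadline that nested stages query, rather than fresh wall-clock literals at each call site. The name OperationTimeoutBudget comes from the text; this shape is an illustrative assumption.

```rust
// Sketch of a timeout budget consumed across nested stages. The type name
// comes from the document; the API shape is hypothetical.
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy)]
struct OperationTimeoutBudget {
    deadline: Instant,
}

impl OperationTimeoutBudget {
    fn new(total: Duration) -> Self {
        Self { deadline: Instant::now() + total }
    }

    // Each nested stage asks for what is left instead of starting over
    // with its own wall-clock literal.
    fn remaining(&self) -> Duration {
        self.deadline.saturating_duration_since(Instant::now())
    }

    fn exhausted(&self) -> bool {
        self.remaining() == Duration::ZERO
    }
}
```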

Runtime Authority Discipline

Runtime-owned coordinators follow the same authority rule as app workflows:

  • resolve authoritative typed input once at the boundary
  • carry that typed input through later parity-critical readiness, retry, and terminal steps
  • do not re-resolve context from raw ids after authoritative handoff
  • do not keep fallback/default repair helpers on parity-critical paths

If a later step needs context, the API should require the strong typed reference rather than accepting a raw identifier and looking it up again.

Guard Chain Execution

The runtime enforces guard chain sequencing defined in Authorization. Each projected choreography message expands to three phases. First, snapshot preparation gathers capability frontier, budget headroom, and metadata. Second, pure guard evaluation runs synchronously over the snapshot. Third, command interpretation executes the resulting effect commands.

flowchart LR
    A[Snapshot prep] -->|async| B[Guard eval]
    B -->|sync| C[Interpreter]
    C -->|async| D[Transport]

This diagram shows the guard execution flow. Snapshot preparation is async. Guard evaluation is pure and synchronous. Command interpretation is async and performs actual I/O.

GuardSnapshot

The runtime prepares a GuardSnapshot immediately before entering the guard chain. It contains every stable datum a guard may inspect while remaining read-only.

#![allow(unused)]
fn main() {
pub struct GuardSnapshot {
    pub now: TimeStamp,
    pub caps: Cap,
    pub budgets: FlowBudgetView,
    pub metadata: MetadataView,
    pub rng_seed: [u8; 32],
}
}

Guards evaluate synchronously against this snapshot and the incoming request. They cannot mutate state or perform I/O. This keeps guard evaluation deterministic, replayable, and WASM-compatible.
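A pure guard can be sketched as a function from snapshot and request to commands: it mutates nothing, performs no I/O, and denial produces no side effects at all. The types below are simplified stand-ins for the GuardSnapshot and EffectCommand shapes shown in this document.

```rust
// Sketch of a pure, synchronous guard. Types are simplified stand-ins.
#[derive(Debug, Clone)]
struct GuardSnapshot {
    budget_remaining: u64,
}

struct SendRequest {
    cost: u64,
    payload: Vec<u8>,
}

#[derive(Debug, PartialEq)]
enum EffectCommand {
    ChargeBudget { amount: u64 },
    SendEnvelope { envelope: Vec<u8> },
}

#[derive(Debug, PartialEq)]
enum GuardOutcome {
    Allow(Vec<EffectCommand>),
    Deny,
}

// Pure evaluation: reads the snapshot, returns commands for the
// interpreter, and never touches storage, network, or clocks itself.
fn evaluate_send_guard(snapshot: &GuardSnapshot, req: &SendRequest) -> GuardOutcome {
    if snapshot.budget_remaining < req.cost {
        return GuardOutcome::Deny; // denial has no side effects
    }
    GuardOutcome::Allow(vec![
        EffectCommand::ChargeBudget { amount: req.cost },
        EffectCommand::SendEnvelope { envelope: req.payload.clone() },
    ])
}
```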

EffectCommands

Guards do not execute side effects directly. Instead, they return EffectCommand items for the interpreter to run. Each command is a minimal description of work.

#![allow(unused)]
fn main() {
pub enum EffectCommand {
    ChargeBudget {
        context: ContextId,
        authority: AuthorityId,
        peer: AuthorityId,
        amount: FlowCost,
    },
    AppendJournal { entry: JournalEntry },
    RecordLeakage { bits: u32 },
    StoreMetadata { key: String, value: String },
    SendEnvelope {
        to: NetworkAddress,
        peer_id: Option<uuid::Uuid>,
        envelope: Vec<u8>
    },
    GenerateNonce { bytes: usize },
}
}

Commands describe what happened rather than how. Interpreters can batch, cache, or reorder commands as long as the semantics remain intact. This vocabulary keeps the guard interface simple.

EffectInterpreter

The EffectInterpreter trait encapsulates async execution of commands. Production runtimes hook it to aura-effects handlers. The simulator hooks deterministic interpreters that record events instead of hitting the network.

#[async_trait]
pub trait EffectInterpreter: Send + Sync {
    async fn execute(&self, cmd: EffectCommand) -> Result<EffectResult>;
    fn interpreter_type(&self) -> &'static str;
}

ProductionEffectInterpreter performs real I/O for storage, transport, and journal. SimulationEffectInterpreter records deterministic events and consumes simulated time. This design lets the guard chain enforce authorization, flow budgets, and journal coupling without leaking implementation details.
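A simulation-side interpreter can be sketched with a simplified synchronous trait (the real one is async): it records every command and returns fixed, replayable bytes instead of real randomness. All names below are illustrative assumptions.

```rust
// Simplified synchronous stand-in for the EffectInterpreter split: a recording
// simulator in place of real I/O. The real trait is async.

use std::cell::RefCell;

#[derive(Debug, Clone, PartialEq)]
enum Command {
    GenerateNonce { bytes: usize },
}

trait Interpreter {
    fn execute(&self, cmd: Command) -> Result<Vec<u8>, String>;
    fn interpreter_type(&self) -> &'static str;
}

// Records every command and returns deterministic bytes for replay.
struct SimulationInterpreter {
    log: RefCell<Vec<Command>>,
}

impl Interpreter for SimulationInterpreter {
    fn execute(&self, cmd: Command) -> Result<Vec<u8>, String> {
        self.log.borrow_mut().push(cmd.clone());
        match cmd {
            // Fixed output instead of real randomness: replayable by design.
            Command::GenerateNonce { bytes } => Ok(vec![0u8; bytes]),
        }
    }
    fn interpreter_type(&self) -> &'static str {
        "simulation"
    }
}

fn main() {
    let sim = SimulationInterpreter { log: RefCell::new(Vec::new()) };
    let nonce = sim.execute(Command::GenerateNonce { bytes: 4 }).unwrap();
    assert_eq!(nonce, vec![0, 0, 0, 0]);
    assert_eq!(sim.log.borrow().len(), 1);
    assert_eq!(sim.interpreter_type(), "simulation");
    println!("simulation interpreter ok");
}
```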

Reactive Scheduling

The ReactiveScheduler in aura-agent processes journal facts and drives UI signal updates. It receives facts from multiple sources including journal commits, network receipts, and timers. It batches them in a 5ms window and drives all signal updates.

Intent → Fact Commit → FactPredicate → Query Invalidation → Signal Emit → UI Update

This flow shows how facts propagate to UI. Services emit facts rather than directly mutating view state. The scheduler processes fact batches and updates registered signal views. This eliminates dual-write bugs where different signal sources could desync.

Signal Views

Domain signals are driven by signal views in the reactive scheduler. ChatSignalView, ContactsSignalView, and InvitationsSignalView process facts and emit full state snapshots to their respective signals.

// Define application signals
pub static CHAT_SIGNAL: LazyLock<Signal<ChatState>> =
    LazyLock::new(|| Signal::new("app:chat"));

// Bind signal to query at initialization
pub async fn register_app_signals_with_queries<R: ReactiveEffects>(
    handler: &R,
) -> Result<(), ReactiveError> {
    handler.register_query(&*CHAT_SIGNAL, ChatQuery::default()).await?;
    Ok(())
}

This example shows signal definition and query binding. Signals are defined as static lazy values. They are bound to queries during initialization. When facts change, queries invalidate and signals update automatically.

Fact Processing

The scheduler integrates with the effect system through fact sinks. Facts flow from journal commits through the scheduler to signal views.

// In RuntimeSystem (aura-agent)
effect_system.attach_fact_sink(pipeline.fact_sender());

// The scheduler processes fact batches and updates signal views.

Terminal screens subscribe and automatically receive updates. This enables push-based UI updates without polling.
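Push-based delivery can be modeled with a channel: subscribers receive state snapshots when the scheduler emits, with no polling loop. The `Signal`/`ChatState` shapes here are illustrative, not the aura-app API.

```rust
// Push-based signal delivery modeled with a channel. Illustrative only:
// the real Signal type and subscription API live in aura-app/aura-agent.

use std::sync::mpsc;

#[derive(Debug, Clone, PartialEq)]
struct ChatState {
    messages: Vec<String>,
}

struct Signal {
    subscribers: Vec<mpsc::Sender<ChatState>>,
}

impl Signal {
    // A screen registers once and then only receives pushes.
    fn subscribe(&mut self) -> mpsc::Receiver<ChatState> {
        let (tx, rx) = mpsc::channel();
        self.subscribers.push(tx);
        rx
    }

    // Called by the scheduler after a fact batch is reduced.
    fn emit(&self, state: &ChatState) {
        for sub in &self.subscribers {
            let _ = sub.send(state.clone());
        }
    }
}

fn main() {
    let mut signal = Signal { subscribers: Vec::new() };
    let screen = signal.subscribe();
    signal.emit(&ChatState { messages: vec!["hi".into()] });
    let update = screen.recv().unwrap();
    assert_eq!(update.messages, vec!["hi".to_string()]);
    println!("push update ok");
}
```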

UnifiedHandler

The UnifiedHandler composes Query and Reactive effects into a single cohesive handler. It holds a QueryHandler, a shared ReactiveHandler, and an optional capability context.

The commit_fact method adds a fact and invalidates affected queries. The query method checks capabilities and executes the query. BoundSignal<Q> pairs a signal with its source query for registration and invalidation tracking.

Service Pattern

The runtime uses a three-tier service architecture. Domain crates define stateless handlers that produce pure GuardOutcome plans without performing I/O. The agent layer wraps these handlers with services that gather snapshots, run guard evaluation, and interpret effect commands. The agent exposes services through typed accessor methods on AuraAgent, with a ServiceRegistry holding Arc references initialized at startup.

See System Internals Guide for the handler/service/API implementation pattern.

Session Management

The runtime manages the lifecycle of distributed protocols. Choreographies define protocol logic. Sessions represent single stateful executions of choreographies. The runtime uses structured concurrency with explicit session ownership.

Session Ownership

Each active session or fragment has exactly one current local owner. The owner is either a per-session actor or an authoritative choreography runtime loop. This invariant is enforced through the canonical ingress pattern.

This is the move-semantics side of the runtime model:

  • one current owner
  • explicit transfer
  • stale-owner rejection
  • owner-routed session effects

Owner identity and capability are separate:

  • ownership says who currently controls the fragment
  • capability says what fragment-scoped work that owner may perform

Network, timer, and external events are queued before touching session state. Session ownership and task ownership move together. Session-bound effects execute only under the current owner.

The owner may be implemented by an actor, but the transfer of ownership is still an explicit move boundary rather than shared actor state.

See Choreography Development Guide for session ownership implementation.

Ownership Transitions

Owner-visible state transitions:

  • Unowned -> Claimed
  • Claimed -> Running
  • Running -> DelegatingOut
  • DelegatingOut -> Released
  • Running -> Stopping
  • Stopping -> Stopped
  • Any -> Failed

No transition may create overlapping owners.
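The transitions above can be encoded as an explicit state machine with a guard that rejects anything else. The variant names follow the list; the check itself is an illustrative sketch, not the runtime's implementation.

```rust
// Sketch of the owner-visible session state machine. Variant names follow the
// transition list above; the guard logic is illustrative.

#[derive(Debug, Clone, Copy, PartialEq)]
enum OwnerState {
    Unowned,
    Claimed,
    Running,
    DelegatingOut,
    Released,
    Stopping,
    Stopped,
    Failed,
}

// Returns the next state if the transition is legal; Any -> Failed is allowed.
fn transition(from: OwnerState, to: OwnerState) -> Result<OwnerState, String> {
    use OwnerState::*;
    let ok = matches!(
        (from, to),
        (Unowned, Claimed)
            | (Claimed, Running)
            | (Running, DelegatingOut)
            | (DelegatingOut, Released)
            | (Running, Stopping)
            | (Stopping, Stopped)
    ) || to == Failed;
    if ok {
        Ok(to)
    } else {
        Err(format!("illegal transition {:?} -> {:?}", from, to))
    }
}

fn main() {
    use OwnerState::*;
    assert_eq!(transition(Unowned, Claimed), Ok(Claimed));
    assert_eq!(transition(Running, DelegatingOut), Ok(DelegatingOut));
    // A released fragment cannot silently become owned again: no overlap.
    assert!(transition(Released, Running).is_err());
    assert_eq!(transition(Stopping, Failed), Ok(Failed)); // Any -> Failed
    println!("transitions ok");
}
```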

Session Interface

The SessionManagementEffects trait provides the abstract interface for all session operations. Application logic remains decoupled from the underlying implementation. Sessions can use in-memory or persistent state.

Session State

Concrete implementations act as the engine for the session system. Each session maintains:

  • SessionId for unique identification.
  • SessionStatus indicating the current phase.
  • Epoch for coordinating state changes.
  • Participant list.

Session creation and lifecycle are managed as choreographic protocols. The SessionLifecycleChoreography in aura-protocol ensures consistency across all participants.

Telltale Integration

Aura executes production choreography sessions through the Telltale protocol machine in Layer 6. Production startup is manifest-driven. Generated CompositionManifest metadata defines the protocol id, required capabilities, determinism profile reference, link constraints, and delegation constraints for each choreography. AuraChoreoEngine runs admitted protocol-machine sessions and exposes deterministic trace, replay, and envelope-validation APIs.

Aura is aligned with Telltale 10.0.0's public runtime model rather than a private compatibility surface. Runtime admission, canonical finalization, semantic handoff, and runtime-upgrade execution all use public protocol-machine concepts. Delegation witnesses now carry Telltale-native ownership receipts and post-upgrade reconfiguration snapshots rather than Aura-local transition wrappers. Fail-closed receipt and authority handling is explicit at Aura's runtime boundaries, and timeout expiry is modeled from issued timeout witnesses rather than from late elapsed-time inference. For the upstream capability/finalization/runtime-upgrade contract, read Telltale docs/38_capability_model.md.

Runtime ownership is fragment-scoped. One admitted protocol fragment has one local owner at a time. A choreography without link metadata is one fragment. A choreography with link metadata yields one fragment per linked bundle. Ownership claims, transfer, and release flow through AuraEffectSystem and ReconfigurationManager.

This is why the runtime uses both abstractions at once:

  • actor services for host-side runtime structure
  • explicit move-style ownership for fragment/session transfer

When delegation changes ownership, the runtime must also define whether the moved capability is transferred intact or attenuated to a narrower scope. That decision is part of the protocol/runtime contract, not a host-side convenience choice.

The synchronous callback boundary is VmBridgeEffects. AuraVmEffectHandler and AuraQueuedVmBridgeHandler use it for session-local payload queues, blocked receive snapshots, branch choices, and scheduler signals. Async transport, guard-chain execution, journal coupling, and storage remain outside protocol-machine callbacks in vm_host_bridge and service loops.

Dynamic reconfiguration follows the same rule. Runtime code must go through ReconfigurationManager for link and delegation so bundle evidence, capability admission, and coherence checks are enforced before any transfer occurs.

Dynamic reconfiguration also carries typed upgrade artifacts end to end. When a delegation also performs a runtime upgrade, Aura persists the delegation fact, records the typed upgrade request/execution pair, and rejects missing source ownership or invalid upgrade evidence rather than repairing state implicitly.

Runtime Profiles

Telltale protocol-machine execution is configured through two profile axes:

  • AuraVmHardeningProfile controls safety posture: Dev (assertions + full trace), Ci (strict allow-lists + replay), Prod (safety checks with bounded overhead).
  • AuraVmParityProfile controls deterministic cross-target lanes: NativeCooperative (native baseline) and WasmCooperative (WASM lane), both using cooperative scheduling and strict effect determinism.

Determinism and scheduler policy are protocol-driven. Admission resolves the protocol class, applies the configured determinism tier and replay mode, validates the selected runtime profile, and chooses scheduler controls. Production code should not mutate these settings directly after admission.

Mixed workloads are allowed. Cooperative and threaded fragments may coexist in the same runtime. The contract is per fragment.

See System Internals Guide for VM configuration patterns.

Boundary Review Checklist

Changes to VM/Aura boundaries require review against the checklist in System Internals Guide.

Concurrency Profiles and Envelope Admission

aura-agent recognizes three runtime concurrency profiles for choreography work:

  • Canonical: Exact single-owner reference path. Telltale canonical execution at concurrency n = 1 is the reference behavior.
  • EnvelopeAdmitted: Disjoint or admitted work preserving safety-visible meaning. Higher concurrency is a refinement only when it stays inside the admitted envelope relation.
  • Fallback: Immediate degradation to canonical execution when envelope admission fails.

Correctness never depends on uncontrolled host scheduling. If the runtime cannot show that a path is envelope-safe, it serializes execution.

Envelope Admission Contract

Operational envelope admission is a runtime gate, not a comment-level convention. The runtime must record and enforce:

  • which determinism / concurrency profile was requested
  • which evidence or certificate admitted the profile
  • whether execution stayed canonical or entered an admitted refinement
  • why fallback occurred when admission failed

Safety-visible observations must remain equivalent to the canonical reference. Every admitted step must have a declared witness path. Profile-side obligations must be checked before execution widens.
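One way to picture the recorded admission decision is a small fail-closed record: without admission evidence, a requested refinement degrades to canonical execution with an explicit fallback reason. Field and type names here are assumptions for illustration, not Aura's actual API.

```rust
// Illustrative envelope-admission record. Field names are assumptions.

#[derive(Debug, PartialEq)]
enum Profile {
    Canonical,
    EnvelopeAdmitted,
    Fallback,
}

#[derive(Debug)]
struct AdmissionRecord {
    requested: Profile,
    evidence: Option<String>, // certificate id that admitted the profile
    executed: Profile,
    fallback_reason: Option<String>,
}

// Fail closed: a requested refinement without evidence serializes to canonical.
fn admit(requested: Profile, evidence: Option<String>) -> AdmissionRecord {
    if requested == Profile::EnvelopeAdmitted {
        if evidence.is_some() {
            AdmissionRecord {
                requested,
                evidence,
                executed: Profile::EnvelopeAdmitted,
                fallback_reason: None,
            }
        } else {
            AdmissionRecord {
                requested,
                evidence: None,
                executed: Profile::Canonical,
                fallback_reason: Some("missing envelope certificate".into()),
            }
        }
    } else {
        AdmissionRecord { requested, evidence, executed: Profile::Canonical, fallback_reason: None }
    }
}

fn main() {
    let rec = admit(Profile::EnvelopeAdmitted, None);
    assert_eq!(rec.executed, Profile::Canonical);
    assert!(rec.fallback_reason.is_some());
    let ok = admit(Profile::EnvelopeAdmitted, Some("cert-1".into()));
    assert_eq!(ok.executed, Profile::EnvelopeAdmitted);
    println!("admission ok");
}
```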

Link Boundary

link is a static composition boundary. Linked bundles define ownership boundaries as well as composition boundaries. Linked protocols remain session-disjoint unless composition explicitly shares state. Cross-boundary effect routing is explicit. Ad hoc shared mutable state across linked boundaries is forbidden. link must preserve Telltale coherence and harmony obligations at runtime, not just compile-time compatibility.

A boundary object must carry enough information to answer:

  • which bundle/fragment boundary this effect belongs to
  • which owner capability scope is valid at that boundary
  • whether a route crosses a boundary that requires explicit reconfiguration handling

Wrong-boundary routing is a runtime error and must be rejected before the VM observes the step.

Delegate Boundary

delegate is an ownership-transfer boundary. Endpoint/session ownership transfer is atomic. Capability/effect context transfers with the endpoint. Stale-owner access after delegation is forbidden. Ambiguous local ownership is rejected before the VM observes the transfer. Fragment ownership and session footprint state move with the transfer rather than lagging behind it.

Transfer and attenuation are separate concepts:

  • transfer changes the authoritative owner
  • attenuation narrows the capability scope that moves with the new owner

If the runtime cannot state which one is happening and under which protocol rule, it must reject the delegation path.

A successful delegation must move one owned bundle: session owner record, owner capability, protocol fragment ownership, runtime footprint / reconfiguration state, and delegation audit witness. If these do not move together, the transfer is incomplete and must be treated as a runtime error.

Theorem-pack / Invariant Alignment

The runtime must preserve coherence-sensitive session and edge state, harmony-sensitive reconfiguration steps, adequacy-relevant observable traces, determinism-profile obligations, and replay / communication identity stability across async ingress. Advanced runtime modes should be capability- and evidence-gated. Missing invariant evidence must cause rejection or fallback, never silent widening.

Fact Registry

The FactRegistry provides domain-specific fact type registration and reduction for reactive scheduling. It lives in aura-journal and is integrated via AuraEffectSystem::fact_registry(). Registered domains include Chat for message threading, Invitation for device invitations, Contact for relationship management, and Moderation for home and mute facts.

Reactive subscription policy is explicit:

  • subscribing to an unregistered signal fails fast with ReactiveError::SignalNotFound
  • there is no implicit wait-for-registration or dead-stream fallback
  • subscriber delivery is eventually consistent rather than lossless
  • if a subscriber lags behind the bounded broadcast buffer, intermediate values may be dropped and the subscriber resumes from a newer snapshot after a lag warning is logged

Production code obtains the registry via effect_system.fact_registry(). Tests may use build_fact_registry() for isolation. The registry assembly stays in Layer 6 rather than Layer 1.

See Effects and Handlers Guide for fact registration patterns.

Delivery Policy

The DeliveryPolicy trait enables domain crates to define custom acknowledgment retention and garbage collection behavior. This keeps policy decisions in the appropriate layer while the journal provides generic ack storage.

Domain crates implement the trait to control acknowledgment lifecycle. Key methods include min_retention and max_retention for time bounds, requires_ack to check tracking needs, and can_gc to determine GC eligibility.

let chat_policy = ChatDeliveryPolicy {
    min_retention: Duration::from_secs(30 * 24 * 60 * 60),
    max_retention: Duration::from_secs(90 * 24 * 60 * 60),
};
effect_system.register_delivery_policy("chat", Arc::new(chat_policy));

This example shows domain-specific retention configuration. Chat messages retain acks for 30 to 90 days. Different domains can have vastly different retention needs. The maintenance system uses registered policies during GC passes.
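A minimal sketch of the policy contract, assuming the method names from the text (min_retention, max_retention, requires_ack, can_gc); the signatures and the default GC rule are guesses, not the actual trait definition.

```rust
// Hypothetical sketch of a DeliveryPolicy-style trait. Method names follow the
// text; signatures and the default can_gc rule are assumptions.

use std::time::Duration;

trait DeliveryPolicy {
    fn min_retention(&self) -> Duration;
    fn max_retention(&self) -> Duration;
    fn requires_ack(&self, kind: &str) -> bool;
    // GC eligibility: acknowledged and past minimum retention, or past maximum.
    fn can_gc(&self, acked: bool, age: Duration) -> bool {
        (acked && age >= self.min_retention()) || age >= self.max_retention()
    }
}

struct ChatDeliveryPolicy {
    min_retention: Duration,
    max_retention: Duration,
}

impl DeliveryPolicy for ChatDeliveryPolicy {
    fn min_retention(&self) -> Duration {
        self.min_retention
    }
    fn max_retention(&self) -> Duration {
        self.max_retention
    }
    fn requires_ack(&self, _kind: &str) -> bool {
        true
    }
}

fn main() {
    let policy = ChatDeliveryPolicy {
        min_retention: Duration::from_secs(30 * 24 * 60 * 60),
        max_retention: Duration::from_secs(90 * 24 * 60 * 60),
    };
    let day = Duration::from_secs(24 * 60 * 60);
    assert!(!policy.can_gc(true, 10 * day)); // acked but inside min retention
    assert!(policy.can_gc(true, 31 * day)); // acked and past min retention
    assert!(policy.can_gc(false, 91 * day)); // past max retention regardless
    println!("policy ok");
}
```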

AppCore

The AppCore in aura-app provides a unified portable interface for all frontend platforms. It is headless and runtime-agnostic. It can run without a runtime bridge for offline or demo modes. It can be wired to a concrete runtime via the RuntimeBridge trait.

Architecture

AppCore sits between frontends and a runtime bridge.

flowchart TB
    subgraph Frontends
        A[TUI]
        B[CLI]
        C[iOS]
        D[Web]
    end
    A --> E[AppCore]
    B --> E
    C --> E
    D --> E
    E --> F[RuntimeBridge]
    F --> G[AuraAgent]

This diagram shows the AppCore architecture. Frontends import UI-facing types from aura-app. They may additionally depend on a runtime crate to obtain a concrete RuntimeBridge. This keeps aura-app portable while allowing multiple runtime backends.

Construction Modes

AppCore supports two construction modes: demo/offline mode (no runtime bridge, for development and testing) and production mode (wired to a concrete RuntimeBridge for full effect system capabilities).

See Hello World Guide for AppCore usage.

Reactive Flow

All state changes flow through the reactive pipeline. Services emit facts rather than directly mutating view state. UI subscribes to signals using signal.for_each(). This preserves push semantics and avoids polling.

Local Intent ───┐
                │
Service Result ─┼──► Fact ──► Journal ──► Reduce ──► ViewState
                │                                      │
Remote Sync ────┘                                      ↓
                                               Signal<T> ──► UI

This flow shows the push-based reactive model. Facts from any source flow through the journal. Reduction computes view state. Signals push updates to UI subscribers.

Runtime Access

When AppCore has a runtime, it provides access to runtime-backed operations through the RuntimeBridge trait. The runtime bridge exposes async capabilities while keeping aura-app decoupled from any specific runtime implementation. Frontends import app-facing types from aura-app and runtime types from aura-agent directly.

Journal

This document describes the journal architecture and state reduction system in Aura. It explains how journals implement CRDT semantics, how facts are structured, and how reduction produces deterministic state for account authorities and relational contexts.

It describes the reduction pipeline, flow budget semantics, and integration with the effect system. It defines the invariants that ensure correctness. See Maintenance for the end-to-end snapshot and garbage collection pipeline.

Hybrid Journal Model (Facts + Capabilities)

Aura’s journal state is a composite of:

  • Fact Journal: the canonical, namespaced CRDT of immutable facts.
  • Capability Frontier: the current capability lattice for the namespace.
  • Composite journal view: the runtime carries facts and capabilities together when evaluating effects.

The fact journal is stored and merged as a semilattice. Capabilities are refined via meet. The runtime always treats these as orthogonal dimensions of state.

1. Journal Namespaces

Aura maintains a separate journal namespace for each authority and each relational context. A journal namespace stores all facts relevant to the entity it represents. A namespace is identified by an AuthorityId (see Authority and Identity) or a ContextId, and no namespace shares state with another. Identifier definitions appear in Identifiers and Boundaries.

A journal namespace evolves through fact insertion. Facts accumulate monotonically. No fact is removed except through garbage collection rules that preserve logical meaning.

pub struct Journal {
    pub namespace: JournalNamespace,
    pub facts: BTreeSet<Fact>,
}

pub enum JournalNamespace {
    Authority(AuthorityId),
    Context(ContextId),
}

This type defines a journal as a namespaced set of facts. The namespace identifies whether this journal tracks an authority's commitment tree or a relational context. The journal is a join semilattice under set union where merging two journals produces a new journal containing all facts from both inputs. Journals with different namespaces cannot be merged.

2. Fact Model

Facts represent immutable events or operations that contribute to the state of a namespace. Facts have ordering tokens, timestamps, and content. Facts contain no device identifiers; correctness never depends on which device produced a fact.

pub struct Fact {
    pub order: OrderTime,
    pub timestamp: TimeStamp,
    pub content: FactContent,
}

pub enum FactContent {
    AttestedOp(AttestedOp),
    Relational(RelationalFact),
    Snapshot(SnapshotFact),
    RendezvousReceipt {
        envelope_id: [u8; 32],
        authority_id: AuthorityId,
        timestamp: TimeStamp,
        signature: Vec<u8>,
    },
}

The order field provides an opaque, privacy-preserving total order for deterministic fact ordering in the BTreeSet. The timestamp field provides semantic time information for application logic. Facts implement Ord via the OrderTime field. Do not use TimeStamp for cross-domain indexing or total ordering. Use OrderTime or consensus/session sequencing instead.

This model supports account operations, relational context operations, snapshots, and rendezvous receipts. Each fact is self contained. Facts are validated before insertion into a namespace.

2.1 Protocol-Level vs Domain-Level Relational Facts

RelationalFact has only two variants:

  • Protocol(ProtocolRelationalFact): Protocol-level facts that must live in aura-journal because reduction semantics depend on them.
  • Generic { .. }: Extensibility hook for domain facts (DomainFact + FactReducer).

Criteria for ProtocolRelationalFact (all must hold):

  1. Reduction-coupled: the fact directly affects core reduction invariants in reduce_context() (not just a view).
  2. Cross-domain: the fact’s semantics are shared across multiple protocols or layers.
  3. Non-derivable: the state cannot be reconstructed purely via FactReducer + RelationalFact::Generic.

If a fact does not meet all three criteria, it must be implemented as a domain fact and stored via RelationalFact::Generic.

Enforcement:

  • All protocol facts are defined in crates/aura-journal/src/protocol_facts.rs.
  • Any new protocol fact requires a doc update in this section and a matching reduction rule.

2.2 Domain Fact Contract (Checklist + Lint)

Domain facts are the extensibility mechanism for Layer 2 crates. Every domain fact must follow this contract to ensure cross-replica determinism and schema stability:

  • Type ID: define a *_FACT_TYPE_ID constant (unique, registered in crates/aura-agent/src/fact_types.rs).
  • Schema version: specify a schema version (via #[domain_fact(schema_version = N)] or *_FACT_SCHEMA_VERSION).
  • Canonical encoding: use #[derive(DomainFact)] or explicit encode_domain_fact / VersionedMessage helpers.
  • Context derivation: declare context / context_fn for DomainFact or implement a stable context_id() derivation.
  • Reducer registration: provide a FactReducer and register it in the central registry (crates/aura-agent/src/fact_registry.rs).

3. Semilattice Structure

Journals use a join semilattice. The semilattice uses set union as the join operator with partial order defined by subset inclusion. The journal never removes facts during merge. Every merge operation increases or preserves the fact set.

The join semilattice ensures convergence across replicas. Any two replicas that exchange facts eventually converge to identical fact sets. All replicas reduce the same fact set to the same state.

The merge operation asserts namespace equality, unions the fact sets, and returns the combined journal. The result is monotonic and convergent.
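The merge operation can be sketched with simplified types (a string namespace and string facts stand in for JournalNamespace and Fact); the semilattice laws then hold by construction of set union.

```rust
// Join-semilattice merge sketch: set union over facts, same namespace required.
// Simplified stand-in types; the real Journal uses BTreeSet<Fact>.

use std::collections::BTreeSet;

#[derive(Debug, Clone, PartialEq)]
struct Journal {
    namespace: String,             // stand-in for JournalNamespace
    facts: BTreeSet<&'static str>, // stand-in for Fact
}

fn merge(a: &Journal, b: &Journal) -> Journal {
    // Journals with different namespaces cannot be merged.
    assert_eq!(a.namespace, b.namespace, "cross-namespace merge is forbidden");
    Journal {
        namespace: a.namespace.clone(),
        facts: a.facts.union(&b.facts).cloned().collect(),
    }
}

fn main() {
    let a = Journal { namespace: "ctx".into(), facts: ["f1", "f2"].into() };
    let b = Journal { namespace: "ctx".into(), facts: ["f2", "f3"].into() };
    let ab = merge(&a, &b);
    // Semilattice laws: commutative, idempotent, monotone.
    assert_eq!(ab, merge(&b, &a));
    assert_eq!(merge(&ab, &a), ab);
    assert_eq!(ab.facts.len(), 3);
    println!("merge ok");
}
```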

4. Reduction Pipeline

Aura maintains two replicated state machines. Account journals describe commitment trees for authorities. Relational context journals describe cross-authority coordination. Both use the same fact-only semilattice and deterministic reducers.

flowchart TD
    A[Ledger append] --> B[Journal merge];
    B --> C[Group by parent];
    C --> D[Resolve conflicts via max hash];
    D --> E[Apply operations in topological order];
    E --> F[Recompute commitments bottom-up];

Ledger append writes facts durably. Journal merge unions the fact set. Reducers group operations by parent commitment, resolve conflicts deterministically using max hash tie-breaking, and then apply winners in topological order. The final step recomputes commitments bottom-up which downstream components treat as the canonical state.

4.1 Fact Production

Account operations originate from local threshold signing or Aura Consensus. Relational context operations always run through Aura Consensus because multiple authorities must agree on the prestate. Each successful operation produces an AttestedOp fact. Receipts that must be retained for accountability are stored as RendezvousReceipt facts scoped to the context that emitted them. Implementations must only emit facts after verifying signatures and parent commitments.

4.2 Determinism Invariants

The reduction pipeline maintains strict determinism:

  1. No HashMap iteration: All maps use BTreeMap for consistent ordering
  2. No system time: OrderTime tokens provide opaque ordering
  3. No floating point: All arithmetic uses exact integer/fixed-point
  4. Pure functions only: Reducers have no side effects

These properties are verified by test_reduction_determinism() which confirms all fact permutations produce identical state.
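The permutation check can be sketched with a toy reducer: insert the same facts in several arrival orders and require identical reduced output. The actual test_reduction_determinism() operates on real facts; this is only the shape of the argument.

```rust
// Determinism check in the spirit of test_reduction_determinism(): reduce the
// same fact set inserted in different orders and require identical output.

use std::collections::BTreeSet;

// Toy facts: (order_token, value). Reduction reads values in token order.
fn reduce(facts: &BTreeSet<(u64, u64)>) -> Vec<u64> {
    // BTreeSet iteration is ordered, so output ignores insertion order.
    facts.iter().map(|&(_, v)| v).collect()
}

fn main() {
    let facts = [(3, 30), (1, 10), (2, 20)];
    // Insert the same facts in several different arrival orders.
    let orders = [[0, 1, 2], [2, 1, 0], [1, 0, 2]];
    let mut results = Vec::new();
    for order in orders {
        let mut set = BTreeSet::new();
        for &i in &order {
            set.insert(facts[i]);
        }
        results.push(reduce(&set));
    }
    assert!(results.iter().all(|r| r == &results[0]));
    assert_eq!(results[0], vec![10, 20, 30]);
    println!("deterministic ok");
}
```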

5. Account Journal Reduction

Account journals store attested operations for commitment tree updates. Reduction computes a TreeStateSummary (epoch, commitment, threshold, device count) from the fact set. See Authority and Identity for structure details. The summary is a lightweight public view that hides internal device structure. For the full internal representation with branches, leaves, and topology, see TreeState in aura-journal::commitment_tree.

Internally, AuthorityTreeState materializes explicit branch topology to support incremental commitment updates. It keeps ordered branch children, branch/leaf parent pointers, and branch depth metadata. The topology is deterministic: active leaves are sorted by LeafId, paired in stable order, and materialized with NodeIndex(0) as root followed by breadth-first branch assignment. This guarantees identical branch structure for identical active leaf sets across replicas.

Commitment recomputation for non-structural mutations is path-local. Aura marks affected branches dirty, collects their paths to root, then recomputes only those branches bottom-up. Structural mutations (AddLeaf, RemoveLeaf) currently use deterministic rebuild of topology and branch commitments (correctness-first baseline). Merkle proof paths are derived from the same materialized topology so proof generation and commitment caches share one source of truth.

Maintenance note: do not change topology construction or branch indexing rules without updating replay fixtures and root commitment fixtures. Determinism depends on stable leaf sort order, stable pair composition, and stable branch index assignment.

Reduction follows these steps:

  1. Identify all AttestedOp facts.
  2. Group operations by their referenced parent state (epoch + commitment).
  3. For concurrent operations (same parent), select winner using max hash tie-breaking: H(op) comparison.
  4. Apply winners in topological order respecting parent dependencies.
  5. Recompute commitments bottom-up after each operation.

The max hash conflict resolution (max_by_key(hash_op)) ensures determinism by selecting a single winner from concurrent operations. The result is a single TreeStateSummary for the account derived by extracting AttestedOp facts and applying them in deterministic order.
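The winner selection can be sketched directly: among a group of concurrent operations, max_by_key over the operation hash yields one deterministic winner regardless of input order. A std hasher stands in for the real H(op).

```rust
// Conflict resolution sketch: pick the winner among concurrent operations by
// max hash. DefaultHasher stands in for the real H(op) digest.

use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

fn hash_op(op: &str) -> u64 {
    let mut h = DefaultHasher::new();
    op.hash(&mut h);
    h.finish()
}

// max_by_key over the hash gives one deterministic winner per parent group.
fn select_winner<'a>(concurrent: &[&'a str]) -> Option<&'a str> {
    concurrent.iter().copied().max_by_key(|op| hash_op(op))
}

fn main() {
    let group = ["add-device-A", "add-device-B", "rotate-key-C"];
    let w1 = select_winner(&group).unwrap();
    // Same winner regardless of the order replicas observed the operations.
    let reversed = ["rotate-key-C", "add-device-B", "add-device-A"];
    assert_eq!(select_winner(&reversed).unwrap(), w1);
    assert!(group.contains(&w1));
    println!("winner ok");
}
```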

6. RelationalContext Journal Reduction

Relational contexts store relational facts. These facts reference authority commitments. Reduction produces a RelationalState that captures the current relationship between authorities.

pub struct RelationalState {
    pub bindings: Vec<RelationalBinding>,
    pub flow_budgets: BTreeMap<(AuthorityId, AuthorityId, u64), u64>,
    pub leakage_budget: LeakageBudget,
    pub channel_epochs: BTreeMap<ChannelId, ChannelEpochState>,
}

pub struct RelationalBinding {
    pub binding_type: RelationalBindingType,
    pub context_id: ContextId,
    pub data: Vec<u8>,
}

This structure represents the reduced relational state. It contains relational bindings, flow budget tracking between authorities, leakage budget totals for privacy accounting, and AMP channel epoch state for message ratcheting.

Reduction processes the following protocol fact types wrapped in Protocol(...):

  1. GuardianBinding maps to RelationalBinding for guardian relationships
  2. RecoveryGrant creates recovery permission bindings between authorities
  3. Consensus stores generic bindings with consensus metadata
  4. AmpChannelCheckpoint anchors channel state snapshots for AMP messaging
  5. AmpProposedChannelEpochBump and AmpCommittedChannelEpochBump track channel epoch transitions
  6. AmpChannelPolicy defines channel-specific policy overrides for skip windows
  7. DkgTranscriptCommit stores consensus-finalized DKG transcripts
  8. ConvergenceCert records soft-safe convergence certificates
  9. ReversionFact tracks explicit reversion events
  10. RotateFact marks lifecycle rotation or upgrade events

Domain-specific facts use Generic { context_id, binding_type, binding_data } and are reduced by registered FactReducer implementations.

Reduction verifies that relational facts reference valid authority commitments and applies them in dependency order.

6.1 AMP Channel Epoch Transition Reduction

AMP channel epoch transition reduction is fact-only and deterministic. It refines the existing A1/A2/A3 operation model for channel epoch state without changing authority-root membership or account commitment tree rules.

Every AMP transition proposal, certificate, commit, abort, conflict, or supersession fact binds the same canonical transition_id when it refers to the same transition. The identity is the typed digest over:

  • context_id
  • channel_id
  • parent_epoch
  • parent_commitment
  • successor_epoch
  • successor_commitment
  • membership_commitment
  • transition_policy

Reducers group transition facts by (context_id, channel_id, parent_epoch, parent_commitment) and validate all facts in the group before exposing live state. Reducer validation covers parent binding, successor epoch validity, membership commitment validity, witness committee validity, A2 certificate thresholds, A3 commit evidence, and abort/conflict/supersession evidence.

For each parent group, reduction exposes one of:

State        Live successor exposed?   Meaning
Observed     No                        One or more syntactically valid proposals exist without a live certificate.
A2Live       Yes                       Exactly one valid unsuppressed AMP-certified successor exists.
A2Conflict   No                        Multiple conflicting valid certificates or unresolved equivocation evidence exist.
A3Finalized  Yes                       Consensus-backed commit finalizes one successor.
A3Conflict   No                        Multiple conflicting durable commits or unresolved durable-commit evidence exist.
Aborted      No                        Explicit abort evidence suppresses live use.
Superseded   Depends on successor      Authorized supersession replaces the earlier transition path.

If exactly one valid unsuppressed A3 commit exists, that transition is both the durable and live successor. Otherwise, if exactly one valid unsuppressed A2 certificate exists, that transition is the live successor but remains non-durable. If conflicting valid A2 certificates exist and facts do not resolve the conflict, reduction exposes no live successor and marks the parent state conflict-tainted.

Reducers must not use local wall-clock time, network connectivity, operator preference, arrival order, or hash tie-breaking between conflicting valid A2 certificates to choose a live successor. Single-winner exposure must come from fact validation: a valid AMP certificate, a valid A3 commit, explicit equivocation evidence, abort evidence, or supersession evidence.

The aura-journal context reducer exposes this state as structured AMP transition reduction data. RelationalState::amp_transitions is keyed by AmpTransitionParentKey { context, channel, parent_epoch, parent_commitment }. Each AmpTransitionReduction records observed, certified, finalized, suppressed, conflict, and emergency evidence ids for that parent. ChannelEpochState::transition points at the current parent transition, and ChannelEpochState::pending_bump is populated only when the current parent is A2Live. Emergency reducer fields include suspect authorities, quarantine successor epochs, and prune-before epochs derived from emergency alarms and emergency transition facts.

The reduced channel view distinguishes:

  • stable_epoch: last durable committed epoch
  • live_successor: optional A2-certified successor for the stable epoch
  • durable_successor: optional A3-finalized successor
  • transition_state: one of the states above
  • conflict_evidence: optional witness or transition evidence when conflict-tainted
  • emergency policy fields such as suspect authority, quarantine epoch, and prune-before epoch when an emergency transition is active

The AMP message ratchet consumes this reducer output. It does not author membership truth, choose between transitions, or repair missing control-plane facts.

7. Flow Budgets

Flow budgets track message sending allowances between authorities. The FlowBudget structure uses semilattice semantics for distributed convergence:

#![allow(unused)]
fn main() {
pub struct FlowBudget {
    pub limit: u64,
    pub spent: u64,
    pub epoch: Epoch,
}

impl FlowBudget {
    pub fn merge(&self, other: &Self) -> Self {
        Self {
            limit: self.limit.min(other.limit),
            spent: self.spent.max(other.spent),
            epoch: if self.epoch.value() >= other.epoch.value() {
                self.epoch
            } else {
                other.epoch
            },
        }
    }
}
}

The spent field uses join-semilattice (max) because charges only increase. The limit field uses meet-semilattice (min) because the most restrictive limit wins. The epoch field advances monotonically. Spent resets on epoch rotation.
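
A standalone sketch (with Epoch simplified to a plain u64) shows how these per-field choices make merge commutative and idempotent, which is what semilattice convergence requires:

```rust
// Simplified FlowBudget merge; Epoch is reduced to u64 for illustration.
#[derive(Clone, Copy, Debug, PartialEq)]
struct FlowBudget {
    limit: u64,
    spent: u64,
    epoch: u64,
}

impl FlowBudget {
    fn merge(&self, other: &Self) -> Self {
        Self {
            limit: self.limit.min(other.limit), // meet: most restrictive limit wins
            spent: self.spent.max(other.spent), // join: charges only increase
            epoch: self.epoch.max(other.epoch), // monotone epoch advance
        }
    }
}

fn main() {
    let a = FlowBudget { limit: 100, spent: 7, epoch: 3 };
    let b = FlowBudget { limit: 80, spent: 5, epoch: 3 };
    let m = a.merge(&b);
    assert_eq!((m.limit, m.spent), (80, 7)); // min limit, max spent
    // Semilattice laws: commutativity and idempotence.
    assert_eq!(a.merge(&b), b.merge(&a));
    assert_eq!(m.merge(&m), m);
}
```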

Flow budget tracking operates at the runtime layer via FlowBudgetManager. The RelationalState includes a flow_budgets map for CRDT-based replication of budget state across replicas.

8. Receipts and Accountability

Receipts reference the current epoch commitment so reducers can reject stale receipts automatically. The RendezvousReceipt variant in FactContent stores accountability proofs (envelope ID, authority, timestamp, signature). Receipts are stored as relational facts scoped to the emitting context. This coupling ensures that receipt validity follows commitment tree epochs.

9. Snapshots and Garbage Collection

Snapshots summarize all prior facts. A snapshot fact contains a state hash, the list of superseded facts, and a sequence number. A snapshot establishes a high water mark. Facts older than the snapshot can be pruned.

#![allow(unused)]
fn main() {
pub struct SnapshotFact {
    pub state_hash: Hash32,
    pub superseded_facts: Vec<OrderTime>,
    pub sequence: u64,
}
}

Garbage collection removes pruned facts while preserving logical meaning. Pruning does not change the result of reduction. The GC algorithm uses safety margins to prevent premature pruning:

  • Default skip window: 1024 generations
  • Safety margin: skip_window / 2
  • Pruning boundary: max_generation - (2 * skip_window) - safety_margin

Helper functions (compute_checkpoint_pruning_boundary, can_prune_checkpoint, can_prune_proposed_bump) determine what can be safely pruned based on generation boundaries.
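
The boundary arithmetic can be illustrated with a short sketch. The function names mirror the helpers above, but the exact signatures and the saturating behavior near genesis are assumptions:

```rust
// Hypothetical sketch of the GC boundary arithmetic described above.
const SKIP_WINDOW: u64 = 1024; // default skip window

fn safety_margin(skip_window: u64) -> u64 {
    skip_window / 2
}

/// Generations strictly below this boundary are eligible for pruning.
fn compute_checkpoint_pruning_boundary(max_generation: u64) -> u64 {
    let margin = safety_margin(SKIP_WINDOW);
    // max_generation - (2 * skip_window) - safety_margin, clamped at zero.
    max_generation.saturating_sub(2 * SKIP_WINDOW + margin)
}

fn can_prune_checkpoint(generation: u64, max_generation: u64) -> bool {
    generation < compute_checkpoint_pruning_boundary(max_generation)
}

fn main() {
    // With max_generation = 10_000: 10_000 - 2048 - 512 = 7440.
    assert_eq!(compute_checkpoint_pruning_boundary(10_000), 7_440);
    assert!(can_prune_checkpoint(7_000, 10_000));
    assert!(!can_prune_checkpoint(8_000, 10_000));
    // Near genesis, nothing can be pruned.
    assert_eq!(compute_checkpoint_pruning_boundary(100), 0);
}
```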

10. Journal Effects Integration

The effect system provides journal operations through JournalEffects. This trait handles persistence, merging, and flow budget tracking:

#![allow(unused)]
fn main() {
#[async_trait]
pub trait JournalEffects: Send + Sync {
    async fn merge_facts(&self, target: Journal, delta: Journal) -> Result<Journal, AuraError>;
    async fn refine_caps(&self, target: Journal, refinement: Journal) -> Result<Journal, AuraError>;
    async fn get_journal(&self) -> Result<Journal, AuraError>;
    async fn persist_journal(&self, journal: &Journal) -> Result<(), AuraError>;
    async fn get_flow_budget(&self, context: &ContextId, peer: &AuthorityId) -> Result<FlowBudget, AuraError>;
    async fn update_flow_budget(&self, context: &ContextId, peer: &AuthorityId, budget: &FlowBudget) -> Result<FlowBudget, AuraError>;
    async fn charge_flow_budget(&self, context: &ContextId, peer: &AuthorityId, cost: FlowCost) -> Result<FlowBudget, AuraError>;
}
}

The effect layer writes facts to persistent storage. Replica synchronization loads facts through effect handlers into journal memory. The effect layer guarantees durability but does not affect CRDT merge semantics.

At the choreography runtime boundary, journal coupling is always driven by guard/effect commands. Generated choreography annotations and runtime guard checks emit EffectCommand values, and JournalCoupler ensures journal commits occur before transport observables. In VM execution, AuraVmEffectHandler emits envelopes, but journal writes still flow through the same EffectInterpreter + JournalEffects path.

11. AttestedOp Structure

AttestedOp exists in two layers with different levels of detail:

Layer 1 (aura-core) - Full operation metadata:

#![allow(unused)]
fn main() {
pub struct AttestedOp {
    pub op: TreeOp,
    pub agg_sig: Vec<u8>,
    pub signer_count: u16,
}

pub struct TreeOp {
    pub parent_epoch: Epoch,
    pub parent_commitment: TreeHash32,
    pub op: TreeOpKind,
    pub version: u16,
}
}

Layer 2 (aura-journal) - Flattened for journal storage:

#![allow(unused)]
fn main() {
pub struct AttestedOp {
    pub tree_op: TreeOpKind,
    pub parent_commitment: Hash32,
    pub new_commitment: Hash32,
    pub witness_threshold: u16,
    pub signature: Vec<u8>,
}
}

The aura-core version includes epoch and version for full verification. The aura-journal version includes computed commitments for efficient reduction.

12. Invariants

The journal and reduction architecture satisfy several invariants:

  • Convergence: All replicas reach the same state when they have the same facts
  • Idempotence: Repeated merges or reductions do not change state
  • Determinism: Reduction produces identical output for identical input across all replicas
  • No HashMap iteration: Uses BTreeMap for deterministic ordering
  • No system time: Uses OrderTime tokens for ordering
  • No floating point: All arithmetic is exact

These invariants guarantee correct distributed behavior. They also support offline operation with eventual consistency. They form the foundation for Aura's account and relational context state machines.
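
The BTreeMap invariant, for example, is what makes reduction independent of insertion order; a minimal illustration:

```rust
use std::collections::BTreeMap;

// BTreeMap iterates in key order, so a reduction over it produces identical
// output regardless of the order in which facts were inserted. HashMap
// iteration order would make this nondeterministic across replicas.
fn reduce(facts: &BTreeMap<String, u64>) -> String {
    facts.iter().map(|(k, v)| format!("{k}={v};")).collect()
}

fn main() {
    let mut a = BTreeMap::new();
    a.insert("b".to_string(), 2);
    a.insert("a".to_string(), 1);

    let mut b = BTreeMap::new();
    b.insert("a".to_string(), 1);
    b.insert("b".to_string(), 2);

    // Different insertion orders, identical reduction output.
    assert_eq!(reduce(&a), reduce(&b));
    assert_eq!(reduce(&a), "a=1;b=2;");
}
```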

13. Fact Validation Pipeline

Every fact inserted into a journal must be validated before merge. The following steps outline the required checks and the effect traits responsible for each fact type:

13.1 AttestedOp Facts

Checks

  • Verify the threshold signature (agg_sig) using the two-phase verification model from aura-core::tree::verification:
    • verify_attested_op(): Cryptographic signature check against BranchSigningKey stored in TreeState
    • check_attested_op(): Full verification plus state consistency (epoch, parent commitment)
  • Ensure the referenced parent state exists locally. Otherwise request missing facts.
  • Confirm the operation is well-formed (e.g., AddLeaf indexes a valid parent node).

See Tree Operation Verification for details on the verify/check model and binding message security.

Responsible Effects

  • CryptoEffects for FROST signature verification via verify_attested_op().
  • JournalEffects for parent lookup, state consistency via check_attested_op(), and conflict detection.
  • StorageEffects to persist the fact once validated.

13.2 Relational Facts

Checks

  • Validate that each authority commitment referenced in the fact matches the current reduced state (AuthorityState::root_commitment).
  • Verify Aura Consensus proofs if present (guardian bindings, recovery grants).
  • Enforce application-specific invariants (e.g., no duplicate guardian bindings).

Responsible Effects

  • AuthorizationEffects / RelationalEffects for context membership checks.
  • CryptoEffects for consensus proof verification.
  • JournalEffects for context-specific merge.

13.3 FlowBudget Facts

Checks

  • Ensure spent deltas are non-negative and reference the active epoch for the (ContextId, peer) pair.
  • Reject facts that would decrease the recorded spent (monotone requirement).
  • Validate receipt signatures associated with the charge (see 111_transport_and_information_flow.md).

Responsible Effects

  • FlowBudgetEffects (or FlowGuard) produce the fact and enforce monotonicity before inserting.
  • JournalEffects gate insertion to prevent stale epochs from updating headroom.
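
The monotone-spend and active-epoch checks can be sketched as follows. Types are simplified to plain integers; the real epoch and budget types live in aura-core:

```rust
// Hedged sketch of FlowBudget fact validation: reject stale epochs and any
// fact that would decrease the recorded spend within the same epoch.
#[derive(Clone, Copy)]
struct BudgetFact {
    epoch: u64,
    spent: u64,
}

#[derive(Debug, PartialEq)]
enum ValidationError {
    StaleEpoch,
    SpentDecreased,
}

fn validate_budget_fact(
    current: BudgetFact,
    incoming: BudgetFact,
    active_epoch: u64,
) -> Result<(), ValidationError> {
    if incoming.epoch != active_epoch {
        return Err(ValidationError::StaleEpoch);
    }
    if incoming.epoch == current.epoch && incoming.spent < current.spent {
        return Err(ValidationError::SpentDecreased);
    }
    Ok(())
}

fn main() {
    let cur = BudgetFact { epoch: 5, spent: 40 };
    assert!(validate_budget_fact(cur, BudgetFact { epoch: 5, spent: 41 }, 5).is_ok());
    assert_eq!(
        validate_budget_fact(cur, BudgetFact { epoch: 5, spent: 39 }, 5),
        Err(ValidationError::SpentDecreased)
    );
    assert_eq!(
        validate_budget_fact(cur, BudgetFact { epoch: 4, spent: 50 }, 5),
        Err(ValidationError::StaleEpoch)
    );
}
```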

13.4 Snapshot Facts

Checks

  • Confirm the snapshot state_hash matches the hash of all facts below the snapshot.
  • Ensure no newer snapshot already exists for the namespace (check sequence number).
  • Verify that pruning according to the snapshot does not remove facts still referenced by receipts or pending consensus operations.

Responsible Effects

  • JournalEffects compute and validate snapshot digests.
  • StorageEffects persist the snapshot atomically with pruning metadata.

By clearly separating validation responsibilities, runtime authors know which effect handlers must participate before a fact mutation is committed. This structure keeps fact semantics consistent across authorities and contexts.

14. Consistency Metadata Schema

Facts carry consistency metadata for tracking agreement level, propagation status, and acknowledgments. See Operation Categories for the full consistency metadata type definitions.

14.1 Fact Schema Fields

The base Fact structure (section 2) is extended with consistency metadata fields (added via serde defaults for backwards compatibility):

Field         Type          Purpose
agreement     Agreement     Finalization level (A1 Provisional, A2 SoftSafe, A3 Finalized)
propagation   Propagation   Anti-entropy sync status (Local, Syncing, Complete, Failed)
ack_tracked   bool          Opt-in flag for per-peer acknowledgment tracking

14.2 Ack Storage Table

For facts with ack_tracked = true, acknowledgments are stored in a separate table:

fact_id   peer_id              acked_at
msg-001   alice_authority_id   2024-01-15T10:30:00Z
msg-001   bob_authority_id     2024-01-15T10:30:05Z

14.3 Journal API for Consistency

The Journal API provides methods for acknowledgment tracking: record_ack records an acknowledgment from a peer, get_acks retrieves all acknowledgments for a fact, and gc_ack_tracking garbage collects acknowledgment data based on policy.
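
A minimal in-memory sketch of this API shape, assuming a simple "all expected peers have acked" GC policy (the real policy hook is richer, and the real methods live on the Journal type):

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical sketch of the ack-tracking table and the three methods
// named above: record_ack, get_acks, gc_ack_tracking.
#[derive(Default)]
struct AckTable {
    // fact_id -> set of peer ids that acknowledged it (BTree for determinism)
    acks: BTreeMap<String, BTreeSet<String>>,
}

impl AckTable {
    fn record_ack(&mut self, fact_id: &str, peer_id: &str) {
        self.acks
            .entry(fact_id.to_string())
            .or_default()
            .insert(peer_id.to_string());
    }

    fn get_acks(&self, fact_id: &str) -> Vec<String> {
        self.acks
            .get(fact_id)
            .map(|s| s.iter().cloned().collect())
            .unwrap_or_default()
    }

    /// GC policy sketch: drop tracking once every expected peer has acked.
    fn gc_ack_tracking(&mut self, expected_peers: usize) {
        self.acks.retain(|_, peers| peers.len() < expected_peers);
    }
}

fn main() {
    let mut table = AckTable::default();
    table.record_ack("msg-001", "alice");
    table.record_ack("msg-001", "bob");
    assert_eq!(table.get_acks("msg-001"), vec!["alice".to_string(), "bob".to_string()]);
    table.gc_ack_tracking(2); // both expected peers acked, tracking dropped
    assert!(table.get_acks("msg-001").is_empty());
}
```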

See Also

Authorization

Overview

Aura authorizes every observable action through Biscuit capability evaluation combined with sovereign policy and flow budgets. The authorization pipeline spans AuthorizationEffects, the guard chain, and receipt accounting. This document describes the data flow and integration points.

Canonical Capability Vocabulary

Aura uses one canonical capability vocabulary based on validated CapabilityName values.

  • First-party capabilities are declared in the crate that owns the behavior, using typed capability families generated from #[capability_family(...)].
  • Token issuance uses explicit grant profiles or explicit validated CapabilityName sets at the issuance boundary. There is no implicit "grant every declared capability" path.
  • Guard snapshots carry evaluated frontiers only. They do not carry declared families, candidate sets, or fallback broad grants.
  • Raw capability strings are admitted only at explicit parsing boundaries such as Biscuit decoding and choreography DSL parsing, where invalid values fail closed.

Out-of-tree modules follow the same shape, but their capability families must be declared in admitted module manifests rather than handwritten in host runtime code.

Biscuit Capability Model

Biscuit tokens encode attenuation chains. Each attenuation step applies additional caveats that shrink authority through meet composition. Aura stores Biscuit material outside the replicated CRDT. Local runtimes evaluate tokens at send time against typed candidate sets supplied by the owning domain and cache the resulting lattice element for the active ContextId.

Cached entries expire on epoch change or when policy revokes a capability. Policy data always participates in the meet. A token can only reduce authority relative to local policy.

flowchart LR
    A[frontier, token] -->|verify signature| B[parsed token]
    B -->|apply caveats| C[frontier ∩ caveats]
    C -->|apply policy| D[result ∩ policy]
    D -->|return| E[Cap element]

This algorithm produces a meet-monotone capability frontier. Step 1 ensures provenance. Steps 2 and 3 ensure evaluation never widens authority. Step 4 feeds the guard chain with a cached outcome.
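
The meet steps can be illustrated with plain capability-name sets. This is an illustration of meet-monotonicity only, not the Cap lattice implementation, and the capability strings are made up:

```rust
use std::collections::BTreeSet;

type Caps = BTreeSet<String>;

fn caps(items: &[&str]) -> Caps {
    items.iter().map(|s| s.to_string()).collect()
}

/// Meet composition modeled as set intersection over capability names:
/// each step can only remove authority, never add it.
fn meet(a: &Caps, b: &Caps) -> Caps {
    a.intersection(b).cloned().collect()
}

fn main() {
    let frontier = caps(&["msg:send", "msg:read", "storage:write"]);
    let token_caveats = caps(&["msg:send", "msg:read"]);
    let local_policy = caps(&["msg:send", "storage:write"]);

    let after_caveats = meet(&frontier, &token_caveats); // apply caveats
    let result = meet(&after_caveats, &local_policy);    // apply policy

    // Evaluation never widens authority relative to the starting frontier.
    assert!(result.is_subset(&frontier));
    assert_eq!(result, caps(&["msg:send"]));
}
```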

Guard Chain

Authorization evaluation feeds the transport guard chain. All documents reference this section to avoid divergence.

flowchart LR
    A[Send request] --> B[CapGuard]
    B --> C[FlowGuard]
    C --> D[JournalCoupler]
    D --> E[Transport send]

This diagram shows the guard chain sequence. CapGuard performs Biscuit evaluation. FlowGuard charges the budget. JournalCoupler commits facts before transport.

Guard evaluation is pure and synchronous over a prepared GuardSnapshot. CapGuard reads an evaluated frontier and any inline Biscuit token already present in the snapshot. Snapshot builders may begin with a typed candidate set, but they must evaluate that set against the Biscuit/policy frontier before publishing capabilities into the snapshot. FlowGuard and JournalCoupler emit EffectCommand items rather than executing I/O directly. An async interpreter executes those commands in production or simulation.

Only after all guards pass does transport emit a packet. Any failure returns locally and leaves no observable side effect. DKG payloads require proportional budget charges before any transport send.

Telltale Integration

Aura uses Telltale runtime admission and VM guard checkpoints. Runtime admission gates whether a runtime profile may execute. VM acquire and release guards gate per-session resource leases inside VM execution. The Aura guard chain remains the authoritative policy and accounting path for application sends.

Failure handling is layered. Admission failure rejects engine startup. VM acquire deny blocks the guarded VM action. Aura guard-chain failure denies transport and returns deterministic effect errors.

Runtime Capability Admission

Aura uses a dedicated admission surface for theorem-pack and runtime capability checks before choreography execution. RuntimeCapabilityEffects in aura-core defines capability inventory queries and admission checks. RuntimeCapabilityHandler in aura-effects stores a boot-time immutable capability snapshot. The aura-protocol::admission module declares protocol requirements and maps them to capability keys.

Current protocol capability keys include byzantine_envelope for consensus ceremony admission, termination_bounded for sync epoch-rotation admission, reconfiguration for dynamic topology transfer paths, and mixed_determinism for cross-target mixed lanes.

Execution order is runtime capability admission first, then VM profile gates, then the Aura guard chain. Admission diagnostics must respect Aura privacy constraints. Production runtime paths must not emit plaintext capability inventory events. Admission failures use redacted capability references.

Failure Handling and Caching

Runtimes cache evaluated capability frontiers per context and predicate with an epoch tag. Cache entries invalidate when journal policy facts change or when the epoch rotates.

CapGuard failures return AuthorizationError::Denied without charging flow or touching the journal. FlowGuard failures return FlowError::InsufficientBudget without emitting transport traffic. JournalCoupler failures surface as JournalError::CommitAborted and instruct the protocol to retry after reconciling journal state.

This isolation keeps the guard chain deterministic and side-channel free.

Biscuit Token Workflow

Biscuit tokens guarantee cryptographically verifiable, attenuated delegation chains. Each token carries a signature chain that prevents forgery and supports offline verification without contacting the issuer. Attenuation is monotone: each delegation step can only reduce authority, never widen it. Epoch rotation provides revocation by invalidating old tokens. Aura does not maintain a separate per-token revocation list in aura-authorization; revocation is authority-wide and anchored to the currently trusted root key for that authority.

Issuance is explicit. The runtime selects a reviewed token grant profile and materializes a concrete Vec<CapabilityName> at the issuance boundary. That issuance profile is separate from the evaluated frontier that later appears in guard snapshots. The two must not be conflated: profiles declare what may be granted, while snapshots publish what is currently admitted after Biscuit and policy evaluation.

See Effects and Handlers Guide for Biscuit workflow implementation.

Guard Chain Integration

Biscuit authorization integrates with the guard chain through three phases: cryptographic verification, synchronous guard evaluation over a prepared GuardSnapshot, and effect command interpretation. If any phase fails, the operation returns an error without observable side effects.

Guard operation identifiers are typed guard inputs, not ambient raw strings. Empty or whitespace-only custom operations are rejected before evaluation, and missing authorization metadata is a denial rather than a bypass.

Sync peer-token validation follows the same fail-closed model. Production sync validation requires a configured Biscuit root public key and a concrete authority/operation scope; deterministic roots and dummy scopes are test fixtures only.

See Effects and Handlers Guide for guard chain integration patterns.

Authorization Scenarios

All authorization scenarios (local device operations, cross-authority delegation, API access control, guardian recovery, storage, and relaying) are handled through Biscuit token attenuation and sovereign policy integration. Token scope and restrictions vary by scenario but follow the same meet-monotone evaluation path.

See Effects and Handlers Guide for authorization scenario patterns.

Performance and Caching

Authorization results are cached per authority, token hash, and resource scope with epoch-based invalidation. Cache entries invalidate on epoch rotation or policy update. Signature verification scales with chain length. Datalog evaluation scales with facts times rules. Attenuation is constant-cost.
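
A sketch of the cache shape described here, with every type simplified (the key and invalidation logic are assumptions about structure, not the crate's API):

```rust
use std::collections::BTreeMap;

#[derive(Clone, Copy, PartialEq)]
struct Epoch(u64);

// Entries are keyed by (authority, token hash, resource scope) and the whole
// cache is tagged with the epoch that produced its decisions.
struct AuthzCache {
    epoch: Epoch,
    entries: BTreeMap<(String, [u8; 32], String), bool>,
}

impl AuthzCache {
    fn new(epoch: Epoch) -> Self {
        Self { epoch, entries: BTreeMap::new() }
    }

    fn get(&self, key: &(String, [u8; 32], String)) -> Option<bool> {
        self.entries.get(key).copied()
    }

    fn insert(&mut self, key: (String, [u8; 32], String), allowed: bool) {
        self.entries.insert(key, allowed);
    }

    /// Epoch rotation invalidates every cached decision.
    fn rotate(&mut self, new_epoch: Epoch) {
        if new_epoch != self.epoch {
            self.epoch = new_epoch;
            self.entries.clear();
        }
    }
}

fn main() {
    let key = ("alice".to_string(), [0u8; 32], "channels:read".to_string());
    let mut cache = AuthzCache::new(Epoch(1));
    cache.insert(key.clone(), true);
    assert_eq!(cache.get(&key), Some(true));
    cache.rotate(Epoch(2));
    assert_eq!(cache.get(&key), None); // forces re-evaluation after rotation
}
```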

See Distributed Maintenance Guide for cache configuration.

Security Model

Cryptographic signature verification prevents token forgery. Epoch scoping limits token lifetime and replay attacks. Attenuation preserves security while growing verification cost proportional to chain length. Root key compromise invalidates all derived tokens.

Authority-based ResourceScope prevents cross-authority access. Local sovereign policy integration provides an additional security layer. Guard chain isolation ensures authorization failures leak no sensitive information.

Implementation References

The Cap type in aura-core/src/domain/journal.rs wraps serialized Biscuit tokens with optional root key storage. The Cap::meet() implementation computes capability intersection. Tokens from the same issuer return the more attenuated token. Tokens from different issuers return bottom.

BiscuitAuthorizationBridge in aura-guards/src/authorization.rs handles guard chain integration. TokenAuthority, TokenGrantProfile, and BiscuitTokenManager in aura-authorization/src/biscuit_token.rs handle token creation and attenuation. Capability families live in the owning feature or domain crates, not in one central global enum. ResourceScope in aura-core/src/types/scope.rs defines authority-centric resource patterns.

See Transport and Information Flow for flow budget details. See Journal for fact commit semantics.

Database Architecture

This document specifies the architecture for Aura's query system. The journal is the database. Datalog is the query language. Biscuit provides authorization.

1. Core Principles

1.1 Database-as-Journal Equivalence

Aura's fact-based journal functions as the database. There is no separate database layer. The equivalence maps traditional database concepts to Aura components.

Aura treats database state as a composite of the fact journal and the capability frontier. Query execution always combines the reduced fact state with the current capability lattice (the JournalState composite) to enforce authorization and isolate contexts.

Traditional Database   Aura Component
Table                  Journal reduction view
Row                    Fact implementing JoinSemilattice
Transaction            Atomic fact append
Index                  Merkle trees and Bloom filters
Query                  Datalog program evaluation
Replication            CrdtCoordinator with delta sync

1.2 Authority-First Data Model

Aura's database is partitioned by cryptographic authorities. An AuthorityId owns facts that implement JoinSemilattice. State is derived from those facts.

Data is naturally sharded by authority. Cross-authority operations require explicit choreography. Privacy is the default because no cross-authority visibility exists without permission.

2. Query System

2.1 Query Trait

The Query trait defines typed queries that compile to Datalog:

#![allow(unused)]
fn main() {
pub trait Query: Send + Sync + Clone + 'static {
    /// The result type of this query
    type Result: Clone + Send + Sync + Default + 'static;

    /// Compile this query to a Datalog program
    fn to_datalog(&self) -> DatalogProgram;

    /// Get required Biscuit capabilities for authorization
    fn required_capabilities(&self) -> Vec<QueryCapability>;

    /// Get fact predicates for invalidation tracking
    fn dependencies(&self) -> Vec<FactPredicate>;

    /// Parse Datalog bindings to typed result
    fn parse(bindings: DatalogBindings) -> Result<Self::Result, QueryParseError>;

    /// Unique identifier for caching and subscriptions
    fn query_id(&self) -> String;
}
}

This design separates query definition from execution, enabling:

  • Portable query definitions across runtimes
  • Authorization checking before execution
  • Reactive subscriptions via dependency tracking

2.2 Datalog Types

Queries compile to DatalogProgram, the intermediate representation:

#![allow(unused)]
fn main() {
pub struct DatalogProgram {
    pub rules: Vec<DatalogRule>,
    pub facts: Vec<DatalogFact>,
    pub goal: Option<String>,
}

pub struct DatalogRule {
    pub head: DatalogFact,
    pub body: Vec<DatalogFact>,
}

pub struct DatalogFact {
    pub predicate: String,
    pub args: Vec<DatalogValue>,
}

pub enum DatalogValue {
    String(String),
    Integer(i64),
    Boolean(bool),
    Variable(String),
    Symbol(String),
    Null,
}
}

Programs convert to Datalog source via to_datalog_source():

#![allow(unused)]
fn main() {
let program = DatalogProgram::new(vec![
    DatalogRule::new(DatalogFact::new("active_user", vec![DatalogValue::var("name")]))
        .when(DatalogFact::new("user", vec![DatalogValue::var("name")]))
        .when(DatalogFact::new("online", vec![DatalogValue::var("name")]))
])
.with_goal("active_user($name)");

let source = program.to_datalog_source();
// Output:
// active_user($name) :- user($name), online($name).
// ?- active_user($name).
}

2.3 Fact Predicates

FactPredicate patterns determine query invalidation:

#![allow(unused)]
fn main() {
pub struct FactPredicate {
    /// The predicate name to match
    pub name: String,
    /// Optional positional argument patterns (None = wildcard)
    pub arg_patterns: Vec<Option<String>>,
    /// Named field constraints for structured facts
    pub named_constraints: BTreeMap<String, String>,
}

impl FactPredicate {
    /// Match any fact with the given name
    pub fn named(name: impl Into<String>) -> Self;

    /// Match facts with specific named field constraints
    pub fn with_args(name: impl Into<String>, args: Vec<(&str, &str)>) -> Self;

    /// Add a positional argument pattern
    pub fn with_arg(self, pattern: Option<String>) -> Self;

    /// Add a named field constraint
    pub fn with_named_constraint(self, name: impl Into<String>, value: impl Into<String>) -> Self;

    /// Check if this predicate matches another
    pub fn matches(&self, other: &FactPredicate) -> bool;

    /// Check if this predicate matches a fact with positional arguments
    pub fn matches_fact(&self, fact_name: &str, fact_args: &[String]) -> bool;

    /// Check if this predicate matches a fact with named fields
    pub fn matches_named_fact(&self, fact_name: &str, fact_fields: &BTreeMap<String, String>) -> bool;
}
}

When facts change, subscriptions matching the predicate re-evaluate.
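
The positional matching semantics can be illustrated with a simplified re-implementation. None patterns act as wildcards; treating fact arguments beyond the declared patterns as implicitly matched is an assumption of this sketch:

```rust
// Illustrative stand-in for FactPredicate positional matching; the real type
// also supports named field constraints.
struct Predicate {
    name: String,
    arg_patterns: Vec<Option<String>>, // None = wildcard
}

impl Predicate {
    fn matches_fact(&self, fact_name: &str, fact_args: &[String]) -> bool {
        self.name == fact_name
            && self.arg_patterns.len() <= fact_args.len()
            && self
                .arg_patterns
                .iter()
                .zip(fact_args)
                .all(|(pat, arg)| pat.as_deref().map_or(true, |p| p == arg.as_str()))
    }
}

fn main() {
    // Invalidate on any "message" fact in channel chan-1, regardless of author.
    let p = Predicate {
        name: "message".into(),
        arg_patterns: vec![Some("chan-1".into()), None],
    };
    let args = |v: &[&str]| v.iter().map(|s| s.to_string()).collect::<Vec<_>>();

    assert!(p.matches_fact("message", &args(&["chan-1", "alice"])));
    assert!(!p.matches_fact("message", &args(&["chan-2", "alice"])));
    assert!(!p.matches_fact("presence", &args(&["chan-1", "alice"])));
}
```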

2.4 Query Capabilities

Authorization integrates with Biscuit via QueryCapability:

#![allow(unused)]
fn main() {
pub struct QueryCapability {
    pub resource: String,
    pub action: String,
    pub constraints: Vec<(String, String)>,
}

impl QueryCapability {
    pub fn read(resource: impl Into<String>) -> Self;
    pub fn list(resource: impl Into<String>) -> Self;
    pub fn with_constraint(self, key: impl Into<String>, value: impl Into<String>) -> Self;
    pub fn to_biscuit_check(&self) -> String;
}
}

Capabilities convert to Biscuit checks:

#![allow(unused)]
fn main() {
let cap = QueryCapability::read("channels").with_constraint("owner", "alice");
assert_eq!(
    cap.to_biscuit_check(),
    "check if right(\"channels\", \"read\"), owner == \"alice\""
);
}

3. Query Effects

3.1 QueryEffects Trait

QueryEffects executes queries against the journal:

#![allow(unused)]
fn main() {
#[async_trait]
pub trait QueryEffects: Send + Sync {
    /// Execute a one-shot typed query
    async fn query<Q: Query>(&self, query: &Q) -> Result<Q::Result, QueryError>;

    /// Execute a raw Datalog program
    async fn query_raw(&self, program: &DatalogProgram) -> Result<DatalogBindings, QueryError>;

    /// Subscribe to query updates when facts change
    fn subscribe<Q: Query>(&self, query: &Q) -> QuerySubscription<Q::Result>;

    /// Pre-check authorization
    async fn check_capabilities(&self, capabilities: &[QueryCapability]) -> Result<(), QueryError>;

    /// Trigger re-evaluation for matching subscriptions
    async fn invalidate(&self, predicate: &FactPredicate);
}
}

3.2 Query Execution Flow

flowchart TD
    A[Query::to_datalog] --> B[QueryEffects::query];
    B --> C{Check capabilities};
    C -->|Pass| D[Load journal facts];
    C -->|Fail| E[QueryError::AuthorizationFailed];
    D --> F[Execute Datalog];
    F --> G[Query::parse];
    G --> H[Typed result];

3.3 Reactive Subscriptions

QuerySubscription wraps SignalStream for live updates:

#![allow(unused)]
fn main() {
pub struct QuerySubscription<T: Clone + Send + 'static> {
    stream: SignalStream<T>,
    query_id: String,
}

impl<T: Clone + Send + 'static> QuerySubscription<T> {
    pub fn query_id(&self) -> &str;
    pub fn try_recv(&mut self) -> Option<T>;
    pub async fn recv(&mut self) -> Result<T, QueryError>;
}
}

Usage pattern:

#![allow(unused)]
fn main() {
let mut subscription = effects.subscribe(&ChannelsQuery::default());
while let Ok(channels) = subscription.recv().await {
    println!("Channels updated: {} total", channels.len());
}
}

3.4 Query Isolation

QueryIsolation specifies consistency requirements for queries:

#![allow(unused)]
fn main() {
pub enum QueryIsolation {
    /// See all facts including uncommitted (CRDT state) - fastest
    ReadUncommitted,
    /// Only see facts with consensus commit
    ReadCommitted { wait_for: Vec<ConsensusId> },
    /// Snapshot at specific prestate (time-travel query)
    Snapshot { prestate_hash: Hash32 },
    /// Wait for all pending consensus in scope
    ReadLatest { scope: ResourceScope },
}
}

Usage:

#![allow(unused)]
fn main() {
// Fast query - may see uncommitted facts
let result = effects.query(&ChannelsQuery::default()).await?;

// Wait for specific consensus before querying
let result = effects.query_with_isolation(
    &ChannelsQuery::default(),
    QueryIsolation::ReadCommitted { wait_for: vec![consensus_id] },
).await?;
}

3.5 Query Statistics

QueryStats provides execution metrics:

#![allow(unused)]
fn main() {
pub struct QueryStats {
    pub execution_time: Duration,
    pub facts_scanned: u32,
    pub facts_matched: u32,
    pub cache_hit: bool,
    pub isolation_used: QueryIsolation,
    pub consensus_wait_time: Option<Duration>,
    /// Consistency metadata for matched facts
    pub consistency: ConsistencyMap,
}
}

Usage:

#![allow(unused)]
fn main() {
let (channels, stats) = effects.query_with_stats(&ChannelsQuery::default()).await?;
println!("Query took {:?}, scanned {} facts", stats.execution_time, stats.facts_scanned);
}

3.6 Query Errors

#![allow(unused)]
fn main() {
pub enum QueryError {
    AuthorizationFailed { reason: String },
    MissingCapability { capability: String },
    ExecutionError { reason: String },
    ParseError(QueryParseError),
    SubscriptionNotFound { query_id: String },
    JournalError { reason: String },
    HandlerUnavailable,
    Internal { reason: String },
    ConsensusTimeout { consensus_id: ConsensusId },
    SnapshotNotAvailable { prestate_hash: Hash32 },
    IsolationNotSupported { reason: String },
}
}

4. Concrete Query Examples

Concrete queries (such as ChannelsQuery and MessagesQuery) implement the Query trait by compiling to Datalog programs, declaring required Biscuit capabilities, specifying fact predicate dependencies for invalidation, and parsing typed results from Datalog bindings.

See Effects and Handlers Guide for query implementation examples.

5. Indexing Layer

5.1 IndexedJournalEffects

The IndexedJournalEffects trait provides efficient indexed lookups:

#![allow(unused)]
fn main() {
pub trait IndexedJournalEffects: Send + Sync {
    /// Subscribe to journal fact updates as they are added
    fn watch_facts(&self) -> Box<dyn FactStreamReceiver>;

    /// Get all facts with the given predicate/key
    async fn facts_by_predicate(&self, predicate: &str) -> Result<Vec<IndexedFact>, AuraError>;

    /// Get all facts created by the given authority
    async fn facts_by_authority(&self, authority: &AuthorityId) -> Result<Vec<IndexedFact>, AuraError>;

    /// Get all facts within the given time range (inclusive)
    async fn facts_in_range(&self, start: TimeStamp, end: TimeStamp) -> Result<Vec<IndexedFact>, AuraError>;

    /// Return all indexed facts (append-only view)
    async fn all_facts(&self) -> Result<Vec<IndexedFact>, AuraError>;

    /// Fast membership test using Bloom filter
    fn might_contain(&self, predicate: &str, value: &FactValue) -> bool;

    /// Get the Merkle root commitment for the current index state
    async fn merkle_root(&self) -> Result<[u8; 32], AuraError>;

    /// Verify a fact against the Merkle tree
    async fn verify_fact_inclusion(&self, fact: &IndexedFact) -> Result<bool, AuraError>;

    /// Get the Bloom filter for fast membership tests
    async fn get_bloom_filter(&self) -> Result<BloomFilter, AuraError>;

    /// Get statistics about the index
    async fn index_stats(&self) -> Result<IndexStats, AuraError>;
}
}

The might_contain method uses Bloom filters for fast negative answers with O(1) lookup and less than 1% false positive rate.
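
A toy Bloom filter illustrates that contract: a false answer from might_contain is definitive (no false negatives), while a true answer may be a false positive. The hash construction below is illustrative only and is not what the index layer uses:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Minimal Bloom filter sketch: k hash positions per item over a fixed bit array.
struct Bloom {
    bits: Vec<bool>,
    hashes: u64,
}

impl Bloom {
    fn new(bits: usize, hashes: u64) -> Self {
        Self { bits: vec![false; bits], hashes }
    }

    fn index(&self, item: &str, i: u64) -> usize {
        let mut h = DefaultHasher::new();
        (item, i).hash(&mut h);
        (h.finish() as usize) % self.bits.len()
    }

    fn insert(&mut self, item: &str) {
        for i in 0..self.hashes {
            let idx = self.index(item, i);
            self.bits[idx] = true;
        }
    }

    /// True may be a false positive; false is always correct.
    fn might_contain(&self, item: &str) -> bool {
        (0..self.hashes).all(|i| self.bits[self.index(item, i)])
    }
}

fn main() {
    let mut bloom = Bloom::new(1024, 3);
    bloom.insert("channel_created");
    // Inserted items always answer true: no false negatives.
    assert!(bloom.might_contain("channel_created"));
    // Absent items usually answer false in O(1), without scanning facts.
    println!("{}", bloom.might_contain("never_inserted"));
}
```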

5.2 Index Structure

#![allow(unused)]
fn main() {
pub struct AuthorityIndex {
    merkle_tree: MerkleTree<FactHash>,
    predicate_filters: BTreeMap<String, BloomFilter>,
    by_predicate: BTreeMap<String, Vec<FactId>>,
    by_authority: BTreeMap<AuthorityId, Vec<FactId>>,
    by_time: BTreeMap<TimeStamp, Vec<FactId>>,
}
}

  • Merkle trees: Integrity verification
  • Bloom filters: Fast membership tests (<1% false positive rate)
  • B-trees: Ordered lookups (O(log n))

Indexes update on fact commit. Performance target: <10ms for 10k facts.

6. Biscuit Integration

6.1 AuraQuery Wrapper

AuraQuery wraps Biscuit's authorizer for query execution:

#![allow(unused)]
fn main() {
pub struct AuraQuery {
    authorizer: biscuit_auth::Authorizer,
}

impl AuraQuery {
    pub fn add_journal_facts(&mut self, facts: &[Fact]) -> Result<()> {
        for fact in facts {
            self.authorizer.add_fact(fact.to_biscuit_fact()?)?;
        }
        Ok(())
    }

    pub fn query(&self, rule: &str) -> Result<Vec<biscuit_auth::Fact>> {
        self.authorizer.query(rule)
    }
}
}

6.2 Guard Chain Integration

Database operations flow through the guard chain:

flowchart LR
    A[Query Request] --> B[CapGuard];
    B --> C[FlowGuard];
    C --> D[JournalCoupler];
    D --> E[QueryEffects];
  1. CapGuard: Evaluates Biscuit token authorization
  2. FlowGuard: Charges budget for query cost
  3. JournalCoupler: Logs query execution
  4. QueryEffects: Executes the query

Each guard must succeed before the next executes.
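
The short-circuiting behavior can be sketched as follows (guard names mirror the diagram; the trait and signatures are illustrative, not Aura's actual API):

```rust
/// Each guard must approve before the next runs; any denial aborts the chain.
trait Guard {
    fn name(&self) -> &'static str;
    fn evaluate(&self) -> Result<(), String>;
}

struct CapGuard;
struct FlowGuard;

impl Guard for CapGuard {
    fn name(&self) -> &'static str { "CapGuard" }
    // Biscuit token evaluation elided; always authorizes in this sketch.
    fn evaluate(&self) -> Result<(), String> { Ok(()) }
}

impl Guard for FlowGuard {
    fn name(&self) -> &'static str { "FlowGuard" }
    // Simulates an exhausted flow budget.
    fn evaluate(&self) -> Result<(), String> { Err("budget exhausted".into()) }
}

fn run_chain(guards: &[&dyn Guard]) -> Result<(), String> {
    for g in guards {
        g.evaluate().map_err(|e| format!("{} denied: {}", g.name(), e))?;
    }
    Ok(()) // only now may QueryEffects execute
}
```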

7. Transaction Model

7.1 Coordination Matrix

Database operations use two orthogonal dimensions:

|           | Single Authority                     | Cross-Authority                            |
|-----------|--------------------------------------|--------------------------------------------|
| Monotone  | Direct fact insertion (0 RTT)        | CRDT merge via anti-entropy (0 RTT + sync) |
| Consensus | Single-authority consensus (1-2 RTT) | Multi-authority consensus (2-3 RTT)        |

7.2 Examples

  • Monotone + Single: Append message to own channel (journal.insert_fact())
  • Monotone + Cross-Authority: Guardian adds trust fact (journal.insert_relational_fact())
  • Consensus + Single: Remove device from account (consensus_single_shot())
  • Consensus + Cross-Authority: Recovery grant with guardian approval (federated_consensus())

Aura Consensus is not linearizable by default. Each consensus instance independently agrees on a single operation and prestate. To sequence operations, use session types (see MPST and Choreography).

Agreement modes are orthogonal to the coordination matrix: A1 (provisional) and A2 (soft-safe) may provide immediate usability, but any durable shared database state must be A3 (consensus-finalized) with prestate binding. Soft-safe windows should be bounded with convergence certificates and explicit reversion facts.

BFT-DKG integration: When key material is required (K3), the database must bind operations to a consensus-finalized DkgTranscriptCommit. This ensures the transaction prestate and the cryptographic prestate are aligned.

MutationReceipt indicates whether a mutation completed immediately (monotone/CRDT) or was submitted to consensus.
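
A hypothetical shape for this receipt (the real MutationReceipt lives in aura-core/src/query.rs; fields here are simplified placeholders):

```rust
/// Simplified sketch of a mutation receipt.
#[derive(Debug, PartialEq)]
enum MutationReceipt {
    /// Monotone/CRDT path: the fact is visible immediately.
    Applied { fact_id: u64 },
    /// Non-monotone path: finality arrives with a later commit fact.
    SubmittedToConsensus { consensus_id: u64 },
}

fn describe(receipt: &MutationReceipt) -> &'static str {
    match receipt {
        MutationReceipt::Applied { .. } => "immediate (0 RTT)",
        MutationReceipt::SubmittedToConsensus { .. } => "pending consensus",
    }
}
```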

See Choreography Development Guide for mutation receipt patterns and BFT-DKG transaction integration.

8. Temporal Database Model

Aura uses a Datomic-inspired immutable database model where all changes are represented as append-only facts with temporal metadata.

8.1 Core Concepts

Facts are never deleted. They are either:

  • Asserted: Added to a scope
  • Retracted: Marked as no longer valid (but remain queryable in history)
  • Epoch-bumped: Bulk invalidation of facts in a scope
  • Checkpointed: Snapshotted for temporal queries

Facts are organized in hierarchical scopes:

#![allow(unused)]
fn main() {
// Scope path examples
"authority:abc123"                    // Authority-level
"authority:abc123/chat"               // Named sub-scope
"authority:abc123/chat/channel:xyz"   // Typed sub-scope
}

Facts progress through finality levels:

#![allow(unused)]
fn main() {
pub enum Finality {
    Local,                           // Written locally only
    Replicated { ack_count: u16 },   // Acknowledged by N peers
    Checkpointed,                    // In a durable checkpoint
    Consensus { proof: ConsensusId }, // Confirmed via consensus
    Anchored { anchor: AnchorProof }, // External chain anchor
}
}

8.2 Fact Operations

Fact operations are classified by monotonicity. Assert and Checkpoint are monotonic (no coordination required). Retract and EpochBump are non-monotonic (may require consensus). The FactEffects trait provides the write interface, supporting single operations, atomic transactions, finality waiting, and temporal queries.

The hybrid transaction model uses direct apply_op() for simple monotonic operations and explicit Transaction grouping for atomic multi-operation batches. Temporal queries support as-of, since, and history range modes. Per-scope finality configuration controls default and minimum finality levels with optional per-content overrides.
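
The monotonicity split above can be sketched as a classification over fact operations (FactOp variants simplified; the real types live in aura-core/src/domain/temporal.rs):

```rust
/// Simplified fact operations, mirroring the four lifecycle events.
enum FactOp {
    Assert { scope: String, fact: String },
    Retract { scope: String, fact: String },
    EpochBump { scope: String },
    Checkpoint { scope: String },
}

/// Assert and Checkpoint need no coordination; Retract and EpochBump
/// may require consensus before becoming durable shared state.
fn is_monotone(op: &FactOp) -> bool {
    matches!(op, FactOp::Assert { .. } | FactOp::Checkpoint { .. })
}
```

In this model, a direct apply_op() call is safe exactly when is_monotone returns true; non-monotone operations go through a Transaction and the consensus path.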

See Effects and Handlers Guide for temporal query patterns and finality configuration.

9. Consistency Metadata

Query results include consistency metadata that tracks the agreement, propagation, and acknowledgment status of each fact.

9.1 ConsistencyMap

The ConsistencyMap type provides per-item consistency status in query results:

#![allow(unused)]
fn main() {
pub struct ConsistencyMap {
    entries: HashMap<String, Consistency>,
}

impl ConsistencyMap {
    pub fn get(&self, id: &str) -> Option<&Consistency>;
    pub fn is_finalized(&self, id: &str) -> bool;
    pub fn acked_by(&self, id: &str) -> Option<&[AckRecord]>;
}
}

9.2 Querying with Consistency

Use query_with_consistency() to get both results and consistency metadata:

#![allow(unused)]
fn main() {
let (messages, consistency) = handler.query_with_consistency(&MessagesQuery::default()).await?;

for msg in &messages {
    let status = if consistency.is_finalized(&msg.id) {
        "finalized"
    } else {
        "pending"
    };
    println!("{}: {}", msg.content, status);
}
}

9.3 QueryStats with Consistency

QueryStats now includes a ConsistencyMap for tracking consistency of scanned facts:

#![allow(unused)]
fn main() {
let (result, stats) = handler.query_with_stats(&query).await?;
if stats.consistency.any_finalized() {
    println!("Some results are finalized");
}
}

9.4 Consistency Dimensions

Each Consistency entry tracks three orthogonal dimensions:

| Dimension      | Type           | Purpose                        |
|----------------|----------------|--------------------------------|
| Agreement      | Agreement      | A1/A2/A3 finalization level    |
| Propagation    | Propagation    | Gossip/sync status to peers    |
| Acknowledgment | Acknowledgment | Per-peer delivery confirmation |
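
A hypothetical per-item entry combining the three dimensions (the real types live in aura-core/src/domain/; fields here are simplified):

```rust
/// Simplified agreement levels.
#[derive(Debug)]
enum Agreement { A1Provisional, A2SoftSafe, A3Finalized }

/// One consistency entry: agreement, propagation, and acknowledgment.
struct Consistency {
    agreement: Agreement,
    propagated_to: usize,  // peers reached by gossip/sync
    acked_by: Vec<String>, // per-peer delivery confirmations
}

impl Consistency {
    fn is_finalized(&self) -> bool {
        matches!(self.agreement, Agreement::A3Finalized)
    }
}
```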

See Operation Categories for full details on these types and their usage.

10. Implementation Location

| Component                    | Location                            |
|------------------------------|-------------------------------------|
| Query trait                  | aura-core/src/query.rs              |
| QueryEffects trait           | aura-core/src/effects/query.rs      |
| FactEffects trait            | aura-core/src/effects/fact.rs       |
| QueryIsolation               | aura-core/src/query.rs              |
| QueryStats                   | aura-core/src/query.rs              |
| MutationReceipt              | aura-core/src/query.rs              |
| ConsensusId, FactId          | aura-core/src/query.rs              |
| ConsistencyMap               | aura-core/src/domain/consistency.rs |
| Agreement, Propagation       | aura-core/src/domain/               |
| Temporal types               | aura-core/src/domain/temporal.rs    |
| ScopeId, Finality            | aura-core/src/domain/temporal.rs    |
| FactOp, Transaction          | aura-core/src/domain/temporal.rs    |
| TemporalQuery, TemporalPoint | aura-core/src/domain/temporal.rs    |
| AuraQuery wrapper            | aura-effects/src/database/query.rs  |
| QueryHandler                 | aura-effects/src/query/handler.rs   |
| Concrete queries             | aura-app/src/queries/               |
| IndexedJournalEffects        | aura-core/src/effects/indexed.rs    |

See Also

Consensus

This document describes the architecture of Aura Consensus. It defines the problem model, protocol phases, data structures, and integration with journals. It explains how consensus provides single-shot agreement for non-monotone operations such as account updates or relational context operations.

1. Problem Model

Aura uses consensus only for operations that cannot be expressed as monotone growth. Consensus produces a commit fact. The commit fact is inserted into one or more journals and drives deterministic reduction. Aura does not maintain a global log. Consensus operates in the scope of an authority or a relational context.

Consensus is single-shot. It agrees on a single operation and a single prestate. Commit facts are immutable and merge by join in journal namespaces.

A consensus instance uses a context-scoped committee. The committee contains witnesses selected by the authority or relational context. Committee members may be offline. The protocol completes even under partitions.

Consensus finalization is the single source of durable shared state. Fast-path coordination (provisional or soft-safe) may run in parallel for liveness, but its outputs must be superseded by commit facts. For BFT-DKG, consensus finalizes a transcript and emits DkgTranscriptCommit facts.

1.2 Witness Terminology

Aura uses witness (not validator) for consensus attestation participants.

  • witness: attests that an operation is valid for a specific prestate and contributes consensus evidence.
  • signer: contributes threshold signature shares in FROST.

Keeping witness and signer distinct avoids conflating consensus attestation responsibilities with cryptographic share-generation roles.

1.3 Consensus is NOT Linearizable by Default

Aura Consensus is single-shot agreement, not log-based linearization.

Each consensus instance independently agrees on:

  • A single operation
  • A single prestate

Each instance produces exactly one commit fact.

Consensus does NOT provide global operation ordering, sequential linearization across instances, or automatic operation dependencies.

To sequence operations, use session types (docs/110_mpst_and_choreography.md) executed through Aura’s Telltale-backed choreography runtime (execute_as runners or VM backend):

#![allow(unused)]
fn main() {
use aura_mpst::{choreography, Role};

#[choreography]
async fn sequential_device_updates<C: EffectContext>(
    ctx: &C,
    account: Role<Account>,
    witnesses: Vec<Role<Witness>>,
) -> Result<(), AuraError> {
    // Session type enforces ordering:
    // 1. Update policy (must complete first).
    //    (new_policy and prestate1 are assumed to be in scope.)
    let policy_commit = consensus_single_shot(
        ctx,
        account.clone(),
        witnesses.clone(),
        TreeOp::UpdatePolicy { new_policy },
        prestate1.hash(),
    ).await?;

    // 2. Remove device (uses policy_commit as prestate)
    // Session type prevents op2 from starting until op1 completes
    let prestate2 = account.read_tree_state(ctx).await?;
    assert_eq!(prestate2.hash(), policy_commit.result_id.prestate_hash);

    let remove_commit = consensus_single_shot(
        ctx,
        account,
        witnesses,
        TreeOp::RemoveLeaf { target: device_id },
        prestate2.hash(),
    ).await?;

    Ok(())
}
}

Cross-reference: See docs/107_database.md §8 for database transaction integration.

2. Core Protocol

Aura Consensus has two paths. The fast path completes in one round trip. The fallback path uses epidemic gossip and a threshold race. Both paths produce the same commit fact once enough matching witness shares exist.

The fast path uses direct communication. The initiator broadcasts an execute message. Witnesses run the operation against the prestate. Witnesses return FROST shares. The initiator aggregates shares and produces a threshold signature. FROST primitives are in aura-core::crypto::tree_signing.

The fallback path triggers when witnesses disagree or when the initiator stalls. Witnesses exchange share proposals using bounded fanout gossip. Any witness that assembles a valid threshold signature broadcasts a complete commit fact.

3. Common Structures and Notation

Consensus uses the following core concepts and notation throughout the protocol.

Core Variables

  • cid : ConsensusId - consensus instance identifier
  • Op - operation being agreed on (application-defined)
  • prestate - the local prestate for this instance (e.g., a journal snapshot)
  • prestate_hash = H(prestate) - hash of prestate
  • rid = H(Op, prestate) - result identifier
  • t - threshold. Adversary controls < t key shares.
  • W - finite set of witnesses for this consensus instance
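
The rid derivation can be illustrated as follows (std's hasher stands in for the real cryptographic Hash32; the domain separator string is hypothetical):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Illustrative rid = H(Op, prestate) with domain separation.
fn rid(op: &[u8], prestate: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    b"aura.consensus.rid".hash(&mut h); // hypothetical domain separator
    op.hash(&mut h);
    prestate.hash(&mut h);
    h.finish()
}
```

Because rid commits to both the operation and the prestate, two witnesses that computed different prestates produce different rid values and their shares never land in the same bucket.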

Per-Instance Tracking

Each consensus participant maintains per-cid state. The decided flags prevent double-voting. Proposals are keyed by (rid, prestate_hash) to group matching shares.

Equivocation Detection

Equivocation occurs when the same witness signs two different rid values under the same (cid, prestate_hash). This violates safety. The protocol detects and excludes equivocating shares.

4. Data Structures

Consensus instances use identifiers for operations and results.

#![allow(unused)]
fn main() {
/// Consensus instance identifier (derived from prestate and operation)
pub struct ConsensusId(pub Hash32);

impl ConsensusId {
    pub fn new(prestate_hash: Hash32, operation_hash: Hash32, nonce: u64) -> Self {
        // Hash of domain separator, prestate, operation, and nonce
    }
}
}

This structure identifies consensus instances. The result is identified by H(Op, prestate) computed inline.

A commit fact contains full consensus evidence including the operation, threshold signature, and participant list. Participants are recorded as AuthorityId values to preserve device privacy.

#![allow(unused)]
fn main() {
/// From crates/aura-consensus/src/types.rs
pub struct CommitFact {
    pub consensus_id: ConsensusId,
    pub prestate_hash: Hash32,
    pub operation_hash: Hash32,
    pub operation_bytes: Vec<u8>,
    pub threshold_signature: ThresholdSignature,
    pub group_public_key: Option<PublicKeyPackage>,
    pub participants: Vec<AuthorityId>,
    pub threshold: u16,
    pub timestamp: ProvenancedTime,
    pub fast_path: bool,
    pub byzantine_attestation: Option<ByzantineSafetyAttestation>,
}
}

The commit fact is the output of consensus. Every peer merges CommitFact into its journal CRDT. A peer finalizes when it accepts a valid threshold signature and inserts the corresponding CommitFact.

The commit fact is inserted into the appropriate journal namespace. This includes account journals for account updates and relational context journals for cross-authority operations.

Ordering of committed facts relies on OrderTime (or session/consensus sequencing), not on timestamps. ProvenancedTime is semantic metadata and must not be used for cross-domain total ordering or indexing.

4.1 Byzantine Admission and Attestation

Consensus-backed ceremonies (including BFT-DKG finalization) are admitted only after runtime capability checks for the consensus profile. Aura currently enforces theorem-pack/runtime capability requirements such as byzantine_envelope before executing DKG and threshold-signing paths.

Today this admission lives in the runtime bridge and threshold-signing services, not in the consensus choreography source. The current consensus choreography source file (crates/aura-consensus/src/protocol/choreography.tell) remains theorem-pack-free because it still models the message skeleton rather than the authoritative admission/evidence profile. Adding a theorem pack there now would be decorative.

Aura should revisit consensus theorem packs only when all of the following are true:

  • the choreography owns a concrete consensus-profile admission surface rather than relying on runtime-local capability checks
  • the required guarantee is better expressed as a choreography-level envelope, ordering, or synchrony contract
  • Aura has a real runtime admission consumer for those choreography-level requirements beyond the existing byzantine_envelope capability gate

At admission, Aura captures a CapabilitySnapshot and records a ByzantineSafetyAttestation with:

  • protocol id (aura.consensus)
  • required capability keys for the profile
  • snapshot of admitted/not-admitted runtime capabilities
  • optional runtime evidence references

The attestation is attached to CommitFact and DkgTranscriptCommit. Capability mismatches fail before ceremony execution and emit redacted capability references for operational diagnostics.

5. Prestate Model

Consensus binds operations to explicit prestates. A prestate hash commits to the current reduced state of all participants.

#![allow(unused)]
fn main() {
let prestate_hash = H(C_auth1, C_auth2, C_context);
}

This value includes root commitments of participating authorities. It may also include the current relational context commitment. Witnesses verify that their local reduced state matches the prestate hash. This prevents forks.

The result identifier binds the operation to the prestate.

#![allow(unused)]
fn main() {
let rid = H(Op, prestate);
}

Witnesses treat matching (rid, prestate_hash) pairs as belonging to the same bucket. The protocol groups shares by these pairs to detect agreement.

6. Fast Path Protocol

The fast path optimistically assumes agreement. It completes in one round trip when all participants share the same prestate.

Messages

  • Execute(cid, Op, prestate_hash, evidΔ) - initiator requests operation execution
  • WitnessShare(cid, rid, share, prestate_hash, evidΔ) - witness returns threshold share
  • Commit(cid, rid, sig, attesters, evidΔ) - initiator broadcasts commit fact
  • StateMismatch(cid, expected_pre_hash, actual_pre_hash, evidΔ) - optional debugging signal

Each message carries evidence delta evidΔ for the consensus instance. Evidence propagates along with protocol messages.

Initiator Protocol

The initiator i coordinates the fast path.

State at initiator i:
    cid       : ConsensusId       // fresh per instance
    Op        : Operation
    shares    : Map[WitnessId -> (rid, share, prestate_hash)]
    decided   : Map[ConsensusId -> Bool]   // initially decided[cid] = false
    W         : Set[WitnessId]
    t         : Nat

The initiator maintains per-witness shares and tracks decision status.

Start Operation

Start(cid, Op):
    prestate       := ReadState()
    prestate_hash  := H(prestate)
    decided[cid]   := false
    shares         := {}

    For all w in W:
        Send Execute(cid, Op, prestate_hash, EvidenceDelta(cid)) to w

The initiator reads local state and broadcasts the operation to all witnesses.

Process Witness Shares

On WitnessShare(cid, rid, share, prestate_hash, evidΔ) from w:
    MergeEvidence(cid, evidΔ)

    If decided[cid] = false and w not in shares:
        shares[w] := (rid, share, prestate_hash)

        // collect all shares for this specific (rid, prestate_hash)
        Hset := { (w', s') in shares | s'.rid = rid
                                     ∧ s'.prestate_hash = prestate_hash }

        If |Hset| ≥ t:
            sig := CombineShares( { s'.share | (_, s') in Hset } )
            attesters := { w' | (w', _) in Hset }

            CommitFact(cid, rid, sig, attesters)
            For all v in W:
                Send Commit(cid, rid, sig, attesters, EvidenceDelta(cid)) to v

            decided[cid] := true

The initiator collects shares. When t shares agree on (rid, prestate_hash), it combines them into a threshold signature.

Witness Protocol

Each witness w responds to execute requests.

State at witness w:
    proposals : Map[(rid, prestate_hash) -> Set[(WitnessId, Share)]]
    decided   : Map[ConsensusId -> Bool]
    timers    : Map[ConsensusId -> TimerHandle]
    W         : Set[WitnessId]
    t         : Nat

Witnesses maintain proposals and fallback timers.

Process Execute Request

On Execute(cid, Op, prestate_hash, evidΔ) from i:
    MergeEvidence(cid, evidΔ)

    if decided.get(cid, false) = true:
        return

    prestate := ReadState()
    if H(prestate) != prestate_hash:
        Send StateMismatch(cid, prestate_hash, H(prestate), EvidenceDelta(cid)) to i
        StartTimer(cid, T_fallback)
        return

    rid   := H(Op, prestate)
    share := ProduceShare(cid, rid)

    proposals[(rid, prestate_hash)] := { (w, share) }

    Send WitnessShare(cid, rid, share, prestate_hash, EvidenceDelta(cid)) to i

    StartTimer(cid, T_fallback)

Witnesses verify prestate agreement before computing shares. Mismatches trigger fallback timers.

Process Commit

On Commit(cid, rid, sig, attesters, evidΔ) from any v:
    if decided.get(cid, false) = false
       and VerifyThresholdSig(rid, sig, attesters):

        MergeEvidence(cid, evidΔ)
        CommitFact(cid, rid, sig, attesters)
        decided[cid] := true
        StopTimer(cid)

Witnesses accept valid commit facts from any source.

7. Evidence Propagation

Evidence tracks equivocation and accountability information for consensus instances. The system uses CRDT-based incremental propagation. Evidence deltas attach to all protocol messages. Witnesses merge incoming evidence automatically.

Equivocation occurs when a witness signs conflicting result IDs for the same prestate. The protocol detects this pattern and generates cryptographic proofs. Each proof records both conflicting signatures with witness identity and timestamp.

#![allow(unused)]
fn main() {
pub struct EquivocationProof {
    pub witness: AuthorityId,
    pub consensus_id: ConsensusId,
    pub prestate_hash: Hash32,
    pub first_result_id: Hash32,
    pub second_result_id: Hash32,
    pub timestamp_ms: u64,
}
}

The proof structure preserves both conflicting votes. This enables accountability and slashing logic in higher layers.

Evidence deltas propagate via incremental synchronization. Each delta contains only new proofs since the last exchange. Timestamps provide watermark-based deduplication. The delta structure is lightweight and merges idempotently.

#![allow(unused)]
fn main() {
pub struct EvidenceDelta {
    pub consensus_id: ConsensusId,
    pub equivocation_proofs: Vec<EquivocationProof>,
    pub timestamp_ms: u64,
}
}

Every consensus message includes an evidence delta field. Coordinators attach deltas when broadcasting execute and commit messages. Witnesses attach deltas when sending shares. Receivers merge incoming deltas into local evidence trackers. This piggybacking ensures evidence propagates without extra round trips.
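
The merge semantics can be sketched as a set union plus a watermark (proofs reduced to (witness, first_rid, second_rid) tuples for illustration; the real tracker holds EquivocationProof values):

```rust
use std::collections::BTreeSet;

/// Simplified evidence tracker with idempotent merge and watermark dedup.
#[derive(Default)]
struct EvidenceTracker {
    proofs: BTreeSet<(u32, u64, u64)>,
    watermark_ms: u64,
}

impl EvidenceTracker {
    /// Merging the same delta twice leaves the tracker unchanged.
    fn merge(&mut self, delta_proofs: &[(u32, u64, u64)], delta_ts_ms: u64) {
        for p in delta_proofs {
            self.proofs.insert(*p); // set union: duplicates are absorbed
        }
        self.watermark_ms = self.watermark_ms.max(delta_ts_ms);
    }
}
```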

8. Fallback Protocol

Fallback activates when the fast path stalls. Triggers include witness disagreement or initiator failure. Fallback uses leaderless gossip to complete consensus.

Messages

  • Conflict(cid, conflicts, evidΔ) - initiator seeds fallback with known conflicts
  • AggregateShare(cid, proposals, evidΔ) - witnesses exchange proposal sets
  • ThresholdComplete(cid, rid, sig, attesters, evidΔ) - any witness broadcasts completion

Equivocation Detection

Equivocation means signing two different rid values under the same (cid, prestate_hash).

HasEquivocated(proposals, witness, cid, pre_hash, new_rid):
    For each ((rid, ph), S) in proposals:
        if ph = pre_hash
           and rid ≠ new_rid
           and witness ∈ { w' | (w', _) in S }:
            return true
    return false

The protocol excludes equivocating shares from threshold computation.
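
A Rust rendering of the HasEquivocated check above (identifiers simplified to integers; the real types are Hash32 and AuthorityId):

```rust
use std::collections::{BTreeMap, BTreeSet};

type Rid = u64;
type PrestateHash = u64;
type WitnessId = u32;

/// True if the witness already signed a different rid under the same prestate hash.
fn has_equivocated(
    proposals: &BTreeMap<(Rid, PrestateHash), BTreeSet<WitnessId>>,
    witness: WitnessId,
    pre_hash: PrestateHash,
    new_rid: Rid,
) -> bool {
    proposals
        .iter()
        .any(|(&(rid, ph), set)| ph == pre_hash && rid != new_rid && set.contains(&witness))
}
```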

Fallback Gossip

Witnesses periodically exchange proposals with random peers.

OnPeriodic(cid) for fallback gossip when decided[cid] = false:
    peers := SampleRandomSubset(W \ {w}, k)

    For each p in peers:
        Send AggregateShare(cid, proposals, EvidenceDelta(cid)) to p

Gossip fanout k controls redundancy. Typical values are 3-5.

Threshold Checking

Witnesses check for threshold completion after each update.

CheckThreshold(cid):
    if decided.get(cid, false) = true:
        return

    For each ((rid, pre_hash), S) in proposals:
        if |S| ≥ t:
            sig := CombineShares({ sh | (_, sh) in S })

            if VerifyThresholdSig(rid, sig):
                attesters := { w' | (w', _) in S }

                CommitFact(cid, rid, sig, attesters)

                For all v in W:
                    Send ThresholdComplete(cid, rid, sig, attesters, EvidenceDelta(cid)) to v

                decided[cid] := true
                return

Any witness reaching threshold broadcasts the commit fact. The first valid threshold signature wins.

9. Integration with Journals

Consensus emits commit facts. Journals merge commit facts using set union. Reduction interprets commit facts as confirmed non-monotone events.

Account journals integrate commit facts that represent tree operations. Relational context journals integrate commit facts that represent guardian bindings or recovery grants.

Reduction remains deterministic. Commit facts simply appear as additional facts in the semilattice.

A commit fact is monotone even when the event it represents is non-monotone. This supports convergence.

9.1 Context Isolation and Guard Integration

Every consensus instance binds to a context identifier. The context provides isolation for capability checking and flow budget enforcement. Authority-scoped ceremonies derive context from authority identity. Relational ceremonies use explicit context identifiers from the relational binding.

The protocol integrates with the guard chain at message send boundaries. Guards evaluate before constructing messages. The guard chain sequences capability checks, flow budget charges, leakage tracking, and journal coupling. Journal facts commit at the runtime bridge layer after guard evaluation completes.

#![allow(unused)]
fn main() {
// Protocol evaluates guards before sending
let guard = SignShareGuard::new(context_id, coordinator);
let guard_result = guard.evaluate(effects).await?;

if !guard_result.authorized {
    return Err(AuraError::permission_denied("Guard denied SignShare"));
}

// Construct and send message after guard approval
let message = ConsensusMessage::SignShare { ... };
}

The dependency injection pattern passes effect systems as parameters. Protocol methods accept generic effect trait bounds. This enables testing with mock effects and production use with real implementations. The pattern avoids storing effects in protocol state.
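
A minimal sketch of this pattern (the trait and method names are illustrative, not Aura's actual effect traits):

```rust
/// Effects arrive as generic parameters instead of living in protocol state.
trait VerifyEffects {
    fn verify_threshold_sig(&self, sig: &[u8]) -> bool;
}

struct Protocol;

impl Protocol {
    /// Generic over the effect system: mocks in tests, real crypto in production.
    fn handle_commit<E: VerifyEffects>(&self, effects: &E, sig: &[u8]) -> Result<(), String> {
        if effects.verify_threshold_sig(sig) {
            Ok(())
        } else {
            Err("invalid threshold signature".into())
        }
    }
}

/// Test double: verification outcome is fixed by the constructor.
struct MockVerify(bool);
impl VerifyEffects for MockVerify {
    fn verify_threshold_sig(&self, _sig: &[u8]) -> bool { self.0 }
}
```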

10. Integration Points

Consensus is used for account tree operations that require strong agreement. Examples include membership changes and policy changes when local signing is insufficient.

Consensus is also used for relational context operations. Guardian bindings use consensus to bind authority state. Recovery grants use consensus to approve account modifications. Application specific relational contexts may also use consensus.

11. FROST Threshold Signatures

Consensus uses FROST to produce threshold signatures. Each witness holds a secret share. Witnesses compute partial signatures. The initiator or fallback proposer aggregates the shares. The final signature verifies under the group public key stored in the commitment tree. See Authority and Identity for details on the TreeState structure.

#![allow(unused)]
fn main() {
/// From crates/aura-consensus/src/messages.rs
pub enum ConsensusMessage {
    SignShare {
        consensus_id: ConsensusId,
        /// The result_id (hash of execution result) this witness computed
        result_id: Hash32,
        share: PartialSignature,
        /// Optional commitment for the next consensus round (pipelining optimization)
        next_commitment: Option<NonceCommitment>,
        /// Epoch for commitment validation
        epoch: Epoch,
    },
    // ... other message variants
}
}

Witness shares validate only for the current consensus instance. They cannot be replayed. Witnesses generate only one share per (consensus_id, prestate_hash).

The attester set in the commit fact contains only devices that contributed signing shares. This provides cryptographic proof of participation.

11.1 Integration with Tree Operation Verification

When consensus produces a commit fact for a tree operation, the resulting AttestedOp is verified using the two-phase model from aura-core::tree::verification:

  1. Verification (verify_attested_op): Cryptographic check against the BranchSigningKey stored in TreeState
  2. Check (check_attested_op): Full verification plus state consistency validation

The binding message includes the group public key to prevent signature reuse attacks. This ensures an attacker cannot substitute a different signing key and replay a captured signature. See Tree Operation Verification for details.

#![allow(unused)]
fn main() {
// Threshold derived from policy at target node
let threshold = state.get_policy(target_node)?.required_signers(child_count);

// Signing key from TreeState
let signing_key = state.get_signing_key(target_node)?;

// Verify against stored key and policy-derived threshold
verify_attested_op(&attested_op, signing_key, threshold, current_epoch)?;
}

The verification step ensures signature authenticity. The check step validates state consistency. Both steps must pass before applying tree operations.

11.2 Type-Safe Share Collection

The implementation provides type-level guarantees for threshold signature aggregation. Share collection uses sealed and unsealed types to prevent combining signatures before reaching threshold. The type system enforces this invariant at compile time.

#![allow(unused)]
fn main() {
pub struct LinearShareSet {
    shares: BTreeMap<AuthorityId, PartialSignature>,
    sealed: bool,
}

pub struct ThresholdShareSet {
    shares: BTreeMap<AuthorityId, PartialSignature>,
}
}

The unsealed type accepts new shares via insertion. When the threshold is reached, the set seals into the threshold type. Only the threshold type provides the combine method, so aggregation cannot be called before sufficient shares exist.

The current protocol uses hash-map-based tracking for signature collection. The type-safe approach exists but awaits integration. Future work will replace runtime threshold checks with compile-time proofs; the sealed-type pattern eliminates an entire class of bugs.

12. FROST Commitment Pipeline Optimization

The pipelined commitment optimization reduces steady-state consensus from 2 RTT (round-trip times) to 1 RTT by bundling next-round nonce commitments with current-round signature shares.

12.1 Overview

Bundling next-round nonce commitments with current-round signature shares lets the coordinator start the next consensus round immediately, without waiting for a separate nonce commitment phase.

Standard FROST Consensus (2 RTT):

  1. Execute → NonceCommit (1 RTT): Coordinator sends execute request, witnesses respond with nonce commitments
  2. SignRequest → SignShare (1 RTT): Coordinator sends aggregated nonces, witnesses respond with signature shares

Pipelined FROST Consensus (1 RTT) (after warm-up):

  1. Execute+SignRequest → SignShare+NextCommitment (1 RTT):
    • Coordinator sends execute request with cached commitments from previous round
    • Witnesses respond with signature share AND next-round nonce commitment

12.2 Core Components

WitnessState (aura-consensus/src/witness.rs) manages persistent nonce state for each witness:

#![allow(unused)]
fn main() {
pub struct WitnessState {
    /// Witness identifier
    witness_id: AuthorityId,

    /// Current epoch to detect when cached commitments become stale
    epoch: Epoch,

    /// Precomputed nonce for the next consensus round
    next_nonce: Option<(NonceCommitment, NonceToken)>,

    /// Active consensus instances this witness is participating in
    active_instances: HashMap<ConsensusId, WitnessInstance>,
}
}

Key methods:

  • get_next_commitment(): Returns cached commitment if valid for current epoch
  • take_nonce(): Consumes cached nonce for use in current round
  • set_next_nonce(): Stores new nonce for future use
  • invalidate(): Clears cached state on epoch change
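
The take/invalidate semantics can be sketched as an epoch-bound, single-use cache (simplified; a u64 stands in for the (NonceCommitment, NonceToken) pair):

```rust
/// Simplified epoch-bound nonce cache mirroring WitnessState's caching rules:
/// a cached nonce is usable at most once, and only within its epoch.
struct NonceCache {
    epoch: u64,
    next: Option<u64>,
}

impl NonceCache {
    fn set_next(&mut self, nonce: u64, epoch: u64) {
        self.epoch = epoch;
        self.next = Some(nonce);
    }

    /// Returns the cached nonce, or None if none is cached or the epoch rotated.
    fn take(&mut self, current_epoch: u64) -> Option<u64> {
        if self.epoch != current_epoch {
            self.next = None; // stale: epoch changed, drop cached state
            self.epoch = current_epoch;
            return None;
        }
        self.next.take() // single use: consumed on success
    }
}
```

A None here is exactly what forces the 2 RTT slow path described in the fallback handling below: no valid cached commitment, so a fresh nonce round is needed.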

Message schema updates: The SignShare message now includes optional next-round commitment:

#![allow(unused)]
fn main() {
SignShare {
    consensus_id: ConsensusId,
    share: PartialSignature,
    /// Optional commitment for the next consensus round (pipelining optimization)
    next_commitment: Option<NonceCommitment>,
    /// Epoch for commitment validation
    epoch: Epoch,
}
}

ConsensusProtocol (aura-consensus/src/protocol.rs) integrates the pipelining optimization via FrostConsensusOrchestrator and WitnessTracker. The pipelining logic is distributed across these components rather than isolated in a separate orchestrator.

Key methods:

  • run_consensus(): Determines fast path vs slow path based on cached commitments
  • can_use_fast_path(): Checks if sufficient cached commitments available
  • handle_epoch_change(): Invalidates all cached state on epoch rotation

12.3 Epoch Safety

All cached commitments are bound to epochs to prevent replay attacks:

  1. Epoch Binding: Each commitment is tied to a specific epoch
  2. Automatic Invalidation: Epoch changes invalidate all cached commitments
  3. Validation: Witnesses reject commitments from wrong epochs
#![allow(unused)]
fn main() {
// Epoch change invalidates all cached nonces
if self.epoch != current_epoch {
    self.next_nonce = None;
    self.epoch = current_epoch;
    return None;
}
}

12.4 Fallback Handling

The system gracefully falls back to 2 RTT when:

  1. Insufficient Cached Commitments: Less than threshold witnesses have cached nonces
  2. Epoch Change: All cached commitments become invalid
  3. Witness Failures: Missing or invalid next_commitment in responses
  4. Initial Bootstrap: First round after startup (no cached state)

if has_quorum {
    // Fast path: 1 RTT using cached commitments
    self.run_fast_path(...)
} else {
    // Slow path: 2 RTT standard consensus
    self.run_slow_path(...)
}

12.5 Performance Impact

Latency reduction:

  • Before: 2 RTT per consensus
  • After: 1 RTT per consensus (steady state)
  • Improvement: 50% latency reduction

Message count:

  • Before: 4 messages per witness (Execute, NonceCommit, SignRequest, SignShare)
  • After: 2 messages per witness (Execute+SignRequest, SignShare+NextCommitment)
  • Improvement: 50% message reduction

Trade-offs:

  • Memory: Small overhead for caching one nonce per witness
  • Complexity: Additional state management and epoch tracking
  • Bootstrap: First round still requires 2 RTT

12.6 Implementation Guidelines

Adding pipelining to new consensus operations:

  1. Update message schema: Add next_commitment and epoch fields to response messages
  2. Generate next nonce: During signature generation, also generate next-round nonce
  3. Cache management: Store next nonce in WitnessState for future use
  4. Epoch handling: Always validate epoch before using cached commitments

Example witness implementation:

pub async fn handle_sign_request<R: RandomEffects + ?Sized>(
    &mut self,
    consensus_id: ConsensusId,
    aggregated_nonces: Vec<NonceCommitment>,
    current_epoch: Epoch,
    random: &R,
) -> Result<ConsensusMessage> {
    // Generate signature share
    let share = self.create_signature_share(consensus_id, aggregated_nonces)?;

    // Generate or retrieve next-round commitment
    let next_commitment = if let Some((commitment, _)) = self.witness_state.take_nonce(current_epoch) {
        // Use cached nonce
        Some(commitment)
    } else {
        // Generate fresh nonce
        let (nonces, commitment) = self.generate_nonce(random).await?;
        let token = NonceToken::from(nonces);

        // Cache for future
        self.witness_state.set_next_nonce(commitment.clone(), token, current_epoch);

        Some(commitment)
    };

    Ok(ConsensusMessage::SignShare {
        consensus_id,
        share,
        next_commitment,
        epoch: current_epoch,
    })
}

12.7 Security Considerations

  1. Nonce Reuse Prevention: Each nonce is used exactly once and tied to a specific epoch
  2. Epoch Isolation: Nonces from different epochs cannot be mixed
  3. Forward Security: Epoch rotation provides natural forward security boundary
  4. Availability: Fallback ensures consensus continues even without optimization

12.8 Testing Strategy

Unit tests:

  • Epoch invalidation logic
  • Nonce caching and retrieval
  • Message serialization with new fields

Integration tests:

  • Fast path vs slow path selection
  • Epoch transition handling
  • Performance measurement

Simulation tests:

  • Network delay impact on 1 RTT vs 2 RTT
  • Behavior under partial failures
  • Convergence properties

12.9 Future Enhancements

  1. Adaptive Thresholds: Dynamically adjust quorum requirements based on cached state
  2. Predictive Caching: Pre-generate multiple rounds of nonces during idle time
  3. Compression: Batch multiple commitments in single message
  4. Cross-Context Optimization: Share cached state across related consensus contexts

13. Fallback Protocol Details

Fallback Trigger

The initiator can proactively trigger fallback when it detects conflicts.

Fallback_Trigger_Initiator(cid):
    // Extract conflicts from shares (same prestate_hash, different rid)
    conflicts := Map[(rid, prestate_hash) -> Set[(w, share)]]

    For each (w, (rid, share, prestate_hash)) in shares:
        conflicts[(rid, prestate_hash)] :=
            conflicts.get((rid, prestate_hash), ∅) ∪ { (w, share) }

    For all w in W:
        Send Conflict(cid, conflicts, EvidenceDelta(cid)) to w

This optimization seeds fallback with known conflicts. Witnesses also start fallback independently on timeout.

Fallback Message Handlers

Witnesses process fallback messages to accumulate shares.

On Conflict(cid, conflicts, evidΔ) from any peer:
    MergeEvidence(cid, evidΔ)

    For each ((rid, pre_hash), S) in conflicts:
        if not HasEquivocatedInSet(proposals, S, pre_hash):
            proposals[(rid, pre_hash)] :=
                proposals.get((rid, pre_hash), ∅) ∪ S

    CheckThreshold(cid)
    Fallback_Start(cid)

Conflict messages bootstrap the fallback phase. The witness merges non-equivocating shares.

On AggregateShare(cid, proposals', evidΔ) from any peer:
    MergeEvidence(cid, evidΔ)

    For each ((rid, pre_hash), S') in proposals':
        For each (w', sh') in S':
            if not HasEquivocated(proposals, w', cid, pre_hash, rid):
                proposals[(rid, pre_hash)] :=
                    proposals.get((rid, pre_hash), ∅) ∪ { (w', sh') }

    CheckThreshold(cid)

Aggregate shares spread proposal sets through gossip. Each witness checks for threshold after updates. Threshold-complete messages carry the final aggregated signature. Once validated, the witness commits and stops its timers.
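
The merge rule for AggregateShare can be sketched as a CRDT-style set union guarded by an equivocation check. The types here (Rid, PreHash, Witness, Share) are simplified placeholders for the real identifiers; the structure mirrors the pseudocode above.

```rust
use std::collections::{HashMap, HashSet};

// Simplified stand-ins for the real protocol identifiers.
type Rid = u64;
type PreHash = u64;
type Witness = u32;
type Share = Vec<u8>;

type Proposals = HashMap<(Rid, PreHash), HashSet<(Witness, Share)>>;

/// A witness has equivocated if it already contributed a share for the same
/// prestate under a *different* request id.
fn has_equivocated(proposals: &Proposals, w: Witness, pre_hash: PreHash, rid: Rid) -> bool {
    proposals.iter().any(|((r, p), set)| {
        *p == pre_hash && *r != rid && set.iter().any(|(w2, _)| *w2 == w)
    })
}

/// Merge an incoming AggregateShare proposal set, dropping equivocating shares.
/// Accepted shares accumulate monotonically (set union), so merging is
/// commutative and idempotent for honest inputs.
fn merge_aggregate(proposals: &mut Proposals, incoming: &Proposals) {
    for ((rid, pre_hash), set) in incoming {
        for (w, sh) in set {
            if !has_equivocated(proposals, *w, *pre_hash, *rid) {
                proposals
                    .entry((*rid, *pre_hash))
                    .or_default()
                    .insert((*w, sh.clone()));
            }
        }
    }
}
```

After each merge the witness would run CheckThreshold over the accumulated sets, exactly as in the handler pseudocode.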

14. Safety Guarantees

Consensus satisfies agreement. At most one commit fact forms for a given (cid, prestate_hash). The threshold signature ensures authenticity. No attacker can forge a threshold signature without the required number of shares.

Consensus satisfies validity. A commit fact references a result computed from the agreed prestate. All honest witnesses compute identical results. Malformed shares are rejected.

Consensus satisfies deterministic convergence. Evidence merges through CRDT join. All nodes accept the same commit fact.

Formal Verification Status

These properties are formally verified through complementary approaches:

  • Lean 4 Proofs (verification/lean/Aura/Consensus/):

    • Agreement.agreement: Unique commit per consensus instance
    • Validity.validity: Committed values bound to prestates
    • Validity.prestate_binding_unique: Hash collision resistance for prestate binding
    • Equivocation.detection_soundness: Conflicting signatures are detectable
  • Quint Model Checking (verification/quint/protocol_consensus*.qnt):

    • AllInvariants: Combined safety properties pass 1000-sample model checking
    • InvariantByzantineThreshold: Byzantine witnesses bounded below threshold
    • InvariantEquivocationDetected: Equivocation detection correctness
    • InvariantProgressUnderSynchrony: Liveness under partial synchrony

See verification/quint/ and verification/lean/ for verification artifacts and run notes.

Binding Message Security

The binding message for tree operations includes the group public key. This prevents key substitution attacks where an attacker captures a valid signature and replays it with a different key they control. The full binding includes:

  • Domain separator for domain isolation
  • Parent epoch and commitment for replay prevention
  • Protocol version for upgrade safety
  • Current epoch for temporal binding
  • Group public key for signing group binding
  • Serialized operation content

This security property is enforced by compute_binding_message() in the verification module.

15. Liveness Guarantees

The fast path completes in one round trip when the initiator and witnesses are online. The fallback path provides eventual completion under asynchrony. Gossip ensures that share proposals propagate across partitions.

The protocol does not rely on stable leaders. No global epoch boundaries exist. Witnesses can rejoin by merging the journal fact set.

Fallback State Diagram

stateDiagram-v2
    [*] --> FastPath
    FastPath --> FallbackPending: timeout or conflict
    FallbackPending --> FallbackGossip: send Conflict/AggregateShare
    FallbackGossip --> Completed: threshold signature assembled
    FastPath --> Completed: Commit broadcast
    Completed --> [*]

  • FastPath: initiator collecting shares.
  • FallbackPending: timer expired or conflicting (rid, prestate_hash) observed.
  • FallbackGossip: witnesses exchange proposal sets until CheckThreshold succeeds.
  • Completed: commit fact accepted and timers stopped.

Design Trade-offs: Latency vs Availability

The two protocol paths optimize for different objectives:

| Property | Fast Path | Slow Path |
|---|---|---|
| Primary Goal | Latency (speed) | Availability (robustness) |
| Completion Time | 2δ (network speed) | Eventual (no tight bound) |
| Synchrony Assumption | Yes (GST reached) | No (works during partitions) |
| Coordination | Leader-driven (initiator) | Leaderless gossip |
| Failure Tolerance | Requires all/most online | Tolerates n-k failures |

Why leaderless gossip for fallback:

We intentionally sacrifice tight timing bounds (the theoretical 2Δ from optimal protocols) for:

  1. Partition tolerance: Any connected component with k honest witnesses can complete independently. No leader election needed across partitions.

  2. No single point of failure: If the initiator fails, any witness can drive completion. View-based protocols require leader handoff.

  3. Simpler protocol: No view numbers, no leader election, no synchronization barriers. Witnesses simply gossip until threshold is reached.

  4. Natural CRDT integration: Evidence propagates as a CRDT set. Gossip naturally merges evidence without coordination.

What we can prove:

  • Fast path: Commits within 2δ when all witnesses online (verified: TemporalFastPathBound)
  • Slow path: Commits when any connected component has k honest witnesses (not time-bounded)
  • Safety: Always holds regardless of network conditions

What we cannot prove (by design):

  • Slow path completion in fixed time (would require synchrony assumption)
  • Termination under permanent partition (FLP impossibility)

Parameter Guidance

  • T_fallback: set to roughly 2-3 times the median witness RTT. Too aggressive causes unnecessary fallback. Too lax delays recovery when an initiator stalls.
  • Gossip fanout f: see the fanout adequacy analysis below for principled selection.
  • Gossip interval: 250-500 ms is a good default. Ensure at least a few gossip rounds occur before timers retrigger. This gives the network time to converge.

These parameters should be tuned per deployment. The ranges above keep fallback responsive without overwhelming the network.

Fanout Adequacy for Slow Path

The slow path requires the gossip network to be connected for evidence propagation. The key question is how many gossip peers (fanout f) are needed.

Random graph connectivity theory:

For a random graph G(n, p) to be connected with high probability (w.h.p.), the classical Erdős-Rényi result states:

p ≥ (1 + ε) × ln(n) / n    for any ε > 0

Translating to gossip: each node selects f random peers, giving edge probability p ≈ f/n. For connectivity:

f ≥ c × ln(n)    where c ≈ 1.1-1.2 for practical certainty

Recommended fanout by witness count:

| Witnesses (n) | Minimum Fanout (f) | Recommended | Notes |
|---|---|---|---|
| 3 | 2 | 2 | All must connect (fully connected) |
| 5 | 2 | 3 | ln(5) ≈ 1.6, add margin |
| 7 | 3 | 3-4 | ln(7) ≈ 1.9 |
| 10 | 3 | 4 | ln(10) ≈ 2.3 |
| 15 | 4 | 4-5 | ln(15) ≈ 2.7 |
| 21 | 4 | 5 | ln(21) ≈ 3.0 |
| 50 | 5 | 6 | ln(50) ≈ 3.9 |
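
The minimum-fanout column follows f = ceil(c · ln n) with c = 1.2, the upper end of the practical range given above. A small helper, assuming that constant:

```rust
/// Minimum gossip fanout for n witnesses, using f >= c * ln(n) with c = 1.2,
/// floored at 2 so even tiny clusters form a cycle rather than a chain.
fn min_fanout(n: usize) -> usize {
    let f = (1.2 * (n as f64).ln()).ceil() as usize;
    f.max(2)
}
```

This reproduces the minimum-fanout column of the table; the "Recommended" column adds one peer of margin on top.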

Failure tolerance:

With fanout f and n witnesses, the graph remains connected under random node failures as long as:

  • At least f + 1 nodes remain (sufficient edges for spanning tree)
  • Failed nodes are randomly distributed (not targeted attacks)

For k-of-n threshold where k ≤ n - f:

  • The k honest witnesses form a connected subgraph w.h.p.
  • Each honest witness can reach at least one other honest witness via gossip
  • Evidence propagates to all honest witnesses in O(log(n)) gossip rounds

Adversarial considerations:

Random graph analysis assumes non-adversarial peer selection. Against Byzantine adversaries:

  1. Eclipse attacks: If adversary controls gossip peer selection, connectivity guarantees fail
  2. Mitigation: Use deterministic peer selection based on witness IDs (e.g., consistent hashing)
  3. Stronger bound: For Byzantine tolerance, use f ≥ 2t + 1 where t is Byzantine threshold

Verified properties (Quint):

The following properties are model-checked in protocol_consensus_liveness.qnt:

  • PropertyQuorumSufficient: k honest witnesses in same connected component can progress
  • PropertyPartitionTolerance: largest connected component with k honest succeeds
  • AvailabilityGuarantee: combined availability invariant

With adequate fanout, these properties ensure slow path completion when quorum exists.

16. Relation to FROST

Consensus uses FROST as the final stage of agreement. FROST ensures that only one threshold signature exists per result. FROST signatures provide strong cryptographic proof.

Consensus and FROST remain separate. Consensus orchestrates communication. FROST proves final agreement. Combining the two yields a single commit fact.

17. Operation Categories

Not all operations require consensus. Aura classifies operations into three categories based on their security requirements and execution timing.

17.1 Category A: Optimistic Operations

Operations that can proceed immediately without consensus. These use CRDT facts with eventual consistency.

Characteristics:

  • Immediate local effect
  • Background sync via anti-entropy
  • Failure shows indicator, doesn't block functionality
  • Partial success is acceptable

Examples:

  • Send message (within established context)
  • Create channel (within established relational context)
  • Update channel topic
  • Block/unblock contact
  • Pin message

These operations work because the cryptographic context already exists. Keys derive deterministically from shared journal state. No new agreement is needed.

17.2 Category B: Deferred Operations

Operations that apply locally but require agreement for finalization. Effect is pending until confirmed.

Characteristics:

  • Immediate local effect shown as "pending"
  • Background ceremony for agreement
  • Failure triggers rollback with user notification
  • Multi-moderator operations use this pattern

Examples:

  • Change channel permissions (requires moderator consensus)
  • Remove channel member (may be contested)
  • Transfer channel ownership
  • Rename channel

17.3 Category C: Consensus-Gated Operations

Operations where partial state is dangerous. These block until consensus completes.

Characteristics:

  • Operation does NOT proceed until consensus achieved
  • Partial state would be dangerous or irrecoverable
  • User must wait for confirmation

Examples:

  • Guardian rotation (key shares distributed atomically)
  • Recovery execution (account state replacement)
  • OTA hard fork activation (breaking protocol change)
  • Device revocation (security-critical removal)
  • Add contact / Create group (establishes cryptographic context)
  • Add member to group (changes group encryption keys)

These use Aura Consensus as described in this document.

17.4 Decision Tree

Does this operation establish or modify cryptographic relationships?
│
├─ YES: Does the user need to wait for completion?
│       │
│       ├─ YES (new context, key changes) → Category C (Consensus)
│       │   Examples: add contact, create group, guardian rotation
│       │
│       └─ NO (removal from existing context) → Category B (Deferred)
│           Examples: remove from group, revoke device
│
└─ NO: Does this affect other users' access or policies?
       │
       ├─ YES: Is this high-security or irreversible?
       │       │
       │       ├─ YES → Category B (Deferred)
       │       │   Examples: transfer ownership, delete channel
       │       │
       │       └─ NO → Category A (Optimistic)
       │           Examples: pin message, update topic
       │
       └─ NO → Category A (Optimistic)
           Examples: send message, create channel, block contact
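
The decision tree can be transcribed directly as a pure function. This is a hypothetical sketch; `Category` and the boolean inputs are illustrative names answered per operation by the protocol designer, not an existing API.

```rust
#[derive(Debug, PartialEq)]
enum Category {
    Optimistic,     // A
    Deferred,       // B
    ConsensusGated, // C
}

/// Direct transcription of the decision tree above.
fn categorize(
    modifies_crypto_relationships: bool,
    must_wait_for_completion: bool,
    affects_others_access: bool,
    high_security_or_irreversible: bool,
) -> Category {
    if modifies_crypto_relationships {
        if must_wait_for_completion {
            Category::ConsensusGated // add contact, create group, guardian rotation
        } else {
            Category::Deferred // removal from an existing context
        }
    } else if affects_others_access {
        if high_security_or_irreversible {
            Category::Deferred // transfer ownership, delete channel
        } else {
            Category::Optimistic // pin message, update topic
        }
    } else {
        Category::Optimistic // send message, create channel, block contact
    }
}
```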

17.5 Key Insight

Ceremonies establish shared cryptographic context. Operations within that context are cheap.

Once a relational context exists (established via Category C invitation ceremony), channels and messages within that context are Category A. The expensive part is establishing WHO is in the relationship. Once established, operations WITHIN the relationship derive keys deterministically from shared state.

The optimistic-path design follows the same effect-policy integration described throughout this document: cheap operations are permitted only after the shared cryptographic context has been established by the ceremony boundary.

18. BFT‑DKG Transcript Finalization (K3)

Consensus is also used to finalize BFT-DKG transcripts. The output of the DKG is a DkgTranscriptCommit fact that is consensus-finalized and then merged into the relevant journal.

Inputs:

  • DkgConfig: epoch, threshold, max_signers, participants, membership_hash.
  • DealerPackage[]: one package per dealer, containing encrypted shares for all participants and a deterministic commitment.
  • prestate_hash and operation_hash: bind the transcript to the authority/context state and the intended ceremony operation.

Transcript formation:

  1. Validate DkgConfig and dealer packages (unique dealers, complete share sets).
  2. Assemble the transcript deterministically.
  3. Hash the transcript using canonical DAG-CBOR encoding.

Finalize with consensus:

  1. Build DkgTranscriptCommit with:
    • transcript_hash
    • blob_ref (optional, if stored out‑of‑line)
    • prestate_hash
    • operation_hash
    • config and participants
  2. Run consensus over the commit fact itself.
  3. Insert both CommitFact evidence and the DkgTranscriptCommit fact into the authority or context journal.

The transcript commit is the single durable artifact used to bootstrap all subsequent threshold operations. Any K3 ceremony that depends on keys must bind to this commit by reference (direct hash or blob ref).

19. Decentralized Coordinator Selection (Lottery)

Coordinator-based fast paths (A2) require a deterministic, decentralized selection mechanism so every participant can independently derive the same leader without extra coordination.

Round seed:

  • round_seed is a 32‑byte value shared by all participants for the round.
  • Sources:
    • VRF output (preferred when available).
    • Trusted oracle or beacon.
    • Initiator‑provided seed (acceptable in trusted settings).

Selection rule:

score_i = H("AURA_COORD_LOTTERY" || round_seed || authority_id_i)
winner = argmin_i score_i
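
The selection rule can be sketched as an argmin over domain-separated hashes. Here `DefaultHasher` (SipHash with fixed keys, deterministic across calls) stands in for the real cryptographic hash H; only the shape of the lottery is meant to be accurate.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// score_i = H("AURA_COORD_LOTTERY" || round_seed || authority_id_i)
/// (DefaultHasher is a stand-in for the real cryptographic hash.)
fn score(round_seed: &[u8; 32], authority_id: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    b"AURA_COORD_LOTTERY".hash(&mut h);
    round_seed.hash(&mut h);
    authority_id.hash(&mut h);
    h.finish()
}

/// winner = argmin_i score_i. Every participant evaluates this locally and
/// derives the same coordinator without any extra communication.
fn select_coordinator<'a>(round_seed: &[u8; 32], authorities: &'a [Vec<u8>]) -> &'a [u8] {
    authorities
        .iter()
        .min_by_key(|id| score(round_seed, id.as_slice()))
        .expect("at least one authority")
        .as_slice()
}
```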

Fencing and safety:

  • The coordinator must hold a monotonic fencing token (coord_epoch).
  • Proposals are rejected if coord_epoch does not advance or if prestate_hash mismatches local state.

Convergence:

  • Coordinators emit a ConvergenceCert once a quorum acks the proposal.
  • Fast-path results remain soft-safe until a consensus CommitFact is merged.

20. Summary

Aura Consensus produces monotone commit facts that represent non-monotone operations. It integrates with journals through set union. It uses FROST threshold signatures and CRDT evidence structures. It provides agreement, validity, and liveness. It supports authority updates and relational operations. It requires no global log and no central coordinator.

The helper HasEquivocatedInSet excludes conflict batches that contain conflicting signatures from the same witness. Fallback_Start transitions the local state machine into fallback mode and arms gossip timers. Implementations must provide these utilities alongside the timers described earlier.

See Also

Operation Categories

This document defines the three-tier classification system for distributed operations in Aura. It specifies the ceremony contract for Category C operations, the consistency metadata types for each category, and the decision framework for categorizing new operations. The core insight is that not all operations require consensus. Many can proceed optimistically with background reconciliation.

1. Overview

Operations in Aura fall into three categories based on their effect timing and security requirements.

| Category | Name | Effect Timing | When Used |
|---|---|---|---|
| A | Optimistic | Immediate local effect | Low-risk operations within established contexts |
| B | Deferred | Pending until confirmed | Medium-risk policy/membership changes |
| C | Consensus-Gated | Blocked until ceremony completes | Cryptographic context establishment |

Agreement modes are orthogonal to categories. Operations can use provisional or soft-safe fast paths, but any durable shared state must be consensus-finalized (A3). See Consensus for the fast-path and finalization taxonomy.

1.1 Key Generation Methods

Aura separates key generation from agreement:

| Code | Method | Description |
|---|---|---|
| K1 | Single-signer | No DKG required. Local key generation. |
| K2 | Dealer-based DKG | Trusted coordinator distributes shares. |
| K3 | Consensus-finalized DKG | BFT-DKG with transcript commit. |
| DKD | Distributed key derivation | Multi-party derivation without DKG. |

1.2 Agreement Levels

| Code | Level | Description |
|---|---|---|
| A1 | Provisional | Usable immediately but not final. |
| A2 | Coordinator Soft-Safe | Bounded divergence with convergence certificate. |
| A3 | Consensus-Finalized | Unique, durable, non-forkable. |
Fast paths (A1/A2) are provisional. Durable shared state must be finalized by A3.

1.3 The Key Architectural Insight

Ceremonies establish shared cryptographic context. Operations within that context are cheap.

Ceremony (Category C)                    Optimistic Operations (Category A)
─────────────────────                    ─────────────────────────────────
• Runs once per relationship             • Within established context
• Establishes ContextId + shared roots   • Derive keys from context
• Creates relational context journal     • Just emit CRDT facts
• All future encryption derives here     • No new agreement needed

2. Category A: Optimistic Operations

Category A operations have immediate local effect via CRDT fact emission. Background sync via anti-entropy propagates facts to peers. Failure shows a status indicator but does not block functionality. Partial success is acceptable.

2.1 Examples

| Operation | Immediate Action | Background Sync | On Failure |
|---|---|---|---|
| Create channel | Show channel, enable messaging | Fact syncs to members | Show "unsynced" badge |
| Send message | Display in chat immediately | Delivery receipts | Show "undelivered" indicator |
| Add contact (within context) | Show in list | Mutual acknowledgment | Show "pending" status |
| Block contact | Hide from view immediately | Propagate to context | Already effective locally |
| Update profile | Show changes immediately | Propagate to contacts | Show sync indicator |
| React to message | Show reaction | Fact syncs | Show "pending" |

2.2 Implementation Pattern

async fn create_channel_optimistic(&mut self, config: ChannelConfig) -> ChannelId {
    let channel_id = ChannelId::derive(&config);

    self.emit_fact(ChatFact::ChannelCheckpoint {
        channel_id,
        epoch: 0,
        base_gen: 0,
        window: 1024,
    }).await;

    channel_id
}

This pattern emits a fact into the existing relational context journal. The channel is immediately usable. Key derivation uses KDF(ContextRoot, ChannelId, epoch).

2.3 Why This Works

Category A operations work because encryption keys already exist (derived from established context), facts are CRDTs (eventual consistency is sufficient), no coordination is needed (shared state already agreed upon), and the worst case is delay rather than a security issue.

3. Category B: Deferred Operations

Category B operations have local effect pending until agreement is reached. The UI shows intent immediately with a "pending" indicator. Operations may require approval from capability holders. Automatic rollback occurs on rejection.

3.1 Examples

| Operation | Immediate Action | Agreement Required | On Rejection |
|---|---|---|---|
| Change channel permissions | Show "pending" | Moderator approval | Revert, notify |
| Remove channel member | Show "pending removal" | Moderator consensus | Keep member |
| Transfer ownership | Show "pending transfer" | Recipient acceptance | Cancel transfer |
| Rename channel | Show "pending rename" | Member acknowledgment | Keep old name |
| Archive channel | Show "pending archive" | Moderator approval | Stay active |

3.2 Implementation Pattern

async fn change_permissions_deferred(
    &mut self,
    channel_id: ChannelId,
    changes: PermissionChanges,
) -> ProposalId {
    let proposal = Proposal {
        operation: Operation::ChangePermissions { channel_id, changes },
        requires_approval_from: vec![CapabilityRequirement::Role("moderator")],
        threshold: ApprovalThreshold::Any,
        timeout_ms: 24 * 60 * 60 * 1000,
    };

    self.emit_proposal(proposal).await
}

This pattern creates a proposal that does not apply the effect yet. The UI shows "pending" state. The effect applies when threshold approvals are received. Auto-revert occurs on timeout or rejection.

3.3 Approval Thresholds

pub enum ApprovalThreshold {
    Any,
    Unanimous,
    Threshold { required: u32 },
    Percentage { percent: u8 },
}

Any requires any single holder of the required capability. Unanimous requires all holders to approve. Threshold requires k-of-n approval. Percentage requires a percentage of holders.
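
To make these semantics concrete, here is a hedged sketch of an evaluator. The enum is repeated so the snippet is self-contained; a real implementation would also verify that each approval carries the required capability.

```rust
pub enum ApprovalThreshold {
    Any,
    Unanimous,
    Threshold { required: u32 },
    Percentage { percent: u8 },
}

impl ApprovalThreshold {
    /// Evaluate a proposal given how many capability holders approved
    /// out of the total number of holders.
    pub fn is_met(&self, approvals: u32, holders: u32) -> bool {
        match self {
            ApprovalThreshold::Any => approvals >= 1,
            ApprovalThreshold::Unanimous => holders > 0 && approvals >= holders,
            ApprovalThreshold::Threshold { required } => approvals >= *required,
            // Integer comparison avoids float rounding: approvals/holders >= percent/100
            ApprovalThreshold::Percentage { percent } => {
                holders > 0 && approvals as u64 * 100 >= *percent as u64 * holders as u64
            }
        }
    }
}
```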

4. Category C: Consensus-Gated Operations

Category C operations do NOT proceed until a ceremony completes. Partial state would be dangerous or irrecoverable. The user must wait for confirmation. These operations use choreographic protocols with session types, executed through Aura’s Telltale runtime.

4.1 Examples

| Operation | Why Blocking Required | Risk if Optimistic |
|---|---|---|
| Add contact (new relationship) | Creates cryptographic context | No shared keys possible |
| Create group | Multi-party key agreement | Inconsistent member views |
| Add member to group | Changes group keys | Forward secrecy violation |
| Device enrollment | Key shares distributed atomically | Partial enrollment unusable |
| Guardian rotation | Key shares distributed atomically | Partial rotation unusable |
| Recovery execution | Account state replacement | Partial recovery corruption |
| OTA hard fork | Scope-bound breaking protocol change | Explicit partition or rejected incompatible sessions outside the cutover scope |
| Device revocation | Security-critical removal | Attacker acts first |

4.2 Implementation Pattern

async fn add_contact(&mut self, invitation: Invitation) -> Result<ContactId> {
    let ceremony_id = self.ceremony_executor
        .initiate_invitation_ceremony(invitation)
        .await?;

    loop {
        match self.ceremony_executor.get_status(&ceremony_id)? {
            CeremonyStatus::Committed => {
                return Ok(ContactId::from_ceremony(&ceremony_id));
            }
            CeremonyStatus::Aborted { reason } => {
                return Err(AuraError::ceremony_failed(reason));
            }
            _ => {
                tokio::time::sleep(POLL_INTERVAL).await;
            }
        }
    }
}

This pattern blocks until the ceremony completes. The user sees progress UI during execution. Context is established only on successful commit.

5. Ceremony Contract

All Category C ceremonies follow a shared contract that ensures atomic commit/abort semantics.

5.1 Ceremony Phases

  1. Compute prestate: Derive a stable prestate hash from the authority/context state being modified. Include the current epoch and effective participant set.

  2. Propose operation: Define the operation being performed. Compute an operation hash bound to the proposal parameters.

  3. Enter pending epoch: Generate new key material at a pending epoch without invalidating the old epoch. Store metadata for commit or rollback.

  4. Collect responses: Send invitations/requests to participants. Participants respond using their full runtimes. Responses must be authenticated and recorded as facts.

  5. Commit or abort: If acceptance/threshold conditions are met, commit the pending epoch and emit resulting facts. Otherwise abort, emit an abort fact with a reason, and leave the prior epoch active.

5.2 Ceremony Properties

All Category C ceremonies implement:

  1. Prestate Binding: CeremonyId = H(prestate_hash, operation_hash, nonce) prevents concurrent ceremonies on same state and ensures exactly-once semantics.

  2. Atomic Commit/Abort: Either fully committed or no effect. No partial state possible.

  3. Epoch Isolation: Uncommitted key packages are inert. No explicit rollback needed on abort.

  4. Session Types: Protocol compliance enforced at compile time via choreographic projection and enforced at runtime via Telltale adapter/VM execution.
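
The prestate binding in property 1 can be sketched as follows, with `DefaultHasher` standing in for the real cryptographic hash and `u64` fields for the hashes and nonce. Any change to the prestate, the operation, or the nonce yields a distinct CeremonyId, which is what rules out two concurrent ceremonies binding the same state.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// CeremonyId = H(prestate_hash, operation_hash, nonce).
/// DefaultHasher is an illustrative stand-in for the real hash function.
fn ceremony_id(prestate_hash: u64, operation_hash: u64, nonce: u64) -> u64 {
    let mut h = DefaultHasher::new();
    prestate_hash.hash(&mut h);
    operation_hash.hash(&mut h);
    nonce.hash(&mut h);
    h.finish()
}
```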

5.3 Per-Ceremony Policy Matrix

Authority and Device Ceremonies

| Ceremony | Key Gen | Agreement | Fallback | Notes |
|---|---|---|---|---|
| Authority bootstrap | K1 | A3 | None | Local, immediate |
| Device enrollment | K2 | A1→A2→A3 | A1/A2 | Provisional → soft-safe → finalize |
| Device MFA rotation | K3 | A2→A3 | A2 | Consensus-finalized keys |
| Device removal | K3 | A2→A3 | A2 | Remove via rotation |

Guardian Ceremonies

| Ceremony | Key Gen | Agreement | Fallback | Notes |
|---|---|---|---|---|
| Guardian setup/rotation | K3 | A2→A3 | A2 | Consensus-finalized for durability |
| Recovery approval | — | A2→A3 | A2 | Soft-safe approvals → consensus |
| Recovery execution | — | A2→A3 | A2 | Consensus-finalized commit |

Channel and Group Ceremonies

| Ceremony | Key Gen | Agreement | Fallback | Notes |
|---|---|---|---|---|
| AMP channel epoch bump | — | A1→A2→A3 | A1/A2 | Proposed → cert → commit |
| AMP channel bootstrap | — | A1→A2→A3 | A1/A2 | Provisional → group key rotation |
| Group/Block creation | K3 | A1→A2→A3 | A1/A2 | Provisional bootstrap → consensus |
| Rendezvous secure-channel | — | A1→A2→A3 | A1/A2 | Provisional → consensus |

Other Ceremonies

| Ceremony | Key Gen | Agreement | Fallback | Notes |
|---|---|---|---|---|
| Invitation (contact/channel/guardian) | — | A3 | None | Consensus-finalized only |
| OTA activation | — | A2→A3 | A2 | Threshold-signed → consensus |
| DKD ceremony | DKD | A2→A3 | A2 | Multi-party derivation → commit |

5.4 Bootstrap Exception

When creating a new group/channel before the group key ceremony completes, Aura allows a bootstrap epoch using a trusted-dealer key (K2/A1). The dealer distributes a bootstrap key with the channel invite, enabling immediate encrypted messaging. This is explicitly provisional and superseded by the consensus-finalized group key (K3/A3) once the ceremony completes.

5.5 AMP Channel Epoch Transition Refinement

AMP channel epoch transitions use the existing A1 -> A2 -> A3 agreement ladder with an AMP-specific live-state rule. This rule applies only to AMP channel epoch and membership state inside a relational context. It does not weaken authority-root membership, account commitment trees, guardian rotation, or recovery execution.

For a channel transition:

  • A1 observed means a syntactically valid successor proposal exists as a journal fact, but it has no live operational authority.
  • A2Live means exactly one unsuppressed AMP-certified successor exists for the parent epoch and deterministic reduction exposes that successor as the live epoch for AMP send and receive policy.
  • A3Finalized means consensus-backed commit facts durably finalize a successor. The normal path finalizes the same transition_id previously exposed as A2Live.

A2Live is operationally authoritative but non-durable. It may drive live AMP traffic before full A3 finalization, but durable shared state still requires the A3 consensus boundary described in Consensus.

Every AMP transition fact binds a canonical transition_id, a typed digest over:

  • context_id
  • channel_id
  • parent_epoch
  • parent_commitment
  • successor_epoch
  • successor_commitment
  • membership_commitment
  • transition_policy

Proposals, A2 certificates, A3 commits, aborts, conflicts, and supersessions refer to the same transition only when this full identity matches. A different successor for the same parent is a conflicting transition.

The reducer states for a parent transition group are:

| State | Meaning |
|---|---|
| Observed | Proposal facts exist, but no live certificate is valid. |
| A2Live | Exactly one valid unsuppressed AMP certificate selects a successor. |
| A2Conflict | Conflicting valid certificates or unresolved equivocation evidence suppress live exposure. |
| A3Finalized | Consensus-backed commit finalizes one successor. |
| A3Conflict | Conflicting durable commit evidence suppresses live exposure. |
| Aborted | Explicit abort evidence invalidates the proposal for live use. |
| Superseded | Authorized supersession replaces an earlier transition path. |

The reducer must never choose between conflicting valid A2 certificates by arrival order, local preference, wall-clock time, or deterministic tie-breaking. If facts do not prove a single winner, no live successor is exposed.

AMP transition policies distinguish additive and non-removal transitions from subtractive, removal, revocation, and emergency transitions. Additive transitions may permit bounded dual-epoch receive overlap. Subtractive and emergency transitions require stricter old-epoch acceptance so removed or suspected participants do not retain indefinite send authority.

Emergency AMP transitions are channel-scoped control-plane transitions:

  • EmergencyQuarantineTransition excludes a suspect from the successor epoch, cuts new sends over immediately, minimizes old-epoch receive grace, and rotates or erases old secret material aggressively.
  • EmergencyCryptoshredTransition uses the strongest channel-scoped bar and destroys ordinary pre-emergency readable state at the A2Live cutover boundary.

Emergency channel facts do not automatically remove authority-root membership or suspend recovery/governance rights. Those require separate authority-scoped governance facts and their own thresholds.

6. Consistency Metadata

Each operation category has a purpose-built status type for tracking consistency.

6.1 Core Types

pub enum Agreement {
    Provisional,
    SoftSafe { cert: Option<ConvergenceCert> },
    Finalized { consensus_id: ConsensusId },
}

pub enum Propagation {
    Local,
    Syncing { peers_reached: u16, peers_known: u16 },
    Complete,
    Failed { retry_at: PhysicalTime, retry_count: u32, error: String },
}

pub struct Acknowledgment {
    pub acked_by: Vec<AckRecord>,
}

Agreement indicates the finalization level (A1/A2/A3). Propagation tracks anti-entropy sync status. Acknowledgment tracks explicit per-peer delivery confirmation.

6.2 Category A: OptimisticStatus

pub struct OptimisticStatus {
    pub agreement: Agreement,
    pub propagation: Propagation,
    pub acknowledgment: Option<Acknowledgment>,
}

Use cases include send message, create channel, update profile, and react to message.

UI patterns:

  • ◐ Sending: propagation == Local
  • ✓ Sent: propagation == Complete
  • ✓✓ Delivered: acknowledgment.count() >= expected.len()
  • ◆ Finalized: agreement == Finalized
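A minimal mapping from consistency metadata to these indicators might look like the sketch below. The enums are trimmed versions of the Core Types above, the `delivered` flag stands in for the acknowledgment-count comparison, and the single-label simplification (the real UI shows ◆ alongside ✓✓) is an assumption.

```rust
// Trimmed enums mirroring the Core Types section; `delivered` stands in for
// the acknowledgment-count comparison. Names here are illustrative.
#[allow(dead_code)]
enum Agreement { Provisional, Finalized }
#[allow(dead_code)]
enum Propagation { Local, Syncing, Complete }

fn indicator(agreement: &Agreement, propagation: &Propagation, delivered: bool) -> &'static str {
    match (agreement, propagation, delivered) {
        (Agreement::Finalized, _, _) => "◆ finalized",
        (_, _, true) => "✓✓ delivered",
        (_, Propagation::Complete, _) => "✓ sent",
        _ => "◐ sending",
    }
}

fn main() {
    assert_eq!(indicator(&Agreement::Provisional, &Propagation::Local, false), "◐ sending");
    assert_eq!(indicator(&Agreement::Provisional, &Propagation::Complete, false), "✓ sent");
    assert_eq!(indicator(&Agreement::Provisional, &Propagation::Complete, true), "✓✓ delivered");
    assert_eq!(indicator(&Agreement::Finalized, &Propagation::Complete, true), "◆ finalized");
}
```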

6.3 Category B: DeferredStatus

pub struct DeferredStatus {
    pub proposal_id: ProposalId,
    pub state: ProposalState,
    pub approvals: ApprovalProgress,
    pub applied_agreement: Option<Agreement>,
    pub expires_at: PhysicalTime,
}

pub enum ProposalState {
    Pending,
    Approved,
    Rejected { reason: String, by: AuthorityId },
    Expired,
    Superseded { by: ProposalId },
}

Use cases include change permissions, remove member, transfer ownership, and archive channel.

6.4 Category C: CeremonyStatus

pub struct CeremonyStatus {
    pub ceremony_id: CeremonyId,
    pub state: CeremonyState,
    pub responses: Vec<ParticipantResponse>,
    pub prestate_hash: Hash32,
    pub committed_agreement: Option<Agreement>,
}

pub enum CeremonyState {
    Preparing,
    PendingEpoch { pending_epoch: Epoch, required_responses: u16, received_responses: u16 },
    Committing,
    Committed { consensus_id: ConsensusId, committed_at: PhysicalTime },
    Aborted { reason: String, aborted_at: PhysicalTime },
    Superseded { by: CeremonyId, reason: SupersessionReason },
}

Use cases include add contact, create group, guardian rotation, device enrollment, and recovery.

When a ceremony commits successfully, committed_agreement is set to Agreement::Finalized with the consensus ID, indicating A3 durability.

6.5 Unified Consistency Type

For cross-category queries and generic handling:

pub struct Consistency {
    pub category: OperationCategory,
    pub agreement: Agreement,
    pub propagation: Propagation,
    pub acknowledgment: Option<Acknowledgment>,
}

pub enum OperationCategory {
    Optimistic,
    Deferred { proposal_id: ProposalId },
    Ceremony { ceremony_id: CeremonyId },
}

7. Ceremony Supersession

When a new ceremony replaces an old one, Aura emits explicit supersession facts that propagate via anti-entropy.

7.1 Supersession Reasons

pub enum SupersessionReason {
    PrestateStale,
    NewerRequest,
    ExplicitCancel,
    Timeout,
    Precedence,
}

PrestateStale indicates the prestate changed while the ceremony was pending. NewerRequest indicates an explicit newer request from the same initiator. ExplicitCancel indicates manual cancellation by an authorized participant. Timeout indicates the ceremony exceeded its validity window. Precedence indicates a concurrent ceremony won via conflict resolution.

7.2 Supersession Facts

Each ceremony fact enum includes a CeremonySuperseded variant:

CeremonySuperseded {
    superseded_ceremony_id: String,
    superseding_ceremony_id: String,
    reason: String,
    trace_id: Option<String>,
    timestamp_ms: u64,
}

7.3 CeremonyTracker API

The CeremonyTracker in aura-agent maintains supersession records for auditability:

| Method | Purpose |
|---|---|
| supersede(old_id, new_id, reason) | Record a supersession event |
| check_supersession_candidates(prestate_hash, op_type) | Find stale ceremonies |
| get_supersession_chain(ceremony_id) | Get full supersession history |
| is_superseded(ceremony_id) | Check if ceremony was replaced |

Supersession facts propagate via the existing anti-entropy mechanism. Peers receiving a CeremonySuperseded fact update their local ceremony state accordingly.
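As a sketch of how a peer might index received CeremonySuperseded facts: the SupersessionIndex type and its methods below are assumptions for illustration, not the CeremonyTracker API.

```rust
use std::collections::HashMap;

// Illustrative peer-side index of CeremonySuperseded facts. The type and its
// methods are assumptions; the authoritative records live in the CeremonyTracker.
struct SupersessionIndex {
    superseded_by: HashMap<String, String>,
}

impl SupersessionIndex {
    fn new() -> Self {
        Self { superseded_by: HashMap::new() }
    }

    /// Apply a CeremonySuperseded fact received via anti-entropy.
    fn apply_fact(&mut self, old_id: &str, new_id: &str) {
        self.superseded_by.insert(old_id.to_string(), new_id.to_string());
    }

    fn is_superseded(&self, id: &str) -> bool {
        self.superseded_by.contains_key(id)
    }

    /// Follow the supersession chain from a ceremony to its latest replacement.
    fn chain(&self, id: &str) -> Vec<String> {
        let mut out = vec![id.to_string()];
        while let Some(next) = self.superseded_by.get(out.last().unwrap()) {
            out.push(next.clone());
        }
        out
    }
}

fn main() {
    let mut idx = SupersessionIndex::new();
    idx.apply_fact("c1", "c2");
    idx.apply_fact("c2", "c3");
    assert!(idx.is_superseded("c1"));
    assert!(!idx.is_superseded("c3"));
    assert_eq!(idx.chain("c1"), vec!["c1", "c2", "c3"]);
}
```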

8. Decision Tree

Use this tree to categorize new operations:

Does this operation establish or modify cryptographic relationships?
│
├─ YES: Does the user need to wait for completion?
│       │
│       ├─ YES (new context, key changes) → Category C (Blocking Ceremony)
│       │   Examples: add contact, create group, guardian rotation
│       │
│       └─ NO (removal from existing context) → Category B (Deferred)
│           Examples: remove from group (epoch rotation in background)
│
└─ NO: Does this affect other users' access or policies?
       │
       ├─ YES: Is this high-security or irreversible?
       │       │
       │       ├─ YES → Category B (Deferred)
       │       │   Examples: transfer ownership, delete channel, kick member
       │       │
       │       └─ NO → Category A (Optimistic)
       │           Examples: pin message, update topic
       │
       └─ NO → Category A (Optimistic)
           Examples: send message, create channel, block contact

9. UI Feedback Patterns

9.1 Category A: Instant Result with Sync Indicator

┌───────────────────────────────────┐
│ You: Hello everyone!         ◆ ✓✓ │  ← Finalized + Delivered
│ You: Check this out            ✓✓ │  ← Delivered (not yet finalized)
│ You: Another thought           ✓  │  ← Sent
│ You: New idea                  ◐  │  ← Sending
└───────────────────────────────────┘

Effect already applied. Indicators show delivery status (◐ → ✓ → ✓✓ → ✓✓ blue) and finalization (◆ appears when A3 consensus achieved).

9.2 Category B: Pending Indicator

┌─────────────────────────────────────────────────────────────────────┐
│ Channel: #project                                                   │
├─────────────────────────────────────────────────────────────────────┤
│ Pending: Remove Carol (waiting for Bob to confirm)                  │
├─────────────────────────────────────────────────────────────────────┤
│ Members:                                                            │
│   Alice (moderator)    ✓                                            │
│   Bob (moderator)      ✓                                            │
│   Carol                ✓  ← Still has access until confirmed        │
└─────────────────────────────────────────────────────────────────────┘

Proposal shown. Effect NOT applied yet.

9.3 Category C: Blocking Wait

┌─────────────────────────────────────────────────────────────────────┐
│                    Adding Bob to group...                           │
│                                                                     │
│    ✓ Invitation sent                                                │
│    ✓ Bob accepted                                                   │
│    ◐ Deriving group keys...                                         │
│    ○ Ready                                                          │
│                                                                     │
│                      [Cancel]                                       │
└─────────────────────────────────────────────────────────────────────┘

User waits. Cannot proceed until ceremony completes.

10. Effect Policy Configuration

Operations use configurable policies that reference the capability system:

pub struct EffectPolicy {
    pub operation: OperationType,
    pub timing: EffectTiming,
    pub security_level: SecurityLevel,
}

pub enum EffectTiming {
    Immediate,
    Deferred {
        requires_approval_from: Vec<CapabilityRequirement>,
        timeout_ms: u64,
        threshold: ApprovalThreshold,
    },
    Blocking {
        ceremony: CeremonyType,
    },
}

10.1 Context-Specific Overrides

Contexts can override default policies:

// Strict security channel: unanimous moderator approval for kicks
channel.set_effect_policy(RemoveFromChannel, EffectTiming::Deferred {
    requires_approval_from: vec![CapabilityRequirement::Role("moderator")],
    timeout_ms: 48 * 60 * 60 * 1000,
    threshold: ApprovalThreshold::Unanimous,
});

// Casual channel: any moderator can kick immediately
channel.set_effect_policy(RemoveFromChannel, EffectTiming::Immediate);

11. Full Operation Matrix

| Operation | Category | Effect Timing | Security | Notes |
|---|---|---|---|---|
| **Within Established Context** | | | | |
| Send message | A | Immediate | Low | Keys already derived |
| Create channel | A | Immediate | Low | Just facts into context |
| Update topic | A | Immediate | Low | CRDT, last-write-wins |
| React to message | A | Immediate | Low | Local expression |
| **Local Authority** | | | | |
| Block contact | A | Immediate | Low | Your decision |
| Mute channel | A | Immediate | Low | Local preference |
| **Policy Changes** | | | | |
| Change permissions | B | Deferred | Medium | Others affected |
| Kick from channel | B | Deferred | Medium | Affects access |
| Archive channel | B | Deferred | Low-Med | Reversible |
| **High Risk** | | | | |
| Transfer ownership | B | Deferred | High | Irreversible |
| Delete channel | B | Deferred | High | Data loss |
| Remove from context | B | Deferred | High | Affects encryption |
| **Cryptographic** | | | | |
| Add contact | C | Blocking | Critical | Creates context |
| Create group | C | Blocking | Critical | Multi-party keys |
| Add group member | C | Blocking | Critical | Changes group keys |
| Device enrollment | C | Blocking | Critical | DeviceEnrollment choreography |
| Guardian rotation | C | Blocking | Critical | Key shares |
| Recovery execution | C | Blocking | Critical | Account state |
| Device revocation | C | Blocking | Critical | Security response |

12. Common Mistakes to Avoid

Mistake 1: Making Everything Category C

Wrong: "Adding a channel member requires ceremony"

Right: If the member is already in the relational context, it is Category A. Just emit a fact. Only if they need to join the context first is it Category C.

Mistake 2: Forgetting Context Existence

Wrong: Trying to create a channel before establishing relationship

Right: Contact invitation (Category C) must complete before any channel operations (Category A) are possible.

Mistake 3: Optimistic Key Operations

Wrong: "User can start using new guardians while ceremony runs"

Right: Guardian changes affect key shares. Partial state means unusable keys. Must be Category C.

Mistake 4: Blocking on Low-Risk Operations

Wrong: "Wait for all members to confirm before showing channel"

Right: Channel creation is optimistic. Show immediately, sync status later.

See Also

MPST and Choreography

This document describes the architecture of choreographic protocols in Aura. It explains how global protocols are defined, projected, and executed. It defines the structure of local session types, the integration with the Effect System, and the use of guard chains and journal coupling.

1. DSL and Projection

Aura defines global protocols using the tell! macro. The macro parses a global specification into an abstract syntax tree and produces code that represents the protocol as a choreographic structure. The source of truth for protocols is a .tell file stored next to the Rust module that loads it.

Projection converts the global protocol into per-role local session types. Each local session type defines the exact sequence of sends and receives for a single role. Projection eliminates deadlocks and ensures that communication structure is correct.

tell!(include_str!("example.tell"));

Example file: example.tell

module example exposing (Example)

protocol Example =
  roles A, B
  A -> B : Msg(data: Vec<u8>)
  B -> A : Ack(code: u32)

This snippet defines a global protocol with two roles. Projection produces a local type for A and a local type for B. Each local type enforces the required ordering at compile time.

2. Local Session Types

Local session types describe the allowed actions for a role. Each send and receive is represented as a typed operation. Local types prevent protocol misuse by ensuring that nodes follow the projected sequence.

Local session types embed type-level guarantees. These guarantees prevent message ordering errors. They prevent unmatched sends or receives. Each protocol execution must satisfy the session type.

type A_Local = Send<B, Msg, Receive<B, Ack, End>>;

This example shows the projected type for role A. The type describes that A must send Msg to B and then receive Ack.

3. Runtime Integration

Aura executes production choreographies through the Telltale protocol machine. The tell! macro emits the global type, projected local types, role metadata, and composition metadata that the runtime uses to build protocol-machine code images. AuraChoreoEngine in crates/aura-agent/src/runtime/choreo_engine.rs is the production runtime surface.

Aura targets the current public Telltale language/runtime surface directly. Generated code is sourced from .tell files, projected through the public session-type model, and admitted into the protocol machine without an Aura-local runner compatibility layer. For the upstream capability, finalization, semantic-handoff, and runtime-upgrade contract that this runtime model assumes, see Telltale docs/38_capability_model.md.

Generated runners still expose role-specific execution helpers. Aura keeps those helpers for tests, focused migration utilities, and narrow tooling paths. They are not the production execution boundary.

Generated runtime artifacts also carry the data that production startup needs:

  • provide_message for outbound payloads
  • select_branch for choice decisions
  • protocol id and determinism policy reference
  • required capability keys
  • link and delegation constraints
  • operational-envelope selection inputs

These values are sourced from runtime state such as params, journal facts, UI inputs, and manifest-driven admission state.

Aura has one production choreography backend:

  • protocol-machine backend (AuraChoreoEngine) for admitted Telltale runtime execution, replay, and parity checks.

The authoritative async ownership contract for how aura-agent hosts these sessions lives in crates/aura-agent/ARCHITECTURE.md.

That contract is intentionally split:

  • actor services structure the long-lived host runtime
  • move semantics define fragment, session, and endpoint ownership transfer

This distinction matters because delegate is not merely another actor message.

Direct generated-runner execution is test and migration support only.

Production runtime ownership is fragment-scoped. The admitted unit is one protocol fragment derived from the generated CompositionManifest. A manifest without link metadata yields one protocol fragment. A manifest with link metadata yields one fragment per linked bundle.

delegate and link define how ownership moves. Local runtime services claim fragment ownership through AuraEffectSystem. Runtime transfer goes through ReconfigurationManager. The runtime rejects ambiguous local ownership before a transfer reaches the protocol machine.

That rejection is fail-closed. Aura rejects stale or forged owner capabilities, rejects mismatched transfer evidence, and binds timeout expiry to issued timeout witnesses rather than reconstructing expiry from a later host-side elapsed-time guess.

The host runtime may use actor services to supervise the surrounding work, but fragment ownership itself remains a singular move boundary with stale-owner rejection.

Ownership and capability are also distinct here:

  • ownership answers which local runtime currently owns the fragment
  • capability answers which fragment-scoped effects that owner may drive

Delegation must define both the ownership handoff and the capability scope that moves with it.

Host-side async code must preserve that ownership model. External network, timer, and callback work enters through canonical ingress and is routed to the current local owner before any session mutation occurs.

VmBridgeEffects is the synchronous host boundary for one fragment. Protocol-machine callbacks use it for session-local payload queues, blocked receive snapshots, and scheduler signals. Async transport, journal, and storage work stay outside the callback path in the host bridge loop.

4. Choreography Annotations and Effect Commands

Choreographies support annotations that modify runtime behavior. The tell! macro extracts these annotations and generates EffectCommand sequences. This follows the choreography-first architecture where choreographic annotations are the canonical source of truth for guard requirements.

Supported Annotations

| Annotation | Description | Generated Effect |
|---|---|---|
| guard_capability = "namespace:capability" | Canonical capability requirement | StoreMetadata (audit trail) |
| flow_cost = N | Flow budget charge | ChargeBudget |
| journal_facts = "fact" | Journal fact recording | StoreMetadata (fact key) |
| journal_merge = true | Request journal merge | StoreMetadata (merge flag) |
| audit_log = "event" | Audit trail entry | StoreMetadata (audit key) |
| leak = "External" | Leakage tracking | RecordLeakage |

guard_capability is the string boundary for choreography DSL input. The macro parses it into a validated CapabilityName and rejects legacy, unnamespaced, or invalid values at compile time. Outside the DSL boundary, first-party Rust code should use typed capability families rather than hand-written strings.
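The namespace check can be illustrated with a small parser. CapabilityName here is a simplified stand-in, and any validation beyond the namespace:capability shape stated above is an assumption.

```rust
// Simplified stand-in for the compile-time namespace check; CapabilityName and
// the rules beyond the "namespace:capability" shape are assumptions.
#[derive(Debug, PartialEq)]
struct CapabilityName {
    namespace: String,
    capability: String,
}

fn parse_capability(raw: &str) -> Result<CapabilityName, String> {
    match raw.split_once(':') {
        Some((ns, cap)) if !ns.is_empty() && !cap.is_empty() => Ok(CapabilityName {
            namespace: ns.to_string(),
            capability: cap.to_string(),
        }),
        // Legacy, unnamespaced, or empty values are rejected.
        _ => Err(format!("invalid capability string: {raw:?}")),
    }
}

fn main() {
    assert!(parse_capability("chat:send").is_ok());
    assert!(parse_capability("send").is_err());  // unnamespaced
    assert!(parse_capability(":send").is_err()); // empty namespace
}
```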

See Choreography Development Guide for annotation syntax and usage, including protocol artifact requirements, dynamic reconfiguration, protocol evolution compatibility policy, termination budgets, effect command generation, macro output contracts, and effect interpreter integration.

5. Guard Chain Integration

Choreography annotations compile to EffectCommand sequences that feed the same guard chain used at runtime send sites (CapGuard, FlowGuard, JournalCoupler, LeakageTracker). Annotation-derived effects execute first, then runtime guards validate and charge budgets before transport. Guard evaluation is synchronous over a prepared GuardSnapshot and yields EffectCommand items interpreted asynchronously.

See Choreography Development Guide for guard chain integration patterns.

6. Execution Modes

Aura supports multiple execution environments for the same choreography definitions. Production execution uses admitted VM sessions with real effect handlers. Simulation execution uses deterministic time and fault injection. Test utilities may use narrower runner surfaces when that improves isolation.

Each environment preserves the same protocol structure and admission semantics where applicable. Choreography execution also captures conformance artifacts for native/WASM parity testing. See Test Infrastructure Reference for artifact surfaces and effect classification.

7. Example Protocols

Anti-entropy protocols synchronize CRDT state. They run as choreographies that exchange state deltas. Session types ensure that the exchange pattern follows causal and structural rules.

FROST ceremonies use choreographies to coordinate threshold signing. These ceremonies use the guard chain to enforce authorization rules.

Aura Consensus uses choreographic notation for fast path and fallback flows. Consensus choreographies define execute, witness, and commit messages. Session types ensure evidence propagation and correctness.

tell! {
    #[namespace = "sync"]
    protocol AntiEntropy {
        roles: A, B;
        A -> B: Delta(data: Vec<u8>);
        B -> A: Ack(data: Vec<u8>);
    }
}

This anti-entropy example illustrates a minimal synchronization protocol.

8. Operation Categories and Choreography Use

Not all multi-party operations require full choreographic specification. Aura classifies operations into categories that determine when choreography is necessary.

8.1 When to Use Choreography

Full choreography (Category C) is required for operations where partial execution is dangerous and all parties must agree before effects apply -- such as establishing or modifying cryptographic relationships. Operations within established cryptographic contexts (Category A) use CRDT fact emission without choreography. Operations affecting other users' policies (Category B) may use lightweight proposal/approval patterns.

See Choreography Development Guide for the decision framework. See Consensus - Operation Categories for detailed categorization.

9. Choreography Inventory

The codebase contains choreographic protocols spanning core consensus, rendezvous, authentication, recovery, invitation, sync, and runtime coordination -- approximately 15 protocols across 7 domains.

See Project Structure for the protocol inventory with locations and purposes.

10. Runtime Infrastructure

The runtime provides production choreographic execution through manifest-driven Telltale protocol-machine sessions.

10.1 ChoreographicEffects Trait

| Method | Purpose |
|---|---|
| send_to_role_bytes | Send message to specific role |
| receive_from_role_bytes | Receive message from specific role |
| broadcast_bytes | Broadcast to all roles |
| start_session | Initialize choreography session |
| end_session | Terminate choreography session |

AuraVmEffectHandler is the synchronous host boundary between the protocol machine and Aura runtime services. AuraQueuedVmBridgeHandler provides queued outbound payloads and branch decisions for role-scoped protocol-machine sessions.

10.2 Wiring a Choreography

Wiring a choreography involves storing the protocol in a .tell file, generating artifacts via tell!, opening an admitted protocol-machine session, and providing decision sources through the host bridge.

See Choreography Development Guide for the wiring procedure.

10.5 Output and Flow Policy Integration Points

Aura binds choreography execution to protocol-machine output/flow gates at the runtime boundary.

AuraVmEffectHandler tags protocol-machine-observable operations with output-condition predicate hints so OutputConditionPolicy can enforce commit visibility rules. The hardening profile allow-list admits only known predicates (transport send/recv, protocol choice/step, guard acquire/release). Unknown predicates are rejected in CI profiles.

Flow constraints are enforced with FlowPolicy::PredicateExpr(...) derived from Aura role/category constraints. This keeps pre-send flow checks aligned with Aura's information-flow contract while preserving deterministic replay behavior.

Practical integration points:

  1. Choreography annotations declare intent (guard_capability, flow_cost, journal_facts, leak).
  2. Macro output emits EffectCommand sequences.
  3. Snapshot builders evaluate typed capability candidates into an admitted frontier, and the guard chain evaluates commands and budgets at send sites.
  4. Protocol-machine output/flow policies gate observable commits and cross-role message flow before transport effects execute.

Choreography-level guard semantics and protocol-machine-level hardening are additive, not competing. Annotations define required effects. Policies constrain which effects are allowed to become observable.

11. Summary

Aura uses choreographic programming to define global protocols. Projection produces local session types. Session types enforce structured communication. Handlers execute protocol steps using effect traits. Extension effects provide authorization, budgeting, and journal updates. Execution modes support testing, simulation, and production. Choreographies define distributed coordination for CRDT sync, FROST signing, and consensus.

Not all multi-party operations need choreography. Operations within established cryptographic contexts use optimistic CRDT facts. Choreography is reserved for Category C operations where partial state would be dangerous.

Transport and Information Flow

This document describes the architecture of transport, guard chains, flow budgets, receipts, and information flow in Aura. It defines the secure channel abstraction and the enforcement mechanisms that regulate message transmission. It explains how context boundaries scope capabilities and budgets.

1. Transport Abstraction

Aura provides a transport layer that delivers payload-encrypted messages between authorities. Each transport connection is represented as a SecureChannel. A secure channel binds a pair of authorities and a context identifier. A secure channel maintains isolation across contexts.

The direct TCP/WebSocket emitters are byte-delivery mechanisms, not a substitute for payload encryption. Production direct transport rejects outbound envelopes unless the envelope either carries explicit aura-payload-encryption metadata with a non-plaintext value or uses an Aura content type that is already defined as encrypted. Test, simulation, and harness transports may bypass this send-time policy so deterministic fixtures can exercise routing without provisioning channel keys.

A secure channel exposes a send operation and a receive operation. The channel manages replay protection and handles connection teardown on epoch changes.

pub struct SecureChannel {
    pub context: ContextId,
    pub peer: AuthorityId,
    pub channel_id: Uuid,
}

This structure identifies a single secure channel. One channel exists per (ContextId, peer) pair. Channel metadata binds the channel to a specific context epoch.

2. Guard Chain

All transport sends pass through the guard chain defined in Authorization. CapGuard evaluates Biscuit capabilities and sovereign policy. FlowGuard charges the per-context flow budget and produces a receipt. JournalCoupler records the accompanying facts atomically. Each stage must succeed before the next stage executes. Guard evaluation runs synchronously over a prepared GuardSnapshot and returns EffectCommand data. An async interpreter executes those commands so guards never perform I/O directly.
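A reduced sketch of this fail-closed ordering follows. The guard names match the text, but the parameter types, costs, and error strings are illustrative assumptions.

```rust
// Reduced fail-closed sketch of the CapGuard -> FlowGuard -> JournalCoupler
// ordering; types, costs, and error strings are illustrative assumptions.
#[derive(Debug, PartialEq)]
enum EffectCommand {
    ChargeBudget(u32),
    CommitFacts,
    SendBytes,
}

fn run_guard_chain(
    has_capability: bool,
    budget_remaining: u32,
    cost: u32,
) -> Result<Vec<EffectCommand>, &'static str> {
    // CapGuard: Biscuit capabilities and sovereign policy.
    if !has_capability {
        return Err("denied: capability check failed");
    }
    // FlowGuard: charge the per-context flow budget.
    if cost > budget_remaining {
        return Err("blocked: flow budget exhausted");
    }
    // JournalCoupler: facts commit atomically with the send. Guards only
    // return commands; an async interpreter performs the actual I/O.
    Ok(vec![
        EffectCommand::ChargeBudget(cost),
        EffectCommand::CommitFacts,
        EffectCommand::SendBytes,
    ])
}

fn main() {
    assert!(run_guard_chain(false, 10, 1).is_err()); // denial: no traffic
    assert!(run_guard_chain(true, 0, 1).is_err());   // block: no traffic
    assert_eq!(run_guard_chain(true, 10, 1).unwrap().len(), 3);
}
```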

3. Flow Budget and Receipts

Flow budgets limit the amount of data that an authority may send within a context. The flow budget model defines a quota for each (ContextId, peer) pair. A reservation system protects against race conditions.

Missing budget state or budget retrieval errors do not create implicit headroom. Zero limits mean no spend is available unless a future explicit typed unlimited-budget policy says otherwise, and guard-time checks must match charge-time semantics.

An authority must reserve budget before sending. A reservation locks a portion of the available budget. The actual charge occurs during the guard chain. If the guard chain succeeds, a receipt is created.
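Assuming a simple numeric quota, the reserve-then-charge flow might look like the sketch below; the real FlowBudget type lives in aura-core and carries more state.

```rust
// Reserve-then-charge sketch under a simple numeric quota; the real FlowBudget
// and Receipt types in aura-core carry more state than this.
struct FlowBudget {
    quota: u32,
    reserved: u32,
    spent: u32,
}

impl FlowBudget {
    fn available(&self) -> u32 {
        self.quota - self.reserved - self.spent
    }

    /// A reservation locks a portion of the budget before the send attempt,
    /// protecting against concurrent sends racing for the same headroom.
    fn reserve(&mut self, cost: u32) -> Result<(), &'static str> {
        if cost > self.available() {
            return Err("blocked: insufficient budget");
        }
        self.reserved += cost;
        Ok(())
    }

    /// The actual charge happens in the guard chain; success yields a receipt.
    fn charge(&mut self, cost: u32) {
        self.reserved -= cost;
        self.spent += cost;
    }
}

fn main() {
    let mut budget = FlowBudget { quota: 10, reserved: 0, spent: 0 };
    assert!(budget.reserve(6).is_ok());
    assert!(budget.reserve(6).is_err()); // only 4 units of headroom remain
    budget.charge(6);
    assert_eq!(budget.spent, 6);
    assert_eq!(budget.available(), 4);
}
```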

/// From aura-core/src/types/flow.rs
pub struct Receipt {
    pub ctx: ContextId,
    pub src: AuthorityId,
    pub dst: AuthorityId,
    pub epoch: Epoch,
    pub cost: FlowCost,
    pub nonce: FlowNonce,
    pub prev: Hash32,
    pub sig: ReceiptSig,
}

This structure defines a receipt. A receipt binds a cost to a specific context and epoch. The sender signs the receipt. The nonce ensures uniqueness and the prev field chains receipts for auditing. The recipient verifies the signature. Receipts support accountability in multi-hop routing.
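Receipt chaining via prev can be sketched with a toy digest: fold_hash below is a stand-in for the production Hash32, and signatures are omitted entirely.

```rust
// Toy receipt chaining via `prev`; `fold_hash` is an FNV-style stand-in for
// the production Hash32 digest, and signatures are omitted.
fn fold_hash(prev: u64, cost: u32, nonce: u64) -> u64 {
    let mut h = prev ^ 0xcbf2_9ce4_8422_2325;
    for b in cost.to_le_bytes().iter().chain(nonce.to_le_bytes().iter()) {
        h = (h ^ *b as u64).wrapping_mul(0x0100_0000_01b3);
    }
    h
}

struct ReceiptLink {
    cost: u32,
    nonce: u64,
    prev: u64,   // digest of the previous receipt in the chain
    digest: u64, // digest over (prev, cost, nonce)
}

fn append_receipt(chain: &mut Vec<ReceiptLink>, cost: u32, nonce: u64) {
    let prev = chain.last().map(|r| r.digest).unwrap_or(0);
    let digest = fold_hash(prev, cost, nonce);
    chain.push(ReceiptLink { cost, nonce, prev, digest });
}

/// An auditor verifies that every link binds to its predecessor.
fn verify_chain(chain: &[ReceiptLink]) -> bool {
    let mut prev = 0u64;
    for r in chain {
        if r.prev != prev || r.digest != fold_hash(prev, r.cost, r.nonce) {
            return false;
        }
        prev = r.digest;
    }
    true
}

fn main() {
    let mut chain = Vec::new();
    append_receipt(&mut chain, 3, 1);
    append_receipt(&mut chain, 5, 2);
    assert!(verify_chain(&chain));
    chain[1].prev = 42; // tampering breaks the chain
    assert!(!verify_chain(&chain));
}
```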

4. Information Flow Budgets

Information flow budgets define limits on metadata leakage. Budgets exist for external leakage, neighbor leakage, and group leakage. Each protocol message carries leakage annotations. These annotations specify the cost for each leakage dimension.

Leakage budgets determine if a message can be sent. If the leakage cost exceeds the remaining budget, the message is denied. Enforcement uses padding and batching strategies. Padding hides message size. Batching hides message frequency.

pub struct LeakageBudget {
    pub external: u32,
    pub neighbor: u32,
    pub in_group: u32,
}

This structure defines the leakage budget for a message. Leakage costs reduce the corresponding budget on successful send.
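A minimal admission check over the three dimensions, debiting only on success, might look like this; the function name and the all-or-nothing semantics are assumptions.

```rust
// All-or-nothing admission check over the three leakage dimensions; the
// function name and debit-on-success semantics are assumptions.
struct LeakageBudget {
    external: u32,
    neighbor: u32,
    in_group: u32,
}

fn try_spend(budget: &mut LeakageBudget, cost: &LeakageBudget) -> bool {
    let fits = cost.external <= budget.external
        && cost.neighbor <= budget.neighbor
        && cost.in_group <= budget.in_group;
    if fits {
        budget.external -= cost.external;
        budget.neighbor -= cost.neighbor;
        budget.in_group -= cost.in_group;
    }
    fits // false: the send is denied with no observable behavior
}

fn main() {
    let mut budget = LeakageBudget { external: 5, neighbor: 5, in_group: 5 };
    assert!(try_spend(&mut budget, &LeakageBudget { external: 3, neighbor: 0, in_group: 1 }));
    assert!(!try_spend(&mut budget, &LeakageBudget { external: 3, neighbor: 0, in_group: 0 }));
    assert_eq!(budget.external, 2); // denied sends leave the budget untouched
}
```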

5. Context Integration

Capabilities and flow budgets are scoped to a ContextId. Each secure channel associates all guard decisions with its context. A capability is valid only for the context in which it was issued. A flow budget applies only within the same context.

Derived context keys bind communication identities to the current epoch. When the account epoch changes, all context identities must refresh. All secure channels for the context must be renegotiated.

pub struct ChannelContext {
    pub context: ContextId,
    pub epoch: u64,
    pub caps: Vec<Capability>,
}

This structure defines the active context state for a channel. All guard chain checks use these values.

Bootstrap and stale-node re-entry surfaces fit this model as discovery inputs, not as transport exceptions. Shared bootstrap contact hints, neighborhood re-entry hints, and bounded bootstrap introductions may advertise scoped link endpoints plus expiry and replay bounds. Runtime code can merge those records into local bootstrap selection state, but the transport layer still uses the same guarded Establish, Move, and Hold path objects after a candidate is chosen.

6. Failure Modes and Observability

The guard chain defines three categories of failure. A denial failure occurs when capability requirements are not met. A block failure occurs when a flow budget check fails. A commit failure occurs when journal coupling fails.

Denial failures produce no observable behavior. Block failures also produce no observable behavior. Commit failures prevent sending and produce local error logs. None of these failures result in network traffic.

This design ensures that unauthorized or over-budget sends do not produce side channels.

7. Security Properties

Aura enforces no observable behavior without charge. A message cannot be sent unless flow budget is charged first. Capability gated sends ensure that each message satisfies authorization rules. Receipts provide accountability for multi-hop forwarding.

The network layer does not reveal authority structure. Context identifiers do not reveal membership. All metadata is scoped to individual relationships.

8. Secure Channel Lifecycle

Secure channels follow a lifecycle aligned with rendezvous and epoch semantics.

Establishment begins with rendezvous per Rendezvous Architecture to exchange descriptors inside the Relational Contexts journal. Each descriptor contains transport hints, a handshake PSK derived from the context key, and a punch_nonce. Once both parties receive offer/answer envelopes, they perform Noise IKpsk2 using context-derived keys and establish a QUIC or relay-backed channel bound to (ContextId, peer).

During steady state, the guard chain enforces CapGuard, FlowGuard, and JournalCoupler for every send. FlowBudget receipts created on each hop are inserted into the Relational Contexts journal so downstream peers can audit path compliance.

Bootstrap discovery does not bypass this lifecycle. Broker-backed or board-backed discovery can surface candidates, but once a candidate is selected the resulting channel or movement setup uses the same guarded transport machinery, receipt flow, and context scoping as any other path.

When the account or context epoch changes, the channel detects the mismatch, tears down the existing Noise session, and triggers rendezvous to derive fresh keys. Existing receipts are marked invalid for the new epoch, preventing replay. Channels close explicitly when contexts end or when FlowGuard hits the configured budget limit. Receipts emitted during teardown propagate through the relational context journal so guardians or auditors can verify all hops charged their budgets. Tying establishment and teardown to relational context journals ensures receipts become part of the same fact set tracked in Distributed Maintenance Architecture.

9. Privacy-by-Design Patterns

Privacy-by-design is enforced through context isolation, fixed-size envelopes, and flow budgets as defined in sections 1-7. All messages are scoped to a ContextId or RelationshipId with no cross-context routing. Capability hints are blinded before network transmission. The guard chain ensures unauthorized and over-budget sends produce no network traffic or timing side channels.

See Effects and Handlers Guide for privacy-aware implementation patterns.

10. Sync Status and Delivery Tracking

The system exposes sync status for Category A (optimistic) operations through Propagation state (Local, Syncing, Complete, Failed) and Acknowledgment records per peer. Delivery status is derived from consistency metadata, not stored directly. Read receipts are semantic (user viewed the message) and distinct from transport-level delivery acknowledgments.

Category B operations use proposal/approval state. Category C operations use ceremony completion status. Lifecycle modes (A1/A2/A3) apply within these categories: A1/A2 updates are usable immediately but provisional until A3 consensus finalization.

See Operation Categories for the full consistency metadata type definitions. See Effects and Handlers Guide for delivery tracking patterns.

10.1 Adaptive Privacy Movement

MoveEnvelope is the shared accountable movement boundary for relay traffic, retrieval traffic, held-object deposit and retrieval, accountability replies, and cover traffic where those flows use the adaptive privacy substrate. These flows do not regain separate mailbox, retrieval, relay, or cache-specific transport families.

Movement runs over an explicit path. The path may be a direct established path in passthrough mode, or an anonymous EstablishedPath in privacy mode. Move does not smuggle route setup back into the envelope.

The runtime schedules movement through three classes. Sync-blended traffic rides anti-entropy windows when deadlines allow it. Bounded-deadline replies carry accountability and control traffic that needs shorter latency. Synthetic cover fills the remaining cover floor after application traffic and sync-blended retrieval are counted.

Accountability replies share the same movement substrate, but the first deployment measures them separately. They do not reduce the synthetic cover floor. This prevents mandatory witness traffic from being mistaken for discretionary cover.
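
The cover-floor accounting above can be sketched as a single function; the name and units are assumptions for illustration.

```rust
// Hypothetical sketch: synthetic cover fills whatever remains of the cover
// floor after application traffic and sync-blended retrieval are counted.
// Accountability replies are measured separately and do not reduce the floor.
fn synthetic_cover_needed(cover_floor: u64, app_traffic: u64, sync_blended: u64) -> u64 {
    cover_floor.saturating_sub(app_traffic + sync_blended)
}
```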

10.2 AMP Channel Epoch Data Plane

AMP message acceptance is subordinate to reducer-derived channel epoch state. The ratchet consumes the reduced control-plane view from the relational context journal; it does not decide membership truth or choose among competing successor epochs.

A received AMP message is accepted only when all of the following hold:

  • the message epoch is the stable epoch or the single reducer-exposed A2Live successor epoch
  • the sender is a member of that exact epoch's membership commitment
  • the generation is within the configured skip window or acceptance horizon
  • cryptographic validation succeeds
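
A minimal sketch of this acceptance predicate, assuming simplified stand-in types; real membership and cryptographic checks are reduced to booleans for illustration.

```rust
// Simplified stand-in for the reduced epoch view; real types are richer.
struct EpochView {
    stable_epoch: u64,
    live_successor: Option<u64>, // the single reducer-exposed A2Live successor
}

// Sketch of the four acceptance conditions: epoch, membership, generation
// horizon, and cryptographic validity must all hold.
fn accept_message(
    view: &EpochView,
    msg_epoch: u64,
    sender_in_epoch_membership: bool,
    msg_gen: u64,
    acceptance_horizon: u64,
    crypto_valid: bool,
) -> bool {
    let epoch_ok = msg_epoch == view.stable_epoch || Some(msg_epoch) == view.live_successor;
    epoch_ok && sender_in_epoch_membership && msg_gen <= acceptance_horizon && crypto_valid
}
```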

Dual-epoch acceptance is policy-limited. Additive and non-removal transitions may permit bounded old-epoch receive overlap. Subtractive, removal, revocation, and emergency transitions require stricter old-epoch handling so removed or suspected participants cannot keep sending indefinitely while the network is slow.

Runtime ratchet derivation follows the same rule. Sends use the stable epoch unless the reducer exposes exactly one A2Live successor, in which case sends cut over to that successor. Receives allow stable-plus-successor overlap for additive and ordinary non-removal transitions, reject old-epoch traffic for subtractive and cryptoshred transitions, and allow only minimal old-epoch grace for quarantine transitions. Emergency suspect exclusions are enforced at AMP send boundaries and do not imply authority-root membership changes.

10.3 Emergency AMP Policies

AMP emergency transitions are control-plane epoch transitions, not informal warning messages.

EmergencyQuarantineTransition excludes the suspect from the successor epoch once the A2 certificate makes the successor live. New application sends cut over to the successor epoch immediately. Old-epoch receive grace is minimal and must be explicitly authorized by the transition policy. Implementations erase old sender keys, receiver chain keys, skipped-message key caches, staged epoch material, and channel-adjacent access material aggressively, subject to local retention policy.

EmergencyCryptoshredTransition is the strongest channel-scoped emergency mode. When its successor becomes A2Live, ordinary pre-emergency readable state is destroyed immediately according to local cryptoshredding policy. A3 finalization later durably commits the already-live successor; it does not delay cryptoshredding.

Emergency transitions protect future traffic and reduce future at-rest exposure on honest devices. They do not retroactively protect data already seen, decrypted, or exfiltrated before cutover.

Operator diagnostics must present emergency AMP actions with that limitation: quarantine and cryptoshred reduce future exposure and local readable remnants after the reducer exposes the emergency successor, but they are not an incident response guarantee for content already copied outside honest devices. Cooldown and accusation diagnostics are generation/evidence based; wall-clock timers are operator display metadata only and are not reducer inputs.

Channel emergency facts do not automatically remove authority-root membership or recovery/governance rights. Recovery suspension, governance suspension, and durable structural removal are separate authority-scoped governance actions with their own thresholds.

11. Anti-Entropy Sync Protocol

Anti-entropy implements journal synchronization between peers. The protocol exchanges digests, plans reconciliation, and transfers operations.

11.1 Sync Phases

A sync round begins by loading the local Journal and operation log, then computing a JournalDigest for the local state. The digest is exchanged with the peer. The two digests are compared to determine whether the states are equal, whether one side is behind, or whether they have diverged. Missing operations are then pulled or pushed in batches. Applied operations are converted to a journal delta, merged with the local journal, and persisted once per round.

11.2 Digest Format

#![allow(unused)]
fn main() {
pub struct JournalDigest {
    pub operation_count: u64,
    pub last_epoch: Option<u64>,
    pub operation_hash: Hash32,
    pub fact_hash: Hash32,
    pub caps_hash: Hash32,
}
}

The operation_count is the number of operations in the local op log. The last_epoch is the maximum parent_epoch observed, or None if the log is empty. The operation_hash is computed by streaming op fingerprints in deterministic order. The fact_hash and caps_hash are computed by canonical serialization (DAG-CBOR) followed by hashing.
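
Under stand-in types, the digest computation can be sketched as below; a real implementation streams fingerprints into a cryptographic hash over canonical DAG-CBOR, not std's DefaultHasher.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in digest with u64 hashes instead of Hash32.
struct Digest {
    operation_count: u64,
    last_epoch: Option<u64>,
    operation_hash: u64,
}

// Each op is (parent_epoch, fingerprint); iteration order is the
// deterministic op-log order.
fn compute_digest(ops: &[(u64, u64)]) -> Digest {
    let mut hasher = DefaultHasher::new();
    for (_, fingerprint) in ops {
        fingerprint.hash(&mut hasher);
    }
    Digest {
        operation_count: ops.len() as u64,
        last_epoch: ops.iter().map(|(epoch, _)| *epoch).max(),
        operation_hash: hasher.finish(),
    }
}
```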

11.3 Reconciliation Actions

| Digest Comparison | Action |
| --- | --- |
| Equal | No-op |
| LocalBehind | Request missing ops |
| RemoteBehind | Push ops |
| Diverged | Push + pull |

Retry behavior follows AntiEntropyConfig.retry_policy with exponential backoff. Failures are reported with structured context attributing each error to a specific phase and peer.
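
As a simplified sketch of the comparison table, using only operation counts and hashes (the real comparison consults the full JournalDigest, so this heuristic is an assumption):

```rust
#[derive(Debug, PartialEq)]
enum SyncAction {
    NoOp,           // Equal
    RequestMissing, // LocalBehind
    PushOps,        // RemoteBehind
    PushAndPull,    // Diverged
}

// Each side is summarized as (operation_count, operation_hash).
fn plan_sync(local: (u64, u64), remote: (u64, u64)) -> SyncAction {
    let ((local_count, local_hash), (remote_count, remote_hash)) = (local, remote);
    if local_hash == remote_hash {
        SyncAction::NoOp
    } else if local_count < remote_count {
        SyncAction::RequestMissing
    } else if local_count > remote_count {
        SyncAction::PushOps
    } else {
        SyncAction::PushAndPull
    }
}
```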

See Choreography Development Guide for anti-entropy implementation.

12. Protocol Version Negotiation

All choreographic protocols participate in version negotiation during connection establishment.

12.1 Version Handshake Flow

sequenceDiagram
    participant I as Initiator
    participant R as Responder
    I->>R: VersionHandshakeRequest(version, min_version, capabilities, nonce)
    R->>I: VersionHandshakeResponse(Accepted/Rejected)
    alt Compatible
        Note over I,R: Use negotiated version
    else Incompatible
        Note over I,R: Disconnect
    end

See Choreography Development Guide for version handshake implementation.

12.2 Handshake Outcomes

| Outcome | Response Contents |
| --- | --- |
| Compatible | negotiated_version (min of both peers), shared capabilities |
| Incompatible | reason, peer version, optional upgrade_url |
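
The outcome rule can be sketched as follows: the negotiated version is the minimum of the two peer versions, it must satisfy both peers' minimums, and shared capabilities are the set intersection. The types and function names here are illustrative assumptions.

```rust
use std::collections::BTreeSet;

// (major, minor, patch) stand-in for a semver version.
type Version = (u16, u16, u16);

#[derive(Debug, PartialEq)]
enum Outcome {
    Compatible { negotiated: Version },
    Incompatible,
}

fn negotiate(
    local: Version, local_min: Version,
    peer: Version, peer_min: Version,
    local_caps: &BTreeSet<&str>, peer_caps: &BTreeSet<&str>,
) -> (Outcome, BTreeSet<String>) {
    let negotiated = local.min(peer); // min of both peers
    let shared = local_caps.intersection(peer_caps).map(|c| c.to_string()).collect();
    if negotiated >= local_min && negotiated >= peer_min {
        (Outcome::Compatible { negotiated }, shared)
    } else {
        (Outcome::Incompatible, shared)
    }
}
```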

12.3 Protocol Capabilities

CapabilityMin VersionDescription
ceremony_supersession1.0.0Ceremony replacement tracking
version_handshake1.0.0Protocol version negotiation
fact_journal1.0.0Fact-based journal sync

13. Summary

The transport, guard chain, and information flow architecture enforces strict control over message transmission. Secure channels bind communication to contexts. Guard chains enforce authorization, budget, and journal updates. Flow budgets and receipts regulate data usage. Leakage budgets reduce metadata exposure. Privacy-by-design patterns ensure minimal metadata exposure and context isolation. All operations remain private to the context and reveal no structural information.

Sync status and delivery tracking provide user visibility into Category A operation propagation. Anti-entropy provides the underlying sync mechanism with digest-based reconciliation. Version negotiation ensures protocol compatibility across peers. Read receipts surface message-viewed status for enhanced UX.

Aura Messaging Protocol (AMP)

AMP is a secure asynchronous messaging protocol for Aura. It operates within relational contexts and channels. The protocol provides strong post-compromise security and bounded forward secrecy without head-of-line blocking.

1. Scope and Goals

AMP assumes shared state in joint semilattice journals is canonical. Secrets derive locally from shared state combined with authority keys. Ratchet operations remain deterministic and recoverable.

The protocol targets four properties. No head-of-line blocking occurs during message delivery. Strong post-compromise security restores confidentiality after key exposure. Bounded forward secrecy limits exposure within skip windows. Deterministic recovery enables key rederivation from journal state.

1.1 Design Requirements

AMP addresses constraints that existing messaging protocols cannot satisfy.

Deterministic recovery requires all ratchet state to be derivable from replicated facts. The journal is the only durable state. Secrets derive from reduced journal state combined with authority keys. No device may maintain ratchet state that cannot be recovered by other devices in the authority.

Multi-device authorities require support for concurrent message sends from different devices within the same authority. External parties cannot observe which device sent a message. All devices must converge to the same ratchet position after merging facts.

Selective consistency requires both eventual consistency for message sends and strong agreement for durable epoch transitions. The protocol cannot assume a central ordering service.

Authorization integration requires every message send to pass through a guard chain before reaching the network. Authorization failure must prevent key derivation. Budget charges must be atomic with the send.

1.2 Optimistic Operations Within Contexts

Channels are encrypted substreams within an existing relational context. The relational context, established via invitation ceremony, already provides the shared secret foundation. No new key agreement is needed. Channel creation is local: a device emits a ChannelCheckpoint fact into the existing context journal. Members must already share the context.

Channel creation and messaging are Category A operations. Both parties already derived the context root via the invitation ceremony. Channel facts sync via normal journal anti-entropy. Key derivation is deterministic: KDF(ContextRoot, ChannelId, epoch) produces the ChannelBaseKey. The cryptographic foundation was established when the relationship was created. See Operation Categories for the full classification.
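
The determinism of the key derivation can be sketched as below; DefaultHasher is only a stand-in to illustrate that the derivation is a pure function of its inputs, whereas a real implementation uses a proper KDF over the actual key material.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in sketch of KDF(ContextRoot, ChannelId, epoch) -> ChannelBaseKey.
// All inputs are u64 stand-ins for the real types.
fn channel_base_key(context_root: u64, channel_id: u64, epoch: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    (context_root, channel_id, epoch).hash(&mut hasher);
    hasher.finish()
}
```

Because every device holds the same reduced context root, every device derives the same base key without coordination.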

1.2.1 Bootstrap Messaging (Dealer Key, Provisional)

For newly-created channels where the group key ceremony has not completed yet, AMP supports a bootstrap epoch using a trusted-dealer key (K2/A1). This enables participants to exchange encrypted messages immediately without waiting for consensus-finalized group keys.

The dealer generates a bootstrap key (K_boot) during channel creation and emits an AmpChannelBootstrap fact recording the bootstrap metadata but not the key itself. The dealer distributes K_boot per invitee over a Noise IKpsk2 channel established through rendezvous, authenticated by identity keys and a PSK derived from the relational context. Participants store K_boot locally in secure storage keyed by (context, channel, bootstrap_id). Messages in epoch 0 use K_boot for encryption and decryption.

Messages are encrypted but confidentiality is provisional and dealer-trusted. K_boot never appears in the journal. Only the bootstrap_id hash is recorded. Late joiners do not receive K_boot and therefore cannot read bootstrap-epoch messages.

After the group key ceremony completes, the channel bumps to the next epoch. Messages in epoch 1 and later use the canonical group key derivation. Bootstrap messages remain decryptable only to members who stored K_boot.

1.3 AMP Lifecycle (A1/A2/A3)

AMP channel epoch transitions use the A1 -> A2 -> A3 agreement ladder with an AMP-specific live-state rule. This rule applies only to AMP channel epoch and membership state inside a relational context. It does not modify authority-root membership, account commitment trees, device enrollment, guardian rotation, or recovery execution.

LevelAMP meaningSend authority
A1A syntactically valid AmpProposedChannelEpochBump fact exists.No successor-epoch sends.
A2Exactly one valid unsuppressed AmpCertifiedChannelEpochBump exists for the parent epoch.Successor is live for AMP send and receive policy.
A3Consensus-backed finalization selected one transition as durable.Successor is durable and live.

The normal path finalizes the same transition_id that was previously exposed as A2 live. See Operation Categories for the agreement taxonomy and Consensus for consensus evidence binding.

2. Channel Lifecycle Surface

The AmpChannelEffects trait defines the canonical API for AMP channel lifecycle and messaging. This trait lives in aura-core::effects::amp.

#![allow(unused)]
fn main() {
#[async_trait]
pub trait AmpChannelEffects: Send + Sync {
    async fn create_channel(&self, params: ChannelCreateParams) -> Result<ChannelId, AmpChannelError>;
    async fn close_channel(&self, params: ChannelCloseParams) -> Result<(), AmpChannelError>;
    async fn join_channel(&self, params: ChannelJoinParams) -> Result<(), AmpChannelError>;
    async fn leave_channel(&self, params: ChannelLeaveParams) -> Result<(), AmpChannelError>;
    async fn send_message(&self, params: ChannelSendParams) -> Result<AmpCiphertext, AmpChannelError>;
}
}

The create_channel method writes an AMP checkpoint and policy for a context-scoped channel. The close_channel method records a terminal epoch bump and policy closure. The join_channel and leave_channel methods record membership facts. The send_message method derives current channel state and returns AmpCiphertext containing the header and encrypted payload.

2.1 Implementations

The runtime implementation uses AmpChannelCoordinator in aura-protocol::amp::channel_lifecycle. The simulator implementation uses SimAmpChannels in aura-simulator::amp. The testkit provides MockEffects implementing AmpChannelEffects for deterministic unit tests.

3. Terminology

3.1 Aura Terms

An authority is an account authority with a commitment tree and FROST keys. A relational context is shared state between authorities identified by ContextId. A journal is a CRDT OR-set of facts with monotone growth and deterministic reduction.

3.2 AMP Terms

A channel is a messaging substream scoped to a relational context. The channel epoch bounds post-compromise security and serves as the KDF base. The ratchet generation is a monotone position derived from reduced journal state. The skip window defines out-of-order tolerance with a default of 1024 generations.

A checkpoint is a journal fact anchoring ratchet windows. The alternating ratchet maintains two overlapping windows at boundaries. A transition proposal is an observed request to move from a parent epoch to a successor epoch.

An AMP transition identity binds context_id, channel_id, parent_epoch, parent_commitment, successor_epoch, successor_commitment, membership_commitment, and transition_policy. The typed digest over this tuple is the transition_id. All proposal, certificate, commit, abort, conflict, and supersession facts for the same transition must bind the same transition_id.

An A2 certificate is a witness-backed AmpCertifiedChannelEpochBump fact. An A3 finalization is an AmpFinalizedChannelEpochBump fact backed by consensus evidence. A live successor is the single successor exposed by deterministic reduction for current AMP traffic. A durable successor is the successor finalized by A3 evidence.

4. Commitment Tree Integration

The commitment tree defines authority structure and provides the foundation for key derivation. Each authority maintains an internal tree with branch nodes and leaf nodes. Leaf nodes represent devices holding threshold signing shares. Branch nodes represent subpolicies expressed as m-of-n thresholds.

#![allow(unused)]
fn main() {
pub struct TreeState {
    pub epoch: Epoch,
    pub root_commitment: TreeHash32,
    pub branches: BTreeMap<NodeIndex, BranchNode>,
    pub leaves: BTreeMap<LeafId, LeafNode>,
}
}

The TreeState represents the result of reducing all operations in the OpLog. The epoch field scopes all derived keys. The root_commitment field is a Merkle hash over the ordered tree structure. External parties see only the epoch and root commitment.

Tree operations modify device membership and policies. The AddLeaf operation inserts a new device. The RemoveLeaf operation removes an existing device. The ChangePolicy operation updates threshold requirements. The RotateEpoch operation increments the epoch and invalidates all derived keys.

Each operation appears in the journal as an attested operation signed by the required threshold of devices. The tree supports concurrent updates through deterministic conflict resolution.

Multiple Concurrent Ops → Group by Parent Commitment → Select Winner by Hash Order → Apply in Epoch Order → Single TreeState

The channel base key for a given ChannelId and epoch derives as KDF(TreeRoot, ChannelId, epoch). All devices in the authority compute the same base key.

5. Fact Structures

AMP uses facts inserted into the relational context journal. All facts are monotone. Reduction determines canonical state.

5.1 Channel Checkpoint

#![allow(unused)]
fn main() {
pub struct ChannelCheckpoint {
    pub context: ContextId,
    pub channel: ChannelId,
    pub chan_epoch: u64,
    pub base_gen: u64,
    pub window: u32,
    pub ck_commitment: Hash32,
    pub skip_window_override: Option<u32>,
}
}

Reduction chooses one canonical checkpoint per context, channel, and epoch tuple. The valid ratchet generation set is the union of two windows. Window A spans from base_gen to base_gen + window. Window B spans from base_gen + window + 1 to base_gen + 2 * window. Checkpoints enable deterministic recovery and serve as garbage collection anchors.
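
Because Window B starts immediately after Window A ends, the union is a contiguous span, which makes the validity check simple. A sketch under that observation:

```rust
// Dual-window validity: the valid set is the union of
// Window A [base_gen, base_gen + window] and
// Window B [base_gen + window + 1, base_gen + 2*window],
// i.e. the contiguous span [base_gen, base_gen + 2*window].
fn gen_in_windows(gen: u64, base_gen: u64, window: u64) -> bool {
    gen >= base_gen && gen <= base_gen + 2 * window
}
```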

5.2 Channel Epoch Transition Facts

#![allow(unused)]
fn main() {
pub struct ProposedChannelEpochBump {
    pub context: ContextId,
    pub channel: ChannelId,
    pub parent_epoch: u64,
    pub new_epoch: u64,
    pub bump_id: Hash32,
    pub reason: ChannelBumpReason,
    pub parent_commitment: Hash32,
    pub successor_commitment: Hash32,
    pub membership_commitment: Hash32,
    pub transition_policy: AmpTransitionPolicy,
    pub transition_id: Hash32,
}
}

AmpProposedChannelEpochBump records A1 observed state. It is useful evidence for operators and reducers. It does not authorize successor-epoch sends by itself.

#![allow(unused)]
fn main() {
pub struct CertifiedChannelEpochBump {
    pub identity: AmpTransitionIdentity,
    pub transition_id: Hash32,
    pub witness_payload_digest: Hash32,
    pub committee_digest: Hash32,
    pub threshold: u16,
    pub fault_bound: u16,
    pub witness_signatures: Vec<AmpTransitionWitnessSignature>,
    pub equivocation_refs: BTreeSet<Hash32>,
    pub excluded_authorities: BTreeSet<AuthorityId>,
    pub readable_state_destroyed: bool,
}
}

AmpCertifiedChannelEpochBump records A2 soft-safe evidence. The certificate binds the exact parent prestate, successor epoch, successor membership, transition policy, witness committee, threshold, and witness signatures. If it is the only valid unsuppressed certificate for the parent epoch, reduction exposes the successor as live.

#![allow(unused)]
fn main() {
pub struct CommittedChannelEpochBump {
    pub context: ContextId,
    pub channel: ChannelId,
    pub parent_epoch: u64,
    pub new_epoch: u64,
    pub chosen_bump_id: Hash32,
    pub consensus_id: Hash32,
    pub transcript_ref: Option<Hash32>,
}
}

AmpCommittedChannelEpochBump is the legacy committed-bump shape. It remains part of the protocol fact set for compatibility and recovery fixtures. New transition-aware reducers use the canonical transition fields when they are present.

#![allow(unused)]
fn main() {
pub struct FinalizedChannelEpochBump {
    pub identity: AmpTransitionIdentity,
    pub transition_id: Hash32,
    pub consensus_id: Hash32,
    pub transcript_ref: Option<Hash32>,
    pub excluded_authorities: BTreeSet<AuthorityId>,
    pub readable_state_destroyed: bool,
}
}

AmpFinalizedChannelEpochBump records A3 durability for one transition. The transcript_ref optionally links the transition to a DKG transcript for key ceremony coordination.

5.3 Transition Evidence Facts

AMP transition reduction also consumes AmpTransitionAbort, AmpTransitionConflict, AmpTransitionSupersession, and AmpEmergencyAlarm facts. These facts suppress unsafe live exposure, record equivocation evidence, authorize replacement paths, and surface emergency suspicion without changing authority-root membership.

Abort and supersession facts include an AmpTransitionSuppressionScope. The A2LiveOnly scope suppresses live use but does not by itself reject later A3 evidence. The A2AndA3 scope suppresses both live use and durable finalization for the affected transition.

5.4 Channel Bump Reason and Transition Policy

#![allow(unused)]
fn main() {
pub enum ChannelBumpReason {
    Routine,
    SuspiciousActivity,
    ConfirmedCompromise,
}
}

The Routine variant indicates cadence-based maintenance. The SuspiciousActivity variant indicates detected anomalies such as AEAD failures or ratchet conflicts. The ConfirmedCompromise variant requires immediate post-compromise security restoration. Both suspicious activity and confirmed compromise bypass routine spacing rules.

AmpTransitionPolicy refines these reasons into NormalTransition, AdditiveTransition, SubtractiveTransition, EmergencyQuarantineTransition, and EmergencyCryptoshredTransition. The policy determines old-epoch receive overlap, suspect exclusion, and cryptoshred behavior after an A2 live successor is exposed.

6. Derived Channel State

Reduction yields a ChannelEpochState struct containing the stable epoch, optional live transition, bootstrap metadata, and ratchet position.

#![allow(unused)]
fn main() {
pub struct ChannelEpochState {
    pub chan_epoch: u64,
    pub pending_bump: Option<PendingBump>,
    pub bootstrap: Option<ChannelBootstrap>,
    pub last_checkpoint_gen: u64,
    pub current_gen: u64,
    pub skip_window: u32,
    pub transition: Option<AmpTransitionReduction>,
}
}

chan_epoch is the stable reduced epoch. pending_bump is derived only when exactly one unsuppressed A2 certificate exposes a live successor. It is not derived from proposal order or local preference.

#![allow(unused)]
fn main() {
pub enum AmpTransitionReductionStatus {
    Observed,
    A2Live,
    A2Conflict,
    A3Finalized,
    A3Conflict,
    Aborted,
    Superseded,
}
}

The status describes the reducer result for one parent prestate. Observed has no live successor. A2Live exposes one live successor. A3Finalized exposes one durable successor. Conflict, abort, and supersession states suppress live use unless explicit replacement evidence selects a valid successor.

#![allow(unused)]
fn main() {
pub struct PendingBump {
    pub parent_epoch: u64,
    pub new_epoch: u64,
    pub bump_id: Hash32,
    pub reason: ChannelBumpReason,
    pub transition_id: Hash32,
    pub transition_policy: AmpTransitionPolicy,
}
}

PendingBump is the compatibility projection for the live A2 successor. The richer transition reduction remains available through RelationalState::amp_transitions and ChannelEpochState::transition. See Journal for the reducer contract.

6.1 Skip Window Computation

The skip window derives from three sources in priority order. The checkpoint skip_window_override takes precedence. The channel policy fact applies if no override exists. The default of 1024 applies otherwise. The skip window size and bump cadence are governable via context policy facts.
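
This priority order maps directly onto Option chaining; the function name is an illustrative assumption.

```rust
// Priority order: checkpoint override, then channel policy fact,
// then the default of 1024.
fn effective_skip_window(checkpoint_override: Option<u32>, policy_value: Option<u32>) -> u32 {
    checkpoint_override.or(policy_value).unwrap_or(1024)
}
```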

6.2 Critical Invariants

Before sending, a device must merge latest facts, reduce channel state, and use updated epoch and generation values. No device may send under stale epochs. A proposal alone never authorizes successor-epoch sends.

For each parent epoch and parent commitment, reduction must expose at most one live successor. If two valid A2 certificates conflict and no fact resolves the conflict, the reducer exposes no live successor. Reducers must not choose a winner by arrival order, local preference, wall-clock time, network connectivity, or hash tie-breaking.
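
The single-live-successor rule for one parent prestate can be sketched as below; the types are stand-ins, and note that a conflict yields no successor rather than a tie-break.

```rust
// Stand-in for an A2 certificate scoped to one parent prestate.
struct Cert {
    transition_id: u64,
    suppressed: bool, // set by abort/conflict/supersession evidence
}

// Expose a successor only when exactly one valid, unsuppressed certificate
// exists; zero or conflicting certificates expose nothing.
fn live_successor(certs: &[Cert]) -> Option<u64> {
    let mut live = certs.iter().filter(|c| !c.suppressed).map(|c| c.transition_id);
    match (live.next(), live.next()) {
        (Some(t), None) => Some(t),
        _ => None, // no certificate, or a conflict: no live successor
    }
}
```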

A3 finalization may make the same transition durable. If A3 evidence conflicts with already-live A2 evidence, the reducer must surface conflict state rather than silently selecting one path. Emergency transition facts remain channel-scoped and do not alter authority-root membership.

6.3 Ratchet Generation Semantics

The ratchet_gen value is not a local counter. It derives from reduced journal state. All devices converge to the same ratchet position after merging facts. Generation advances only when send or receive events occur consistent with checkpoint and dual-window rules. This guarantees deterministic recovery and prevents drift across devices.

7. Three-Level Key Architecture

AMP separates key evolution across three levels. Authority epochs provide identity-level post-compromise security. Channel epochs provide channel-level post-compromise security. Ratchet generations provide bounded forward secrecy within each epoch.

7.1 Authority Epochs

Authority epochs rotate the threshold key shares via DKG ceremony. Rotation invalidates all derived context keys. This is the strongest form of post-compromise security.

Authority epoch rotation occurs approximately daily or upon confirmed compromise. The rotation is independent of messaging activity. All relational contexts must re-derive their shared secrets after rotation.

7.2 Channel Epochs

Channel epochs provide post-compromise security at the channel level. Each epoch uses an independent base key derived from the tree root. Epoch rotation invalidates all keys from the previous epoch. The rotation is atomic. All devices observe the same epoch after merging journal facts.

The epoch transition lifecycle has observed, live, durable, conflict, abort, and supersession states. The stable epoch remains the durable baseline until reduction exposes a single live successor or a single durable successor. This single-live-successor invariant ensures linear AMP traffic for each parent epoch.

7.3 Ratchet Generations

Ratchet generations provide forward secrecy within each epoch. Message keys derive from the channel base key, generation, and direction.

#![allow(unused)]
fn main() {
pub struct AmpHeader {
    pub context: ContextId,
    pub channel: ChannelId,
    pub chan_epoch: u64,
    pub ratchet_gen: u64,
}
}

The AmpHeader contains the routing and ratchet information. The chan_epoch field identifies which base key to use. The ratchet_gen field identifies which generation key to derive. This header becomes the AEAD additional data.

Ratchet generation is not a local counter. It derives from reduced journal state. Devices compute the current generation by examining send and receive events in the reduced context state. All devices converge to the same ratchet position after merging facts.
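
Per-message key derivation from the base key, generation, and direction can be sketched as below; DefaultHasher is only a determinism stand-in for the real KDF, and the names are assumptions.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

#[derive(Hash, Clone, Copy)]
enum Direction {
    Send,
    Receive,
}

// Stand-in sketch: the message key is a pure function of the channel base
// key, the ratchet generation, and the direction.
fn message_key(base_key: u64, ratchet_gen: u64, direction: Direction) -> u64 {
    let mut hasher = DefaultHasher::new();
    (base_key, ratchet_gen, direction).hash(&mut hasher);
    hasher.finish()
}
```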

8. Channel Epoch Bump Lifecycle

8.1 Reducer States

A channel has a stable epoch plus reducer-derived transition state. Observed means proposals exist but do not affect send authority. A2Live means one certified successor is live for AMP traffic. A3Finalized means one successor is durable.

A2Conflict and A3Conflict expose no live successor. Aborted suppresses the affected transition. Superseded records an authorized replacement path and derives live state from the replacement only when the replacement has valid evidence.

8.2 Spacing Rule for Routine Bumps

AMP enforces a spacing rule for routine bumps. Let base_gen be the anchor generation from the canonical checkpoint. Let current_gen be the current ratchet generation. Let W be the skip window.

A routine bump from epoch e to e+1 requires current_gen - base_gen >= W / 2. With default W of 1024, this threshold equals 512 generations. This spacing ensures structural transitions do not occur too frequently.
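
The spacing rule is a one-line predicate; saturating subtraction guards against a stale current_gen below the anchor.

```rust
// Routine bump allowed only after at least W/2 generations of progress
// past the checkpoint anchor.
fn routine_bump_allowed(current_gen: u64, base_gen: u64, skip_window: u64) -> bool {
    current_gen.saturating_sub(base_gen) >= skip_window / 2
}
```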

Emergency transitions bypass this rule. Examples include multiple AEAD failures, conflicting ratchet commitments, unexpected epoch values, and explicit compromise signals.

8.3 State Transitions

In stable state, if the spacing rule is satisfied and no transition is live, a device may insert AmpProposedChannelEpochBump. The channel remains on the stable epoch until an A2 certificate or A3 finalization becomes reducer-visible.

When the reducer observes exactly one valid unsuppressed AmpCertifiedChannelEpochBump, the successor becomes A2Live. New sends cut over to the successor epoch. Receives follow the transition policy for stable-plus-successor overlap.

When the reducer observes valid AmpFinalizedChannelEpochBump evidence for one successor, the successor becomes durable. The device may emit a new checkpoint for the successor epoch. Conflict, abort, and supersession evidence can suppress live use or durable use according to its scope.

8.4 Complete Lifecycle Sequence

The lifecycle has four common phases: channel creation, normal messaging, routine transition, and emergency transition.

Channel creation establishes the initial checkpoint at epoch 0.

sequenceDiagram
    participant A as Device A
    participant J as Journal
    participant B as Device B
    A->>J: ChannelCheckpoint(epoch=0, base_gen=0, W=1024)
    J-->>B: Sync checkpoint fact
    B->>B: Reduce state to epoch 0, gen 0

This sequence records the initial checkpoint. The checkpoint anchors recovery and the first dual ratchet window.

sequenceDiagram
    participant A as Device A
    participant B as Device B
    loop Messages within Window A [0..1024]
        A->>A: Merge facts, reduce state
        A->>A: Derive key(epoch=0, gen=N)
        A->>B: AmpHeader(epoch=0, gen=N) + ciphertext
        B->>B: Validate gen in [0..2048]
        B->>B: Derive key, decrypt
        B->>B: Advance receive ratchet
    end

Normal messaging advances the ratchet generation within dual windows.

sequenceDiagram
    participant A as Device A
    participant J as Journal
    participant B as Device B
    A->>A: Check: current_gen - base_gen >= W/2
    A->>J: AmpProposedChannelEpochBump(0 to 1, policy=NormalTransition)
    J-->>B: Sync proposal fact
    B->>B: Reduce state to Observed
    A->>J: AmpCertifiedChannelEpochBump(transition_id=T)
    J-->>B: Sync certificate fact
    A->>A: Reduce state to A2Live(T)
    B->>B: Reduce state to A2Live(T)
    rect rgb(240, 240, 255)
        Note over A,B: Transition policy controls old-epoch receive overlap
        A->>B: Messages use epoch 1
        B->>B: Accept stable or live successor epoch by policy
    end
    A->>J: AmpFinalizedChannelEpochBump(transition_id=T)
    J-->>B: Sync finalization fact
    A->>A: Reduce state to A3Finalized(T)
    B->>B: Reduce state to A3Finalized(T)
    A->>J: ChannelCheckpoint(epoch=1, base_gen=1024)

This sequence shows the key distinction between live and durable state. The A2 certificate can drive traffic before A3 finalization. A3 finalization later commits the same transition as durable.

Conflicting certificates suppress live use until facts prove a single winner.

sequenceDiagram
    participant A as Device A
    participant J as Journal
    participant B as Device B
    A->>J: AmpCertifiedChannelEpochBump(transition_id=T1)
    B->>J: AmpCertifiedChannelEpochBump(transition_id=T2)
    J-->>A: Sync conflicting certificates
    J-->>B: Sync conflicting certificates
    A->>A: Reduce state to A2Conflict
    B->>B: Reduce state to A2Conflict
    A->>A: Keep sends on stable epoch
    B->>B: Keep sends on stable epoch

This sequence prevents silent fork selection. The reducer must not choose between conflicting valid certificates without explicit abort, conflict, supersession, or finalization evidence.

Emergency transitions bypass spacing for post-compromise containment.

sequenceDiagram
    participant B as Device B
    participant J as Journal
    participant A as Device A
    B->>B: Detect AEAD failures > threshold
    B->>J: AmpEmergencyAlarm(suspect=S)
    B->>J: AmpProposedChannelEpochBump(1 to 2, policy=EmergencyQuarantineTransition)
    Note right of B: Bypasses spacing rule
    B->>J: AmpCertifiedChannelEpochBump(transition_id=E)
    J-->>A: Sync emergency certificate
    A->>A: Reduce state to A2Live(E)
    B->>B: Reduce state to A2Live(E)
    A->>A: Exclude suspect from successor send policy
    B->>B: Erase old material by emergency policy

Emergency quarantine excludes the suspect from the successor epoch once A2 evidence is live. Emergency cryptoshred also destroys ordinary pre-emergency readable state at the A2 live boundary. A3 finalization confirms durability but does not delay the emergency cutover.

9. Ratchet Windows

AMP uses an always-dual window model. Every checkpoint defines two consecutive skip windows providing a continuous valid range of 2W generations.

9.1 Window Layout

Given base generation G and skip window W, two windows are defined. Window A spans G to G+W. Window B spans G+W+1 to G+2W. The valid generation set is the union of both windows.

#![allow(unused)]
fn main() {
// Window layout for a checkpoint at base generation G with skip window W.
let (g, w): (u64, u64) = (0, 1024);
let window_a = g..=(g + w);               // Window A: [G .. G+W]
let window_b = (g + w + 1)..=(g + 2 * w); // Window B: [G+W+1 .. G+2W]
// A generation is valid if it falls in either window.
let is_valid = |gen: u64| window_a.contains(&gen) || window_b.contains(&gen);
}

Window A [G .. G+W] → Window B [G+W+1 .. G+2W]

This design eliminates boundary issues. No mode switches are required. The implementation remains simple with robust asynchronous tolerance.

9.2 Window Shifting

When a new checkpoint is issued, the new base generation is chosen far enough ahead per the spacing rule. The dual-window layout guarantees overlap with prior windows. Garbage collection can safely prune older checkpoints when they no longer affect valid generation ranges.

9.3 Asynchronous Delivery

The dual window design solves the asynchronous delivery problem. Messages may arrive out of order by up to W generations. During additive and ordinary non-removal transitions, messages may be accepted under either the stable epoch or the single reducer-exposed live successor epoch.

Message derivation uses a KDF chain similar to Signal's construction but with key differences. AMP derives all ratchet state deterministically from replicated journal facts rather than device-local databases. This enables complete recovery across multiple devices without coordination.
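The determinism claim can be illustrated with a toy derivation function. This stand-in is NOT cryptographic (a real implementation would use an HKDF-style construction); it only shows that each message key is a pure function of replicated inputs, so any device holding the journal facts rederives identical keys.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy stand-in for the message-key KDF. NOT cryptographic; it only
/// demonstrates that key derivation depends solely on the base key,
/// epoch, and generation, with no device-local state.
pub fn derive_message_key(base_key: &[u8; 32], epoch: u64, gen: u64) -> u64 {
    let mut h = DefaultHasher::new();
    base_key.hash(&mut h);
    epoch.hash(&mut h);
    gen.hash(&mut h);
    h.finish()
}
```

Two devices with the same journal facts compute the same key for (epoch, gen), and distinct generations yield distinct keys.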

Subtractive, revocation, quarantine, and cryptoshred transitions use stricter old-epoch handling. Removed or suspected authorities must not retain indefinite send authority while the network is slow. See Transport and Information Flow for the data-plane acceptance contract.

10. Sending Messages

10.1 Message Header

#![allow(unused)]
fn main() {
pub struct AmpHeader {
    pub context: ContextId,
    pub channel: ChannelId,
    pub chan_epoch: u64,
    pub ratchet_gen: u64,
}
}

The header contains the context identifier, channel identifier, current epoch, and current generation. These fields form the additional authenticated data for AEAD encryption.
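A fixed-width, order-stable encoding of the header fields could serve as the AAD. The wire encoding below is an assumption for illustration, with ContextId and ChannelId modeled as 32-byte arrays.

```rust
pub struct AmpHeader {
    pub context: [u8; 32],  // ContextId modeled as raw bytes here
    pub channel: [u8; 32],  // ChannelId modeled as raw bytes here
    pub chan_epoch: u64,
    pub ratchet_gen: u64,
}

/// Illustrative AAD encoding: a deterministic concatenation of header
/// fields. The actual wire format is not specified by this document.
pub fn header_aad(h: &AmpHeader) -> Vec<u8> {
    let mut aad = Vec::with_capacity(80);
    aad.extend_from_slice(&h.context);
    aad.extend_from_slice(&h.channel);
    aad.extend_from_slice(&h.chan_epoch.to_be_bytes());
    aad.extend_from_slice(&h.ratchet_gen.to_be_bytes());
    aad
}
```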

10.2 Reduce-Before-Send Rule

Before sending a message, a device must merge the latest journal facts. It must reduce the channel state to get the stable epoch, live successor, and generation. It must verify that the generation is within the valid window. Only then can it derive the message key and encrypt.

10.3 Send Procedure

Before sending, a device merges new facts and reduces channel state. It asserts that the current generation is within the valid generation set. It chooses the stable epoch unless reduction exposes exactly one A2Live or A3Finalized successor.

If the spacing rule is satisfied and no transition is live, the device may propose a new transition. The proposal is not send authority. The device derives the message key using a KDF with the channel base key, generation, and direction.

The device creates an AmpHeader, encrypts the payload with AEAD using the message key and header as additional data, and evaluates the guard chain. Guard evaluation runs over a prepared GuardSnapshot and emits EffectCommand items for async interpretation. The device then advances the local ratchet generation.

Emergency policies add sender membership checks at the send boundary. A suspect excluded by a live emergency successor cannot keep sending under the old epoch through local ratchet state.
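The epoch and generation choices at send time can be sketched as follows. The view struct and field names are assumptions; note that the dual windows [G..G+W] and [G+W+1..G+2W] form one contiguous range, so the generation check reduces to a single bound.

```rust
/// Reduced channel view relevant at send time (field names assumed).
pub struct ReducedChannelView {
    pub stable_epoch: u64,
    /// At most one live (A2) or finalized (A3) successor, by reducer invariant.
    pub successor_epoch: Option<u64>,
}

/// Sends target the stable epoch unless reduction exposes a successor.
pub fn send_epoch(view: &ReducedChannelView) -> u64 {
    view.successor_epoch.unwrap_or(view.stable_epoch)
}

/// Membership in the valid generation set for a checkpoint at base_gen.
pub fn gen_in_valid_set(gen: u64, base_gen: u64, w: u64) -> bool {
    gen >= base_gen && gen <= base_gen + 2 * w
}
```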

11. Receiving Messages

When receiving, a device merges new facts and reduces channel state. It checks that the message epoch is the stable epoch or the single reducer-exposed successor epoch. It checks that the sender belongs to the membership commitment for that exact epoch. It checks that the generation is within the valid generation set.

The device rederives the message key using the same KDF parameters. It decrypts the payload using AEAD with the message key and header. It advances the local receive ratchet.

Messages outside valid windows or with unsupported epochs are rejected. Subtractive and cryptoshred transitions reject old-epoch traffic more aggressively than additive transitions. Quarantine transitions allow only minimal old-epoch grace when the transition policy explicitly permits it.
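The receive-side gates above can be combined into a single acceptance check. This is an illustrative sketch: the membership check is modeled as a precomputed boolean, whereas the real check consults the epoch's membership commitment, and transition policies may further restrict old-epoch acceptance.

```rust
/// Illustrative receive gate: epoch, membership, and generation checks.
pub fn accept_message(
    msg_epoch: u64,
    msg_gen: u64,
    stable_epoch: u64,
    successor_epoch: Option<u64>,
    sender_in_epoch: bool,
    base_gen: u64,
    w: u64,
) -> bool {
    // Stable epoch or the single reducer-exposed successor only.
    let epoch_ok = msg_epoch == stable_epoch || Some(msg_epoch) == successor_epoch;
    // Generation must lie in the contiguous dual-window range.
    let gen_ok = msg_gen >= base_gen && msg_gen <= base_gen + 2 * w;
    epoch_ok && sender_in_epoch && gen_ok
}
```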

12. Guard Chain Integration

AMP integrates with Aura's guard chain for authorization and flow control. Every message send passes through CapabilityGuard, FlowBudgetGuard, JournalCouplingGuard, and LeakageTrackingGuard before reaching the transport.

Guard evaluation runs over a prepared GuardSnapshot and produces EffectCommand items. An async interpreter executes those commands only after the entire guard chain succeeds. Unauthorized or over-budget sends never touch the network.

#![allow(unused)]
fn main() {
let outcome = guard_chain.evaluate(&snapshot, &request);

if !outcome.is_authorized() {
    return Err(AuraError::authorization_failed("Guard denied"));
}

for cmd in outcome.effects {
    interpreter.execute(cmd).await?;
}
}

The snapshot captures the current capability frontier and flow budget state. Each guard emits EffectCommand items instead of performing I/O. The interpreter executes the resulting commands in production or simulation. Any failure returns locally without observable effects.

12.1 Flow Budgets

Flow budgets are replicated as spent counters in the journal. The spent counter for a context and peer pair is a monotone fact. The limit is computed at runtime from Biscuit capabilities and sovereign policy. Before sending, the flow budget guard checks that spent + cost <= limit.

12.2 Receipts

Receipts provide accountability for multi-hop forwarding. Each relay hop produces a receipt containing the context, source, destination, epoch, cost, and signature. The receipt proves that the relay charged its budget before forwarding.

13. Recovery and Garbage Collection

13.1 Recovery

To recover, a device loads the relational context journal. It reduces to the latest stable epoch, any reducer-exposed live successor, the latest checkpoint, and the skip window. It rederives the channel base key from context root key, channel identifier, and selected epoch. It rederives ratchet state from the checkpoint and window generations.

Load journal facts → reduce to TreeState → reduce to ChannelEpochState → compute checkpoint → derive base key → ready to message

The recovery process requires no coordination. The device does not need to contact other participants. It does not need to request missing state. It only needs access to the journal facts. Once reduction completes, the device has the same view as all other participants.

13.2 Garbage Collection

AMP garbage collection maintains protocol safety while reclaiming storage. GC operates on checkpoints, transition proposals, certificates, finalizations, aborts, conflicts, supersessions, and emergency alarms.

A checkpoint at generation G with window W can be pruned when a newer checkpoint exists at G' where G' exceeds G+2W. The newer checkpoint coverage must not overlap with the old checkpoint coverage. All messages within the old window must have been processed or lie beyond the recovery horizon.

A transition proposal can be pruned when finalization, abort, or supersession evidence makes it stale. A2 certificates and A3 finalizations are retained as evidence until full snapshot compaction. Conflict evidence must remain available while it can affect live successor exposure.

13.3 Pruning Boundary

The safe pruning boundary for checkpoints is computed as the maximum checkpoint generation minus 2W minus a safety margin. The recommended safety margin is W/2, which equals 512 generations with default settings.
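The boundary computation can be sketched directly from that formula; saturating subtraction keeps the boundary at zero early in a channel's life. The function name is an assumption.

```rust
/// Safe pruning boundary: max checkpoint generation minus 2W minus the
/// recommended W/2 safety margin, saturating at zero.
pub fn prune_boundary(max_checkpoint_gen: u64, w: u64) -> u64 {
    max_checkpoint_gen.saturating_sub(2 * w + w / 2)
}
```

With the default W of 1024 the total reserve is 2560 generations, so a channel whose newest checkpoint sits at generation 10,000 may prune checkpoints below generation 7,440.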

GC triggers when journal size exceeds threshold, checkpoint count exceeds maximum, manual compaction is requested, or snapshot creation initiates. See Distributed Maintenance Architecture for integration with the Aura snapshot system.

14. Security Properties

14.1 Forward Secrecy

Forward secrecy is bounded by the skip window size of 1024 within each epoch. Within a single epoch, an attacker who compromises a device learns at most W future message keys. Older epochs use independent base keys. Compromise of epoch e reveals nothing about epoch e-1.

14.2 Post-Compromise Security

Post-compromise security operates at two levels. Channel epoch bumps heal compromise of channel secret state. Context epoch bumps provide stronger PCS by healing compromise of context state. Channel bumps are frequent and cheap. Context bumps are rare and expensive.

The dual window mechanism provides continuous post-compromise healing. When an emergency transition becomes A2 live, the new epoch uses a fresh base key. Messages sent under the new epoch cannot be decrypted with the old base key. Emergency cryptoshred destroys ordinary pre-emergency readable state according to local retention policy.

Emergency transitions protect future traffic and reduce future readable remnants on honest devices. They do not retroactively protect data already seen, decrypted, or exfiltrated before cutover.

14.3 No Head-of-Line Blocking

The protocol accepts out-of-order messages up to W generations. Dual windows absorb boundary drift.

14.4 No Ratchet Forks

Deterministic reduction exposes at most one live successor for each parent epoch. A3 consensus finalizes durability. Conflicting valid A2 certificates suppress live exposure rather than creating two ratchet branches.

14.5 Deterministic Recovery

Checkpoints anchor the ratchet. Facts represent all shared state. Keys are always rederived deterministically. This property enables true multi-device messaging. All devices in an authority see the same messages. All devices can send and receive without coordinating who is currently active. The authority appears as a single messaging entity to external parties.

15. Failure Modes

AMP drops messages only when they fall outside the cryptographically safe envelope. These failures are intentional safeguards.

15.1 Generation Out of Window

A message is dropped if its generation lies outside the valid generation set. Causes include stale sender state, aged messages post-GC, or invalid generations from attackers.

15.2 Epoch Mismatch

Messages are rejected when the header epoch is inconsistent with reduced epoch state. This includes epochs that are too old, non-linear epochs, or replays from retired epochs.

15.3 AEAD Failure

If AEAD decryption fails, messages are dropped. Repeated failures contribute to suspicious-event classification and may trigger AmpEmergencyAlarm and emergency transition proposal facts.

15.4 Beyond Recovery Horizon

Messages older than the current checkpoint window cannot be decrypted because older checkpoints have been garbage collected. This is an intentional forward secrecy tradeoff.

15.5 Policy-Enforced Invalidity

After context epoch bumps or high-severity PCS events, messages from previous context epochs are intentionally dropped. After subtractive, quarantine, or cryptoshred transitions, messages from old channel epochs may also be rejected by transition policy.

16. Message Delivery Tracking

AMP integrates with Aura's consistency metadata system for delivery tracking. The transport layer supports opt-in acknowledgment tracking through FactOptions. Peers send FactAck responses upon successful processing. Acks are stored in the journal's ack table and garbage-collected when the delivery policy determines they are no longer needed.

Read receipts are distinct from delivery acknowledgments. Delivery indicates the message was received and decrypted. Read indicates the user viewed the message. Read receipts are opt-in per contact and emit a ChatFact::MessageRead fact.

Delivery acknowledgments leak timing metadata. Applications should batch acknowledgments to reduce timing precision. Read receipts should default to disabled. High-sensitivity contexts can disable acks entirely. See Operation Categories for consistency type definitions and User Interface for status indicators.

See Also

Rendezvous Architecture

This document describes the rendezvous architecture in Aura. It explains peer discovery, descriptor propagation, service-surface advertisement, connectivity selection, channel establishment, and relay-to-direct holepunch upgrades. It aligns with the authority and context model. It scopes all rendezvous behavior to relational contexts.

1. Overview

Rendezvous establishes secure channels between authorities. The RendezvousService exposes prepare_publish_descriptor() and prepare_establish_channel() methods. The service returns guard outcomes that the caller executes through an effect interpreter. Rendezvous operates inside a relational context and uses the context key for encryption. Descriptors appear as facts in the context journal. Propagation uses journal synchronization (aura-sync), not custom flooding.

Rendezvous owns descriptor semantics, publication, validation, and establish bootstrap. It does not own the long-lived mutable descriptor cache. The runtime in aura-agent owns that cache and passes descriptor snapshots into peer-discovery views.

Rendezvous does not establish global identity. All operations are scoped to a ContextId. A context defines which authorities may see descriptors. Only participating authorities have the keys required to decrypt descriptor payloads.

Rendezvous descriptor exchange can run in provisional (A1) or soft-safe (A2) modes to enable rapid connectivity under poor network conditions, but durable channel epochs and membership changes must be finalized via consensus (A3). Soft-safe flows should emit convergence and reversion facts so participants can reason about reversion risk while channels are warming.

Bootstrap discovery is a separate concern. First-run startup may surface same-machine or same-LAN bootstrap candidates before any enrollment or acceptance has completed. Those candidates are not ordinary rendezvous peers. They must not inflate ordinary peer counts or become context-scoped descriptor participants before enrollment or acceptance completes.

Browser runtimes cannot join the native UDP LAN path. Browser-involved startup uses a bootstrap broker path that publishes ephemeral bootstrap descriptors and then hands control back to the existing invitation or device-enrollment flow.

1.1 Secure-Channel Lifecycle (A1/A2/A3)

  • A1: Provisional Descriptor / handshake facts allow immediate connectivity.
  • A2: Coordinator soft-safe convergence certs indicate bounded divergence.
  • A3: ChannelEstablished is finalized by consensus and accompanied by a CommitFact.

Rendezvous must treat A1/A2 outputs as provisional until A3 evidence is merged. See Consensus for commit evidence binding.

2. Architecture

The rendezvous crate follows Aura's fact-based architecture:

  1. Guard Chain First: All network sends flow through guard evaluation before execution
  2. Facts Not Flooding: Descriptors are journal facts propagated via aura-sync, not custom flooding
  3. Standard Receipts: Uses the system Receipt type with epoch binding and cost tracking
  4. Session-Typed Protocol: Protocols expressed as MPST choreographies with guard annotations
  5. Unified Transport: Channels established via SecureChannel with Noise IKpsk2

2.1 Module Structure

aura-rendezvous/
├── src/
│   ├── lib.rs           # Public exports
│   ├── facts.rs         # RendezvousFact domain fact type
│   ├── protocol.rs      # MPST choreography definition
│   ├── service.rs       # RendezvousService (main coordinator)
│   ├── descriptor.rs    # Transport selector and builder
│   └── new_channel.rs   # SecureChannel, ChannelManager, Handshaker

3. Connectivity and Service Advertisement

Rendezvous descriptors carry two kinds of information. One surface describes concrete connectivity endpoints. The other surface describes abstract service families such as Establish, Move, and Hold. These surfaces must remain separate.

Connectivity endpoints describe how a peer may be reached. Service advertisements describe what the peer is willing to provide. Runtime policy combines both surfaces with local permit state, health, and trust evidence. Descriptor publication itself does not commit the final route choice.

The implementation still derives split connectivity and service-surface views from legacy TransportHint compatibility data in some descriptor and invitation paths. That compatibility bridge is quarantined. Owner: adaptive_privacy_runtime. Removal condition: descriptor publication and invitation bootstrap no longer need the legacy hint bridge. New code should consume LinkEndpoint, ServiceDescriptor, EstablishPath, MovePath, and HoldDescriptor views rather than treating transport hints as final routing policy.

3.1 Bootstrap Discovery Plane

Aura uses two distinct discovery planes during startup:

  1. Native rendezvous/LAN discovery for native-to-native startup on the same LAN.
  2. Broker-backed bootstrap discovery for any startup shape that involves the browser, including same-machine TUI to Web and same-LAN TUI to Web or Web to Web.

The bootstrap broker publishes ephemeral bootstrap descriptors only. It exists to help first-run instances discover one another and exchange invitation or device-enrollment material through the existing enrollment path. It does not create an ordinary rendezvous relationship on its own.

Bootstrap broker URLs carry only non-secret endpoint identifiers. Bearer credentials and invitation retrieval tokens travel in headers or another explicit credential channel, never in URL query strings, so logs, histories, referrers, and diagnostics do not capture secret material.

Bootstrap and stale-node re-entry remain distributed surfaces. Aura does not define a singleton bootstrap registry. Runtime code may combine remembered direct contacts, neighborhood discovery-board publications, bounded bootstrap introductions, and broker-backed ephemeral startup descriptors. aura-rendezvous owns schema validation for bootstrap contact hints and neighborhood re-entry hints, but it does not own final route choice or a global bootstrap cache.

Operationally, browser and native first-run startup may become bootstrap-visible only after the first native account runtime exists and begins hosting the broker automatically. For example, in a TUI plus Web startup, the user may create the web account first and the TUI account second. Once the TUI runtime is live, both instances should surface one another as bootstrap candidates without extra user setup steps.

3.2 Holepunching and Upgrade Policy

Aura uses a relay-first, direct-upgrade model for NAT traversal:

  1. Start on relay as soon as both peers have a valid descriptor path.
  2. Exchange direct/reflexive candidates from descriptor facts.
  3. Launch bounded direct upgrade attempts (holepunch) in the background.
  4. Promote to direct when a recoverable direct path succeeds. Otherwise remain on relay.

Retry state is tracked with typed generations (CandidateGeneration, NetworkGeneration) and bounded backoff (AttemptBudget, BackoffWindow) in PeerConnectionActor. Generation changes reset retry budgets, which avoids stale retry loops after interface/NAT changes.

Recoverability is evaluated from local binding/interface provenance, not from reflexive addresses alone. This prevents treating stale external mappings as viable direct paths.

Operationally:

  • Relay path is the safety baseline.
  • Direct holepunch is an optimization path.
  • Network changes can trigger a fresh upgrade cycle without dropping relay connectivity.
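The generation-gated retry accounting can be sketched as follows. The type name AttemptBudget follows the text, but its fields and methods are assumptions about PeerConnectionActor's internals.

```rust
/// Sketch of generation-gated retry accounting: a change in candidate or
/// network generation resets the attempt budget, so direct-upgrade
/// retries never burn budget against stale NAT state.
pub struct AttemptBudget {
    generation: u64,
    attempts_left: u32,
    max_attempts: u32,
}

impl AttemptBudget {
    pub fn new(max_attempts: u32) -> Self {
        Self { generation: 0, attempts_left: max_attempts, max_attempts }
    }

    /// Returns true if a holepunch attempt may be launched now.
    pub fn try_attempt(&mut self, current_generation: u64) -> bool {
        if current_generation != self.generation {
            // Interface/NAT change: fresh candidates, fresh budget.
            self.generation = current_generation;
            self.attempts_left = self.max_attempts;
        }
        if self.attempts_left == 0 {
            return false;
        }
        self.attempts_left -= 1;
        true
    }
}
```

Relay connectivity is never gated on this budget; only direct-upgrade attempts consume it.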

4. Data Structures

4.1 Domain Facts

Rendezvous uses domain facts in the relational context journal. Facts are propagated via journal synchronization.

#![allow(unused)]
fn main() {
/// Rendezvous domain facts stored in context journals
pub enum RendezvousFact {
    /// Transport descriptor advertisement
    Descriptor(RendezvousDescriptor),

    /// Channel established acknowledgment
    ChannelEstablished {
        initiator: AuthorityId,
        responder: AuthorityId,
        channel_id: [u8; 32],
        epoch: u64,
    },

    /// Descriptor revocation
    DescriptorRevoked {
        authority_id: AuthorityId,
        nonce: [u8; 32],
    },
}
}

4.2 Rendezvous Descriptors

#![allow(unused)]
fn main() {
/// Rendezvous descriptor for peer discovery
pub struct RendezvousDescriptor {
    /// Authority publishing this descriptor
    pub authority_id: AuthorityId,
    /// Context this descriptor is for
    pub context_id: ContextId,
    /// Legacy connectivity hints used to derive split views
    pub transport_hints: Vec<TransportHint>,
    /// Handshake PSK commitment (hash of PSK derived from context)
    pub handshake_psk_commitment: [u8; 32],
    /// Public key for Noise IK handshake (Ed25519 public key)
    pub public_key: [u8; 32],
    /// Validity window start (ms since epoch)
    pub valid_from: u64,
    /// Validity window end (ms since epoch)
    pub valid_until: u64,
    /// Nonce for uniqueness
    pub nonce: [u8; 32],
    /// What the peer wants to be called (optional, for UI purposes)
    pub nickname_suggestion: Option<String>,
}
}

RendezvousDescriptor is the authoritative shared object in the journal. Callers derive LinkEndpoint and ServiceDescriptor views from it. This keeps the fact schema stable during migration while preventing route policy from hardening into the fact format.

Bootstrap records follow the same rule. Shared bootstrap contact hints and neighborhood re-entry hints are typed, expiring, replay-bounded records. They are valid only inside their scoped contexts. They do not encode final route policy, trust tier, or canonical provider ranking.
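Consumers of a descriptor should check its validity window before deriving any views from it. A minimal sketch, mirroring the valid_from/valid_until fields above (ms since epoch, end-exclusive by assumption):

```rust
/// Illustrative validity-window check for a RendezvousDescriptor.
/// Assumes an end-exclusive window; the spec does not pin this down.
pub fn descriptor_valid_at(valid_from: u64, valid_until: u64, now_ms: u64) -> bool {
    now_ms >= valid_from && now_ms < valid_until
}
```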

4.3 Split Connectivity and Service Surfaces

#![allow(unused)]
fn main() {
pub struct LinkEndpoint {
    pub protocol: LinkProtocol,
    pub address: Option<String>,
    pub relay_authority: Option<AuthorityId>,
}

pub struct ServiceDescriptor {
    pub header: ServiceDescriptorHeader,
    pub kind: ServiceDescriptorKind,
}

pub struct EstablishPath {
    pub route: Route,
}

pub struct MovePath {
    pub route: Route,
}
}

LinkEndpoint answers how a peer may be reached. ServiceDescriptor answers what service family is being advertised. Runtime-owned selection state combines these views with local policy and social inputs. The selected provider should observe only the generic service action, not the social reason it was chosen.

EstablishPath and MovePath are the explicit path objects consumed by current bootstrap and movement flows. They are derived from descriptor views plus runtime-local policy, but they remain socially neutral: home, neighborhood, guardian, friend, and similar provider roles must not appear in the path schema itself.

Anonymous path establishment uses the same Establish family boundary. aura-rendezvous validates path objects and descriptor inputs. aura-agent owns reusable anonymous path lifecycle, route choice, and path reuse policy.

5. MPST Choreographies

Rendezvous protocols are defined as MPST choreographies with guard annotations.

5.1 Direct Exchange Protocol

#![allow(unused)]
fn main() {
tell! {
    #[namespace = "rendezvous_exchange"]
    protocol RendezvousExchange {
        roles: Initiator, Responder;

        // Initiator publishes descriptor (fact insertion, propagates via sync)
        Initiator[guard_capability = "rendezvous:publish",
                  journal_facts = "descriptor_offered"]
        -> Responder: DescriptorOffer(RendezvousDescriptor);

        // Responder publishes response descriptor
        Responder[guard_capability = "rendezvous:publish",
                  journal_facts = "descriptor_answered"]
        -> Initiator: DescriptorAnswer(RendezvousDescriptor);

        // Direct channel establishment
        Initiator[guard_capability = "rendezvous:connect"]
        -> Responder: HandshakeInit(NoiseHandshake);

        Responder[guard_capability = "rendezvous:connect",
                  journal_facts = "channel_established"]
        -> Initiator: HandshakeComplete(NoiseHandshake);
    }
}
}

5.2 Relayed Protocol

#![allow(unused)]
fn main() {
tell! {
    #[namespace = "relayed_rendezvous"]
    protocol RelayedRendezvous {
        roles: Initiator, Relay, Responder;

        Initiator[guard_capability = "rendezvous:relay"]
        -> Relay: RelayRequest(RelayEnvelope);

        Relay[guard_capability = "relay:forward",
              leak = "neighbor:1"]
        -> Responder: RelayForward(RelayEnvelope);

        Responder[guard_capability = "rendezvous:relay"]
        -> Relay: RelayResponse(RelayEnvelope);

        Relay[guard_capability = "relay:forward",
              leak = "neighbor:1"]
        -> Initiator: RelayComplete(RelayEnvelope);
    }
}
}

6. Descriptor Propagation

Descriptors propagate via journal synchronization. This replaces custom flooding.

  1. Authority creates a RendezvousFact::Descriptor fact
  2. Guard chain evaluates the publication request
  3. On success, fact is inserted into the context journal
  4. Journal sync (aura-sync) propagates facts to context participants
  5. Peers query journal for peer descriptors

This model provides:

  • Deduplication: Journal sync handles duplicate facts naturally
  • Ordering: Facts have causal ordering via journal timestamps
  • Authorization: Guard chain validates before insertion
  • Consistency: Same propagation mechanism as other domain facts

6.1 aura-sync Integration

The aura-sync crate provides a RendezvousAdapter that bridges peer discovery with runtime-owned descriptor snapshots:

#![allow(unused)]
fn main() {
use aura_sync::infrastructure::RendezvousAdapter;

// Create adapter using the local authority identity.
let adapter = RendezvousAdapter::new(local_authority);

// Query peer info from a runtime-owned descriptor snapshot.
if let Some(peer_info) = adapter.get_peer_info(&descriptors, context_id, peer, now_ms) {
    if peer_info.has_direct_transport() {
        // Use direct connection
    }
}

// Check which peers need descriptor refresh
let stale_peers = adapter.peers_needing_refresh(&descriptors, context_id, now_ms);
}

The adapter is a pure view helper. It does not own or mutate the cache. The runtime cache stays in aura-agent.

6.2 Social Inputs and Route Selection

Rendezvous may consume socially rooted provider inputs, but it does not own social topology or trust evaluation. The Neighborhood Plane and Web of Trust Plane produce permit and candidate inputs. The runtime combines those inputs with descriptor views and local policy.

This separation is required for privacy. Shared descriptor facts must not expose route classes such as "friend relay" or "neighborhood hold". The selected provider should observe only the generic service action. The current descriptor view therefore advertises generic Hold surfaces for deferred-delivery and cache-replica retention, while selector issuance, holder rotation, and retrieval policy remain runtime-local. See Social Architecture for the plane split.

7. Protocol Flow

The rendezvous sequence uses the context between two authorities.

sequenceDiagram
    participant A as Authority A
    participant J as Context Journal
    participant B as Authority B

    A->>A: Build descriptor
    A->>A: Evaluate guards
    A->>J: Insert descriptor fact
    J-->>B: Sync descriptor fact
    B->>B: Query descriptor
    B->>A: Select transport
    A->>B: Noise IKpsk2 handshake
    B->>J: Record ChannelEstablished fact

8. Guard Chain Integration

All rendezvous operations flow through the guard chain.

8.1 Guard Capabilities

#![allow(unused)]
fn main() {
pub mod guards {
    pub const CAP_RENDEZVOUS_PUBLISH: &str = "rendezvous:publish";
    pub const CAP_RENDEZVOUS_CONNECT: &str = "rendezvous:connect";
    pub const CAP_RENDEZVOUS_RELAY: &str = "rendezvous:relay";
}
}

8.2 Flow Costs

#![allow(unused)]
fn main() {
pub const DESCRIPTOR_PUBLISH_COST: u32 = 1;
pub const CONNECT_DIRECT_COST: u32 = 2;
pub const CONNECT_RELAY_COST: u32 = 3;
pub const RELAY_FORWARD_COST: u32 = 1;
}

8.3 Guard Evaluation

The service prepares operations and returns GuardOutcome containing effect commands. The caller executes these commands.

#![allow(unused)]
fn main() {
// 1. Prepare snapshot of current state
let snapshot = GuardSnapshot {
    authority_id: alice,
    context_id: context,
    flow_budget_remaining: 100,
    capabilities: vec!["rendezvous:publish".into()],
    epoch: 1,
};

// 2. Prepare publication (pure, sync)
let outcome = service.prepare_publish_descriptor(
    &snapshot, context, transport_hints, now_ms
);

// 3. Check decision and execute effects
if outcome.decision.is_allowed() {
    for cmd in outcome.effects {
        execute_effect_command(cmd).await?;
    }
}
}

9. Secure Channel Establishment

After receiving a valid descriptor, the initiator selects a transport. Both sides run Noise IKpsk2 using a context-derived PSK. Successful handshake yields a SecureChannel.

9.1 Channel Structure

#![allow(unused)]
fn main() {
pub struct SecureChannel {
    /// Unique channel identifier
    channel_id: [u8; 32],
    /// Context this channel belongs to
    context_id: ContextId,
    /// Local authority
    local: AuthorityId,
    /// Remote peer
    remote: AuthorityId,
    /// Current epoch (for key rotation)
    epoch: u64,
    /// Channel state
    state: ChannelState,
    /// Agreement mode (A1/A2/A3) for the channel lifecycle
    agreement_mode: AgreementMode,
    /// Whether reversion is still possible
    reversion_risk: bool,
    /// Whether the channel needs key rotation
    needs_rotation: bool,
    /// Bytes sent on this channel (for flow budget tracking)
    bytes_sent: u64,
    /// Bytes received on this channel
    bytes_received: u64,
}

pub enum ChannelState {
    Establishing,
    Active,
    Rotating,
    Closed,
    Error(String),
}
}

9.2 Channel Manager

The ChannelManager tracks active channels:

#![allow(unused)]
fn main() {
let mut manager = ChannelManager::new();

// Register a new channel
manager.register(channel);

// Find channel by context and peer
if let Some(ch) = manager.find_by_context_peer(context, peer) {
    if ch.is_active() {
        // Use channel
    }
}

// Advance epoch and mark channels for rotation
manager.advance_epoch(new_epoch);
}

9.3 Handshake Flow

The Handshaker state machine handles Noise IKpsk2:

#![allow(unused)]
fn main() {
// Initiator side
let mut initiator = Handshaker::new(HandshakeConfig {
    local: alice,
    remote: bob,
    context_id: context,
    psk: derived_psk,
    timeout_ms: 5000,
});

let init_msg = initiator.create_init_message(epoch)?;
// ... send init_msg to responder ...
initiator.process_response(&response_msg)?;
let result = initiator.complete(epoch, true)?;
let channel = initiator.build_channel(&result)?;
}

9.4 Key Rotation

Channels support epoch-based key rotation. When the epoch advances, channels rekey using the new context-derived PSK.

#![allow(unused)]
fn main() {
impl SecureChannel {
    pub fn needs_epoch_rotation(&self, current_epoch: u64) -> bool {
        self.epoch < current_epoch
    }

    pub fn rotate(&mut self, new_epoch: u64) -> AuraResult<()> {
        // Rekey the channel: in the full implementation, a fresh
        // context-derived PSK for `new_epoch` is established before
        // the channel returns to the active state.
        self.state = ChannelState::Rotating;
        self.epoch = new_epoch;
        self.needs_rotation = false;
        self.state = ChannelState::Active;
        Ok(())
    }
}
}
}

10. Service Interface

The rendezvous service coordinates descriptor publication and channel establishment.

#![allow(unused)]
fn main() {
impl RendezvousService {
    /// Create a new rendezvous service
    pub fn new(authority_id: AuthorityId, config: RendezvousConfig) -> Self;

    /// Prepare to publish descriptor to context journal
    pub fn prepare_publish_descriptor(
        &self,
        snapshot: &GuardSnapshot,
        context_id: ContextId,
        transport_hints: Vec<TransportHint>,
        now_ms: u64,
    ) -> GuardOutcome;

    /// Prepare to establish channel with peer
    pub fn prepare_establish_channel(
        &self,
        snapshot: &GuardSnapshot,
        context_id: ContextId,
        peer: AuthorityId,
        psk: &[u8; 32],
    ) -> AuraResult<GuardOutcome>;

    /// Prepare to handle incoming handshake
    pub fn prepare_handle_handshake(
        &self,
        snapshot: &GuardSnapshot,
        context_id: ContextId,
        initiator: AuthorityId,
        handshake: NoiseHandshake,
        psk: &[u8; 32],
    ) -> GuardOutcome;

    /// Cache a peer's descriptor (from journal sync)
    pub fn cache_descriptor(&mut self, descriptor: RendezvousDescriptor);

    /// Get a cached descriptor
    pub fn get_cached_descriptor(
        &self,
        context_id: ContextId,
        peer: AuthorityId,
    ) -> Option<&RendezvousDescriptor>;

    /// Check if our descriptor needs refresh
    pub fn needs_refresh(
        &self,
        context_id: ContextId,
        now_ms: u64,
        refresh_window_ms: u64,
    ) -> bool;
}
}

11. Effect Commands

The service returns a GuardOutcome with effect commands to execute:

#![allow(unused)]
fn main() {
pub enum EffectCommand {
    /// Append fact to journal
    JournalAppend { fact: RendezvousFact },
    /// Charge flow budget
    ChargeFlowBudget { cost: FlowCost },
    /// Send handshake init message
    SendHandshake { peer: AuthorityId, message: HandshakeInit },
    /// Send handshake response
    SendHandshakeResponse { peer: AuthorityId, message: HandshakeComplete },
    /// Record operation receipt
    RecordReceipt { operation: String, peer: AuthorityId },
}
}

12. Failure Modes and Privacy

Failures occur during guard evaluation, descriptor validation, or transport establishment. These failures are local. No network packets reveal capability or budget failures.

Context isolation prevents unauthorized authorities from reading descriptors. Transport hints do not reveal authority structure. Relay identifiers reveal only the relay authority. Descriptor contents remain encrypted in transit.

13. Summary

Rendezvous provides encrypted peer discovery and channel establishment scoped to relational contexts. Descriptors propagate through journal synchronization with guard chain enforcement. Secure channels use Noise IKpsk2 and QUIC. All behavior remains private to the context and reveals no structural information. The architecture uses standard Aura primitives: domain facts, guard chains, MPST choreographies, and effect interpretation.

Relational Contexts

This document describes the architecture of relational contexts in Aura. It explains how cross-authority relationships are represented using dedicated context namespaces. It defines the structure of relational facts and the role of Consensus in producing agreed relational state. It also describes privacy boundaries and the interpretation of relational data by participating authorities.

Relational contexts are distinct from authority types. In particular, Neighborhood is modeled as an authority type in the Neighborhood Plane as described in Social Architecture. Bilateral trust in the Web of Trust Plane is modeled as relational-context state, not as a new authority type.

1. RelationalContext Abstraction

A relational context is shared state linking two or more authorities. It has its own Journal namespace, does not expose internal authority structure, and contains only the facts that the participating authorities choose to share.

A relational context is identified by a ContextId. Authorities publish relational facts inside the context journal. The context journal is a join semilattice under set union. Reduction produces a deterministic relational state.

#![allow(unused)]
fn main() {
pub struct ContextId(Uuid);
}

This identifier selects the journal namespace for a relational context. It does not encode participant information. It does not reveal the type of relationship. Only the participants know how the context is used.
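The join-semilattice property mentioned above can be illustrated with plain set union. This is a minimal sketch (a BTreeSet of strings stands in for the real fact types, which it is not): merging in any order, or merging twice, always yields the same set, which is why replicas converge without coordination.

```rust
use std::collections::BTreeSet;

// Set union is commutative, associative, and idempotent -- exactly the
// join-semilattice property that lets replicas merge journals in any
// order and still converge on the same fact set.
fn merge(a: &BTreeSet<String>, b: &BTreeSet<String>) -> BTreeSet<String> {
    a.union(b).cloned().collect()
}
```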

2. Participants and Fact Store

A relational context has a defined set of participating authorities. This set is not encoded in the ContextId. Participation is expressed by writing relational facts to the context journal. Each fact references the commitments of the participating authorities.

#![allow(unused)]
fn main() {
/// Relational facts in `aura-journal/src/fact.rs`
pub enum RelationalFact {
    /// Protocol-level facts with core reduction semantics
    Protocol(ProtocolRelationalFact),
    /// Domain-level extensibility facts
    Generic { context_id: ContextId, envelope: FactEnvelope },
}

/// Typed fact envelope (from aura-core/src/types/facts.rs)
pub struct FactEnvelope {
    pub type_id: FactTypeId,
    pub schema_version: u16,
    pub encoding: FactEncoding,
    pub payload: Vec<u8>,
}
}

The Protocol variant wraps core protocol facts that have specialized reduction logic in reduce_context(). These include GuardianBinding, RecoveryGrant, Consensus, AMP channel facts, DKG transcript commits, and lifecycle markers. The Generic variant provides extensibility for domain-specific facts using FactEnvelope, which carries a typed payload with schema versioning.

AMP channel facts include checkpoints, transition proposals, A2 certificates, A3 finalizations, abort evidence, conflict evidence, supersession evidence, and emergency alarms. These facts remain scoped to the relational context. They do not change authority-root membership, recovery rights, guardian bindings, or account commitment trees.

Domain crates implement the DomainFact trait from aura-journal/src/extensibility.rs. They store facts via DomainFact::to_generic() which produces a FactEnvelope. The runtime registers reducers in crates/aura-agent/src/fact_registry.rs so reduce_context() can process Generic facts into RelationalBinding values.

3. Prestate Model

Relational context operations use a prestate model. Aura Consensus verifies that all witnesses see the same authority states. The prestate hash binds the relational fact to current authority states.

#![allow(unused)]
fn main() {
// Schematic: H is the protocol hash over both authority commitments
// and the current context commitment.
let prestate_hash = H(C_auth1, C_auth2, C_context);
}

This hash represents the commitments of the authorities and the current context. Aura Consensus witnesses check this hash before producing shares. The final commit fact includes a threshold signature over the relational operation.

4. Types of Relational Contexts

Several categories of relational contexts appear in Aura. Each fact type carries a well-defined schema to ensure deterministic reduction.

Neighborhood governance and home-membership state are recorded in neighborhood authority journals, not in relational context journals. Relational contexts are still used for pairwise and small-group cross-authority relationships such as guardian bindings, recovery grants, and application-specific collaboration contexts.

WebOfTrustContext

Direct friendship is modeled as a bilateral relational context between authorities. Contact remains unilateral reachability or identification state. Friendship requires explicit bilateral acceptance. Friendship is not represented as a new authority object.

The context stores lifecycle facts such as proposal, acceptance, and revocation. It may also store bounded trust-introduction artifacts. Those artifacts carry expiry, remaining depth, and fan-out limits. Runtime policy may use the resulting evidence as permit input, but the shared facts do not hard-code runtime policy tiers.

Bounded bootstrap introductions follow the same model. A direct friend may publish an introduction artifact that helps stale-node re-entry or first-contact bootstrap, but the artifact remains scoped to the relational context, carries explicit expiry and replay bounds, and does not materialize a shared friend-of-friend graph. Runtime code may amplify re-entry attempts from that evidence only within the stated depth and fan-out limits.
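The three bounds named above (expiry, remaining depth, fan-out) gate any runtime amplification. A minimal sketch of that check, with hypothetical field names chosen purely for illustration:

```rust
// Hypothetical bounds carried by an introduction artifact. Amplification
// is allowed only while all three limits hold: not expired, introduction
// depth remaining, and fan-out budget not exhausted.
struct IntroductionBounds {
    expires_at_ms: u64,
    remaining_depth: u8,
    fan_out_used: u8,
    fan_out_limit: u8,
}

fn may_amplify(b: &IntroductionBounds, now_ms: u64) -> bool {
    now_ms < b.expires_at_ms
        && b.remaining_depth > 0
        && b.fan_out_used < b.fan_out_limit
}
```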

GuardianConfigContext

Stores GuardianBinding facts:

{
  "type": "GuardianBinding",
  "account_commitment": "Hash32",
  "guardian_commitment": "Hash32",
  "parameters": {
    "recovery_delay_secs": 86400,
    "notification_required": true
  },
  "consensus_commitment": "Hash32",
  "consensus_proof": "ThresholdSignature"
}
  • account_commitment: reduced commitment of the protected authority.
  • guardian_commitment: reduced commitment of the guardian authority.
  • parameters: serialized GuardianParameters (delay, notification policy, etc.).
  • consensus_commitment: commitment hash of the Aura Consensus instance that approved the binding.
  • consensus_proof: aggregated signature from the witness set.

GuardianRecoveryContext

Stores RecoveryGrant facts:

{
  "type": "RecoveryGrant",
  "account_commitment_old": "Hash32",
  "account_commitment_new": "Hash32",
  "guardian_commitment": "Hash32",
  "operation": {
    "kind": "ReplaceTree",
    "payload": "Base64(TreeOp)"
  },
  "consensus_commitment": "Hash32",
  "consensus_proof": "ThresholdSignature"
}
  • account_commitment_old/new: before/after commitments for the account authority.
  • guardian_commitment: guardian authority that approved the grant.
  • operation: serialized recovery operation (matches TreeOp schema).
  • consensus_*: Aura Consensus identifiers tying the grant to witness approvals.

Generic / Application Contexts

Shared group or project contexts can store application-defined facts:

{
  "type": "Generic",
  "payload": "Base64(opaque application data)",
  "bindings": ["Hash32 commitment of participant A", "Hash32 commitment of participant B"],
  "labels": ["project:alpha", "role:reviewer"]
}

Generic facts should include enough metadata (bindings, optional labels) for interpreters to apply context-specific rules.

5. Relational Facts

Relational facts express specific cross-authority operations. A GuardianBinding fact defines the guardian authority for an account, while a RecoveryGrant fact defines an allowed update to the account state. A Generic fact covers application-defined interactions. Consensus-backed facts include the consensus_commitment and aggregated signature so reducers can verify provenance even after witnesses rotate.

SessionDelegation protocol facts record runtime endpoint transfer events. Aura emits these facts when session ownership moves across authorities (for example, guardian handoff or device migration). Each delegation fact includes source authority, destination authority, session id, optional bundle id, and timestamp so reconfiguration decisions remain auditable.
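The fields listed above can be sketched as a struct; the type aliases below are placeholders for illustration, not the real types from aura-core:

```rust
// Placeholder aliases; the actual types are defined in aura-core.
type AuthorityId = [u8; 32];
type SessionId = u64;
type BundleId = u64;
type TimeStamp = u64;

// Hypothetical SessionDelegation shape, assembled from the fields named
// above: source, destination, session id, optional bundle id, timestamp.
struct SessionDelegation {
    source: AuthorityId,
    destination: AuthorityId,
    session_id: SessionId,
    bundle_id: Option<BundleId>,
    timestamp: TimeStamp,
}
```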

Reduction applies all relational facts to produce relational state. Reduction verifies that authority commitments in each fact match the current reduced state of each authority.

#![allow(unused)]
fn main() {
/// Reduced relational state from aura-journal/src/reduction.rs
pub struct RelationalState {
    /// Active relational bindings
    pub bindings: Vec<RelationalBinding>,
    /// Flow budget state by context
    pub flow_budgets: BTreeMap<(AuthorityId, AuthorityId, u64), u64>,
    /// Leakage budget totals for privacy accounting
    pub leakage_budget: LeakageBudget,
    /// AMP channel epoch state keyed by channel id
    pub channel_epochs: BTreeMap<ChannelId, ChannelEpochState>,
}

pub struct RelationalBinding {
    pub binding_type: RelationalBindingType,
    pub context_id: ContextId,
    pub data: Vec<u8>,
}
}

This structure represents the reduced relational state. It contains relational bindings, flow budget tracking between authorities, leakage budget totals for privacy accounting, and AMP channel epoch state for message ratcheting. Reduction processes all facts in the context journal to derive this state deterministically.

6. Aura Consensus in Relational Contexts

Some relational operations require strong agreement. Aura Consensus finalizes these operations using a witness set drawn from participating authorities. Witnesses compute shares after verifying the prestate hash.

Commit facts contain threshold signatures. Each commit fact is inserted into the relational context journal. Reduction interprets the commit fact as a confirmed relational operation.

Aura Consensus binds relational operations to authority state. After consensus completes, the initiator inserts a CommitFact into the relational context journal that includes:

#![allow(unused)]
fn main() {
pub struct RelationalCommitFact {
    pub context_id: ContextId,
    pub consensus_commitment: Hash32, // H(Op, prestate)
    pub fact: RelationalFact,         // GuardianBinding, RecoveryGrant, etc.
    pub aggregated_signature: ThresholdSignature,
    pub attesters: BTreeSet<AuthorityId>,
}
}

Reducers validate aggregated_signature before accepting the embedded RelationalFact. This mirrors the account-level process but scoped to the context namespace.
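The gating order can be sketched as follows. The types here are simplified placeholders (a boolean stands in for real threshold-signature verification); the point is that invalid commits are skipped rather than surfaced as errors, so a bad fact cannot poison deterministic reduction.

```rust
// Simplified stand-ins for the real commit-fact types.
struct CommitFact {
    signature_valid: bool, // result of verifying aggregated_signature
    fact: String,          // the embedded RelationalFact, simplified
}

// Reduction gates the embedded fact on signature verification: only a
// verified commit contributes its RelationalFact to the reduced state.
fn reduce_commit(commit: &CommitFact) -> Option<&str> {
    if !commit.signature_valid {
        return None; // invalid commits are silently ignored
    }
    Some(commit.fact.as_str())
}
```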

7. Interpretation of Relational State

Authorities interpret relational state by reading the relational context journal. An account reads GuardianBinding facts to determine its guardian authority. A guardian authority reads the same facts to determine which accounts it protects.

Other relational contexts follow similar patterns. Each context defines its own interpretation rules. No context affects authority internal structure. Context rules remain confined to the specific relationship.

For web-of-trust contexts, direct-friend state is authoritative shared state. Friend-of-friend state is not. Runtime components derive introduced or transitive trust locally from direct-friend edges plus bounded introduction artifacts. This avoids turning the transitive social graph into canonical shared state.

8. Privacy and Isolation

A relational context does not reveal its participants. The ContextId is opaque. Only participating authorities know the relationship. No external party can infer connections between authorities based on context identifiers.

Profile information shared inside a context stays local to that context. Nickname suggestions and contact attributes do not leave the context journal. Each context forms a separate identity boundary. Authorities can maintain many unrelated relationships without cross-linking.

Trust evidence must follow the same boundary rule. Shared facts may record bilateral friend lifecycle state and bounded introductions. They must not expose wire-visible service classes such as direct_friend_relay or introduced_hold. Any coarse policy tier is derived locally in the runtime from evidence provenance.

Bootstrap use does not change that rule. A bounded bootstrap introduction is evidence, not a route commitment. It can seed local runtime discovery or stale-node re-entry, but it must not become a canonical shared topology edge or a shared provider-class label.

9. Implementation Patterns

The implementation in aura-relational provides concrete patterns for working with relational contexts.

Creating and Managing Contexts

#![allow(unused)]
fn main() {
use aura_core::{AuthorityId, ContextId};
use aura_relational::RelationalContext;

// Create a new guardian-account relational context
let account_authority = AuthorityId::from_entropy([1u8; 32]);
let guardian_authority = AuthorityId::from_entropy([2u8; 32]);

let context = RelationalContext::new(vec![account_authority, guardian_authority]);

// Or use a specific context ID
let context_id = ContextId::from_entropy([3u8; 32]);
let context = RelationalContext::with_id(
    context_id,
    vec![account_authority, guardian_authority],
);

// Check participation
assert!(context.is_participant(&account_authority));
assert!(context.is_participant(&guardian_authority));
}

Guardian Binding Pattern

#![allow(unused)]
fn main() {
use aura_core::relational::{GuardianBinding, GuardianParameters};
use std::time::Duration;

let params = GuardianParameters {
    recovery_delay: Duration::from_secs(86400),
    notification_required: true,
    // Use Aura's unified time system (PhysicalTimeEffects/TimeStamp) for expiration.
    expiration: None,
};

let binding = GuardianBinding::new(
    Hash32::from_bytes(&account_authority.to_bytes()),
    Hash32::from_bytes(&guardian_authority.to_bytes()),
    params,
);

// Store a binding receipt + full payload via the context's journal-backed API
context.add_guardian_binding(account_authority, guardian_authority, binding)?;
}

Recovery Grant Pattern

#![allow(unused)]
fn main() {
use aura_core::relational::{RecoveryGrant, RecoveryOp, ConsensusProof};

// Construct a recovery operation
let recovery_op = RecoveryOp::AddDevice {
    device_public_key: new_device_pubkey.to_bytes(),
};

// Create recovery grant (requires consensus proof)
let grant = RecoveryGrant {
    account_old: old_tree_commitment,
    account_new: new_tree_commitment,
    guardian: guardian_commitment,
    operation: recovery_op,
    consensus_proof: consensus_result.proof, // From Aura Consensus
};

// Add to recovery context
recovery_context.add_fact(RelationalFact::RecoveryGrant(grant))?;

// Check operation type
if grant.operation.is_emergency() {
    // Handle emergency operations immediately
    execute_emergency_recovery(&grant)?;
}
}

Query Patterns

#![allow(unused)]
fn main() {
use chrono::Utc; // needed for the expiration comparison below

// Query guardian bindings
let bindings = context.guardian_bindings();
for binding in bindings {
    println!("Guardian: {:?}", binding.guardian_commitment);
    println!("Recovery delay: {:?}", binding.parameters.recovery_delay);

    if let Some(expiration) = binding.parameters.expiration {
        if expiration < Utc::now() {
            // Binding has expired
        }
    }
}

// Find specific guardian binding
if let Some(binding) = context.get_guardian_binding(account_authority) {
    // Found guardian for this account
    let guardian_id = binding.guardian_commitment;
}

// Query recovery grants
let grants = context.recovery_grants();
for grant in grants {
    println!("Operation: {}", grant.operation.description());
    println!("From: {:?} -> To: {:?}", grant.account_old, grant.account_new);
}
}

Generic Binding Pattern

#![allow(unused)]
fn main() {
use aura_core::relational::{GenericBinding, RelationalFact};
use aura_core::types::facts::{FactEnvelope, FactTypeId, FactEncoding};

// Application-specific binding (e.g., project collaboration)
let payload = serde_json::to_vec(&ProjectMetadata {
    name: "Alpha Project",
    role: "Reviewer",
    permissions: vec!["read", "comment"],
})?;

let envelope = FactEnvelope {
    type_id: FactTypeId::new("project_collaboration"),
    schema_version: 1,
    encoding: FactEncoding::Json,
    payload,
};

let generic = GenericBinding::new(envelope, None); // No consensus proof

context.add_fact(RelationalFact::Generic(generic))?;
}

Prestate Computation Pattern

#![allow(unused)]
fn main() {
use aura_core::Prestate;

// Collect current authority commitments
let authority_commitments = vec![
    (account_authority, account_tree_commitment),
    (guardian_authority, guardian_tree_commitment),
];

// Compute prestate for consensus
let prestate = context.compute_prestate(authority_commitments);

// Use prestate in consensus protocol
let consensus_result = run_consensus(
    prestate,
    operation_data,
    witness_set,
).await?;
}

Journal Commitment Pattern

#![allow(unused)]
fn main() {
// Get deterministic commitment of current relational state
let commitment = context.journal.compute_commitment();

// Commitment is deterministic - all replicas compute same value
// Used for:
// - Prestate computation in consensus
// - Context verification
// - Anti-entropy sync checkpoints
}

Integration with Aura Consensus

#![allow(unused)]
fn main() {
use aura_consensus::relational::run_consensus;
use aura_core::relational::ConsensusProof;
use aura_effects::random::RealRandomHandler;
use aura_effects::time::PhysicalTimeHandler;

// 1. Prepare operation requiring consensus
let binding = GuardianBindingBuilder::new()
    .account(account_commitment)
    .guardian(guardian_commitment)
    .build()?;

// 2. Compute prestate
let prestate = context.compute_prestate(authority_commitments);

// 3. Run consensus
let random = RealRandomHandler;
let time = PhysicalTimeHandler;
let consensus_proof = run_consensus(
    context.context_id,
    &prestate,
    &binding,
    key_packages,
    group_public_key,
    epoch,
    &random,
    &time,
).await?;

// 4. Attach proof to fact
let binding_with_proof = GuardianBinding {
    consensus_proof: Some(consensus_proof),
    ..binding
};

// 5. Add to context
context.add_fact(RelationalFact::GuardianBinding(binding_with_proof))?;
}

Recovery Operation Selection

#![allow(unused)]
fn main() {
use aura_core::relational::RecoveryOp;

// Select appropriate recovery operation
let recovery_op = match recovery_scenario {
    RecoveryScenario::LostAllDevices => RecoveryOp::ReplaceTree {
        new_tree_root: new_tree_commitment,
    },
    RecoveryScenario::AddNewDevice => RecoveryOp::AddDevice {
        device_public_key: device_key.to_bytes(),
    },
    RecoveryScenario::RemoveCompromised => RecoveryOp::RemoveDevice {
        leaf_index: compromised_device_index,
    },
    RecoveryScenario::ChangeThreshold => RecoveryOp::UpdatePolicy {
        new_threshold: new_m_of_n.0,
    },
    RecoveryScenario::EmergencyCompromise => RecoveryOp::EmergencyRotation {
        new_epoch: current_epoch + 1,
    },
};

// Emergency operations bypass delay
if recovery_op.is_emergency() {
    // No recovery_delay applied
} else {
    // Wait for binding.parameters.recovery_delay
}
}

Best Practices

Guardian configuration:

  • Use a 24-hour minimum recovery delay for security
  • Always require notification except in emergency scenarios
  • Set expiration for temporary guardian relationships
  • Rotate guardian bindings periodically

Recovery grants:

  • Always require consensus proof for recovery operations
  • Validate prestate commitments before accepting grants
  • Log all recovery operations for audit trail
  • Emergency operations should be rare and logged prominently

Generic bindings:

  • Document your FactTypeId schemas externally
  • Use schema_version field to version your payload format
  • Include consensus_proof for critical application bindings
  • Keep payload size reasonable for sync performance

Context management:

  • Use opaque ContextId - never encode participant info
  • Limit participants to 2-10 authorities for efficiency
  • Separate contexts for different relationship types
  • Garbage collect expired bindings periodically

Lifecycle notes:

  • A1/A2 facts are usable immediately but are provisional. Any durable relational state must be A3 (consensus-finalized).
  • Soft-safe operations should emit ConvergenceCert and ReversionFact protocol facts to make convergence and reversion risk explicit.

10. Contexts vs Channels: Operation Categories

Understanding the distinction between relational contexts and channels is essential for operation categorization.

10.1 Relational Contexts (Category C - Consensus Required)

Creating a relational context establishes a cryptographic relationship between authorities. This is a Category C (consensus-gated) operation because:

  • It creates the shared secret foundation for all future communication
  • Both parties must agree to establish the relationship
  • Partial state (one party thinks relationship exists, other doesn't) is dangerous

Examples:

  • Adding a contact (bilateral context between two authorities)
  • Creating a group (multi-party context with all members)
  • Adding a member to an existing group (extends the cryptographic context)

10.2 Channels Within Contexts (Category A - Optimistic)

Once a relational context exists, channels are Category A (optimistic) operations:

  • Channels are just organizational substreams within the context
  • No new cryptographic agreement needed - keys derive from context
  • Channel facts sync via anti-entropy, eventual consistency is sufficient

Examples within existing context:

  • Create channel → emit ChannelCheckpoint fact
  • Send message → derive key from context, encrypt, send
  • Update topic → emit fact to context journal

10.3 The Cost Structure

                    BILATERAL                      MULTI-PARTY
                    (2 members)                    (3+ members)
                    ───────────                    ────────────
Context Creation    Invitation ceremony            Group ceremony
                    (Category C - expensive)       (Category C - expensive)

Member Addition     N/A (already 2)                Per-member ceremony
                                                   (Category C - expensive)

Channel Creation    Optimistic                     Optimistic
                    (Category A - cheap)           (Category A - cheap)

Messages            Optimistic                     Optimistic
                    (Category A - cheap)           (Category A - cheap)

The expensive part is establishing WHO is in the group. Once that's established, operations WITHIN the group are cheap.

10.4 Multi-Party Context Keys

Groups with >2 members derive keys from all member tree roots:

GroupContext {
    context_id: ContextId,
    members: [Alice, Bob, Carol],
    group_secret: DerivedFromMemberTreeRoots,
    epoch: u64,
}

Key Derivation:
1. Each member contributes their tree root commitment
2. Group secret = KDF(sorted_member_roots, context_id)
3. Channel key = KDF(group_secret, channel_id, epoch)

All members derive the SAME group secret from the SAME inputs.
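A minimal sketch of this determinism, using std's DefaultHasher purely as a stand-in for a real cryptographic KDF (the actual construction would be something like HKDF over the sorted roots; DefaultHasher is NOT cryptographically secure):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// NOT cryptographic -- DefaultHasher only illustrates the mechanics:
// feeding the same byte sequences in the same order always yields the
// same output.
fn kdf(parts: &[Vec<u8>]) -> u64 {
    let mut h = DefaultHasher::new();
    for p in parts {
        p.hash(&mut h);
    }
    h.finish()
}

// Sorting the member roots canonicalizes them, so every member derives
// the same group secret regardless of how they enumerate their peers.
fn group_secret(mut member_roots: Vec<Vec<u8>>, context_id: Vec<u8>) -> u64 {
    member_roots.sort();
    member_roots.push(context_id);
    kdf(&member_roots)
}

// Channel keys are derived per (channel, epoch), so an epoch rotation
// yields fresh keys without re-running the group ceremony.
fn channel_key(group_secret: u64, channel_id: u64, epoch: u64) -> u64 {
    kdf(&[
        group_secret.to_be_bytes().to_vec(),
        channel_id.to_be_bytes().to_vec(),
        epoch.to_be_bytes().to_vec(),
    ])
}
```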

10.5 Why Membership Changes Require Ceremony

Group membership changes are Category C because they affect encryption:

  1. Forward Secrecy: New members shouldn't read old messages

    • Solution: Epoch rotation, new keys for new messages
  2. Post-Compromise Security: Removed members shouldn't read new messages

    • Solution: Epoch rotation, re-derive group secret without removed member
  3. Consistency: All members must agree on who's in the group

    • Solution: Ceremony ensures atomic membership view

See Operation Categories for the full decision tree.

11. Summary

Relational contexts represent cross-authority relationships in Aura. They provide shared state without revealing authority structure and support guardian configuration, recovery, and application specific collaboration. Aura Consensus ensures strong agreement where needed while deterministic reduction ensures consistent relational state. Privacy boundaries isolate each relationship from all others.

The implementation provides concrete types (RelationalContext, GuardianBinding, RecoveryGrant) with builder patterns, query methods, and consensus integration. All relational facts are stored in a CRDT journal with deterministic commitment computation.

12. Implementation References

  • Core Types: aura-core/src/relational/ - RelationalFact, GuardianBinding, RecoveryGrant, ConsensusProof domain types
  • Journal Facts: aura-journal/src/fact.rs - Protocol-level RelationalFact with AMP variants
  • Reduction: aura-journal/src/reduction.rs - RelationalState, reduce_context()
  • Context Management: aura-relational/src/lib.rs - RelationalContext (context-scoped fact journal mirror + helpers)
  • Consensus Integration: crates/aura-consensus/src/consensus/relational.rs - consensus implementation
  • Consensus Integration: aura-consensus/src/relational.rs - direct relational consensus with explicit time/random effect ownership
  • Prestate Computation: aura-core/src/domain/consensus.rs - Prestate struct and methods
  • Protocol Usage: aura-authentication/src/guardian_auth_relational.rs - Guardian authentication
  • Recovery Flows: aura-recovery/src/ - Guardian recovery choreographies

See Also

  • Operation Categories - Ceremony contract, guardian rotation, device enrollment, and Category B/C operations
  • Consensus - Aura Consensus protocol for strong agreement

Social Architecture

This document defines Aura's social organization model using two social planes plus the message contexts that run on top of them. The system layers privacy, consent, and governance into a Neighborhood Plane, a Web of Trust Plane, and the message contexts that consume their outputs.

1. Overview

1.1 Design Goals

The model produces human-scaled social structures with natural scarcity based on physical analogs. Organic community dynamics emerge from bottom-up governance. The design aligns with Aura's consent-based privacy guarantees and capability-based authorization.

1.2 Social Planes

The Neighborhood Plane models homes, neighborhoods, membership, moderation, and locality-scoped infrastructure. It answers who shares a local social space and which neighborhood-scoped providers are admissible.

The Web of Trust Plane models bilateral friend relationships and bounded introductions. It answers which providers have direct or introduced trust evidence. It does not own final route selection. It does not materialize the transitive trust graph as shared state.

Messages are communication contexts built on top of those planes. Direct messages are private relational contexts. Home messages are semi-public messaging for home members and participants.

1.3 Terminology

An authority (AuthorityId) is the cryptographic identity that holds capabilities and participates in consensus. A nickname is a local mapping from an authority to a human-understandable name. Each device maintains its own nickname mappings. There is no global username registry.

A nickname suggestion (nickname_suggestion) is metadata an authority optionally shares when connecting with someone. Users configure a default suggestion sent to all new connections. Users can share different suggestions with different people or opt out entirely.

Contact is unilateral reachability or identification state. It means the local user knows how to recognize or reach an authority. It does not imply bilateral trust.

Friend is bilateral accepted trust in the Web of Trust Plane. Friend lifecycle facts live in relational contexts. Friends of friends are local derivations or bounded introduction evidence, not canonical shared graph state.

1.4 Unified Naming Pattern

The codebase uses a consistent naming pattern across entities (contacts, devices, discovered peers). The EffectiveName trait in aura-app/src/views/naming.rs defines the resolution order:

  1. Local nickname (user-assigned override) if non-empty
  2. Shared nickname_suggestion (what entity wants to be called) if non-empty
  3. Fallback identifier (truncated authority/device ID)

This pattern ensures consistent display names across all UI surfaces while respecting both local preferences and shared suggestions.
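The three-step resolution order above can be sketched as a small function. This is a minimal illustration, not the actual EffectiveName trait: the free function and its plain string parameters are assumptions, while the real trait resolves over typed entities.

```rust
/// Illustrative sketch of the EffectiveName resolution order.
/// The real trait lives in aura-app/src/views/naming.rs.
fn effective_name(local_nickname: &str, suggestion: &str, fallback: &str) -> String {
    if !local_nickname.is_empty() {
        // 1. Local nickname: user-assigned override wins.
        local_nickname.to_string()
    } else if !suggestion.is_empty() {
        // 2. Shared nickname_suggestion: what the entity wants to be called.
        suggestion.to_string()
    } else {
        // 3. Fallback identifier: truncated authority/device ID.
        fallback.to_string()
    }
}
```

A local override always takes precedence, so a user's private name for a contact never leaks and never changes when the contact updates their suggestion.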

2. Message Types

2.1 Direct Messages

Direct messages are small private relational contexts built on AMP. There is no hop-based expansion across homes. All participants must be explicitly added. New members do not receive historical message sync.

2.2 Home Messages

Home messages are semi-public messaging for home members and participants. They use the same AMP infrastructure as direct messages. When a new participant joins, current members send a window of recent messages.

Membership and participation are tied to home policy. Leaving the home revokes access. Multiple channels may exist per home for different purposes.

#![allow(unused)]
fn main() {
pub struct HomeMessage {
    home_id: HomeId,
    channel: String,
    content: Vec<u8>,
    author: AuthorityId,
    timestamp: TimeStamp,
}
}

The message structure identifies the home, channel, content, author, and timestamp. Historical sync is configurable, typically the last 500 messages.

3. Home Architecture

3.1 Home Structure

A home is an authority-scoped context with its own journal. The total storage allocation is 10 MB. Capability templates define limited, partial, full, participant, and moderator patterns. Local governance is encoded via policy facts.

#![allow(unused)]
fn main() {
pub struct Home {
    /// Unique identifier for this home
    pub home_id: HomeId,
    /// Total storage limit in bytes
    pub storage_limit: u64,
    /// Maximum number of participants
    pub max_participants: u8,
    /// Maximum number of neighborhoods this home can join
    pub neighborhood_limit: u8,
    /// Current participants (authority IDs)
    pub participants: Vec<AuthorityId>,
    /// Current moderators with their capabilities
    pub moderators: Vec<(AuthorityId, ModeratorCapabilities)>,
    /// Current storage budget tracking
    pub storage_budget: HomeStorageBudget,
}
}

The home structure contains the identifier, storage limit, configuration limits, participant list, moderator designation list with capabilities, and storage budget tracking.

3.2 Membership and Participation

Home participation derives from possessing capability bundles, meeting entry requirements defined by policy, and allocating participant-specific storage. In v1, each user belongs to exactly one home. Home creation is user-initiated: accounts do not receive a home automatically at account creation. Instead, a user creates their home from the Neighborhood screen when they are ready to start connecting with others and participating in social features.

Joining a home follows a defined sequence. The authority requests capability. Home governance approves using local policy via Biscuit evaluation and consensus. The authority accepts the capability bundle and allocates storage. Historical home messages sync from current members.

3.3 Moderator Designation

Moderators are designated via governance decisions in the home. A moderator must also be a member. Moderator capability bundles include moderation, pin and unpin operations, and governance facilitation. Moderator designation is auditable because capability issuance is visible via relational facts.

4. Neighborhood Architecture

4.1 Neighborhood Structure

A neighborhood is an authority type linking multiple homes. It contains a combined pinned infrastructure pool equal to the number of homes times 1 MB. A 1-hop link graph connects homes. Access-level and inter-home policy logic define movement rules.

#![allow(unused)]
fn main() {
pub struct Neighborhood {
    /// Unique identifier for this neighborhood
    pub neighborhood_id: NeighborhoodId,
    /// Member homes
    pub member_homes: Vec<HomeId>,
    /// 1-hop links between homes
    pub one_hop_links: Vec<(HomeId, HomeId)>,
}
}

The neighborhood structure contains the identifier, member homes, and 1-hop link edges.

4.2 Home Membership

Homes allocate 1 MB of their budget per neighborhood joined. In v1, each home may join a maximum of 4 neighborhoods. This limits 1-hop graph complexity and effect delegation routing.

5. Position and Traversal

5.1 Neighborhood Discovery Layers

Neighborhood-scoped discovery is represented through the DiscoveryLayer enum. It indicates the best neighborhood strategy to reach a target based on locality relationships:

#![allow(unused)]
fn main() {
pub enum DiscoveryLayer {
    /// No relationship with target - must use rendezvous/flooding discovery
    Rendezvous,
    /// We have neighborhood presence and can use traversal
    Neighborhood,
    /// Target is reachable via home-level relay
    Home,
    /// Target is personally known - we have a direct relationship
    Direct,
}
}

The discovery layer determines locality-aware discovery strategy and flow costs. aura-social may classify neighborhood candidates using these layers, but it does not own the final route choice. Runtime policy in aura-agent fuses neighborhood candidates with web-of-trust evidence and descriptor views.

Neighborhood discovery boards are bounded hint surfaces inside this plane. A board publication may advertise a signed, expiring, replay-bounded re-entry hint for stale-node bootstrap, but it is not a topology map. The publication does not enumerate the neighborhood graph and does not commit a runtime route. It only exposes enough scoped hint material for local runtime discovery to try a candidate.

5.2 Movement Rules

Movement is possible when a Biscuit capability authorizes entry, neighborhood policy allows movement along a 1-hop link, and home policy or invitations allow deeper access levels. Movement does not replicate pinned data. Visitors operate on ephemeral local state.

Traversal does not reveal global identity. Only contextual identities within encountered homes are visible.

6. Storage Constraints

6.1 Block-Level Allocation

Homes have a fixed size of 10 MB total. Allocation depends on neighborhood participation.

Neighborhoods   Allocation   Participant Storage   Shared Storage
1               1.0 MB       1.6 MB                7.4 MB
2               2.0 MB       1.6 MB                6.4 MB
3               3.0 MB       1.6 MB                5.4 MB
4               4.0 MB       1.6 MB                4.4 MB

More neighborhood connections mean less local storage for home culture. This creates meaningful trade-offs.
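The allocation arithmetic can be sketched directly. The helper name and the choice of KB units are illustrative; the constants come from the table and the v1 limits (10 MB total, 1.6 MB participant storage, 1 MB per neighborhood joined).

```rust
/// Illustrative budget arithmetic for the allocation table (values in KB).
fn shared_storage_kb(neighborhoods: u64) -> u64 {
    const TOTAL: u64 = 10_000;          // 10 MB home budget
    const PARTICIPANT: u64 = 1_600;     // 8 participants at 200 KB each
    const PER_NEIGHBORHOOD: u64 = 1_000; // 1 MB allocated per neighborhood joined
    assert!((1..=4).contains(&neighborhoods)); // v1: at most 4 neighborhoods
    TOTAL - PARTICIPANT - neighborhoods * PER_NEIGHBORHOOD
}
```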

6.2 Flow Budget Integration

Storage constraints are enforced via the flow budget system.

#![allow(unused)]
fn main() {
pub struct HomeFlowBudget {
    /// Home ID (typed identifier)
    pub home_id: HomeId,
    /// Current number of participants
    pub participant_count: u8,
    /// Storage used by participants (spent counter as fact)
    pub participant_storage_spent: u64,
    /// Number of neighborhoods joined
    pub neighborhood_count: u8,
    /// Total neighborhood allocations
    pub neighborhood_allocations: u64,
    /// Storage used by pinned content (spent counter as fact)
    pub pinned_storage_spent: u64,
}
}

The spent counters are persisted as journal facts. The count fields track current membership. Limits are derived at runtime from home policy and Biscuit capabilities. Participant storage limit is 1.6 MB for 8 participants at 200 KB each.
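A pin admission check against the spent counter might look like the following sketch. The field name follows HomeFlowBudget above, but the guard function and the way the limit is passed in are assumptions; in practice the limit is derived from home policy and Biscuit capabilities.

```rust
/// Trimmed-down budget view; only the field relevant to the check.
struct HomeFlowBudget {
    /// Storage used by pinned content (spent counter as fact).
    pinned_storage_spent: u64,
}

/// Hypothetical guard: a pin is admissible only if the spent counter
/// plus the new object stays within the policy-derived limit.
fn can_pin(budget: &HomeFlowBudget, size_bytes: u64, pinned_limit: u64) -> bool {
    budget.pinned_storage_spent + size_bytes <= pinned_limit
}
```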

7. Fact Schema

7.1 Home Facts

Home facts enable Datalog queries.

home(home_id, created_at, storage_limit).
home_config(home_id, max_participants, neighborhood_limit).
participant(authority_id, home_id, joined_at, storage_allocated).
moderator(authority_id, home_id, designated_by, designated_at, capabilities).
pinned(content_hash, home_id, pinned_by, pinned_at, size_bytes).

These facts express home existence, configuration, participation, moderator designation, and pin state.

7.2 Neighborhood Facts

Neighborhood facts express neighborhood existence, home membership, 1-hop links, and access permissions.

neighborhood(neighborhood_id, created_at).
home_member(home_id, neighborhood_id, joined_at, allocated_storage).
one_hop_link(home_a, home_b, neighborhood_id).
access_allowed(from_home, to_home, capability_requirement).

7.3 Query Examples

Queries use Biscuit Datalog.

participants_of(Home) <- participant(Auth, Home, _, _).

visitable(Target) <-
    participant(Me, Current, _, _),
    one_hop_link(Current, Target, _),
    access_allowed(Current, Target, Cap),
    has_capability(Me, Cap).

The first query finds all participants of a home. The second finds homes a user can visit from their current position via 1-hop links.

Neighborhood discovery-board records are intentionally separate from these canonical topology facts. They are advisory publications consumed by runtime-local bootstrap selection. They must not turn Neighborhood, SocialTopology, or other canonical neighborhood views into route-truth owners.

8. IRC-Style Commands

8.1 User Commands

User commands are available to all participants.

Command              Description                 Capability
/msg <user> <text>   Send private message        send_dm
/me <action>         Send action                 send_message
/nick <name>         Update contact suggestion   update_contact
/who                 List participants           view_members
/leave               Leave current context       leave_context

8.2 Moderator Commands

Moderator commands require moderator capabilities, and moderators must be members.

Command        Description        Capability
/kick <user>   Remove from home   moderate:kick
/ban <user>    Ban from home      moderate:ban
/mute <user>   Silence user       moderate:mute
/pin <msg>     Pin message        pin_content

8.3 Command Execution

Commands execute through the guard chain.

flowchart LR
    A[Parse Command] --> B[CapGuard];
    B --> C[FlowGuard];
    C --> D[JournalCoupler];
    D --> E[TransportEffects];

The command is parsed into a structured type. CapGuard checks capability requirements. FlowGuard charges the moderation action budget. JournalCoupler commits the action fact. TransportEffects notifies affected parties.
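The guard-chain order can be sketched as a sequential pipeline. The error variants, step labels, and flattened signature are illustrative, not the real CapGuard/FlowGuard/JournalCoupler APIs; the point is the ordering, with the budget charged before any fact is committed or any effect runs.

```rust
/// Hypothetical guard-chain sketch for one moderation command.
#[derive(Debug, PartialEq)]
enum GuardError {
    CapabilityDenied,
    BudgetExhausted,
}

fn execute_command(
    has_capability: bool,
    budget: &mut u64,
    cost: u64,
) -> Result<Vec<&'static str>, GuardError> {
    let mut steps = Vec::new();
    // CapGuard: verify the capability before anything else runs.
    if !has_capability {
        return Err(GuardError::CapabilityDenied);
    }
    steps.push("cap_checked");
    // FlowGuard: charge the moderation action budget up front.
    if *budget < cost {
        return Err(GuardError::BudgetExhausted);
    }
    *budget -= cost;
    steps.push("flow_charged");
    // JournalCoupler: commit the action fact atomically.
    steps.push("fact_committed");
    // TransportEffects: notify affected parties only after the commit.
    steps.push("notified");
    Ok(steps)
}
```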

9. Governance

9.1 Home Governance

Homes govern themselves through capability issuance, consensus-based decisions, member moderation designations, and moderation. Home governance uses Aura Consensus for irreversible or collective decisions.

9.2 Neighborhood Governance

Neighborhoods govern home admission, 1-hop graph maintenance, access rules, and shared civic norms. High-stakes actions use Aura Consensus.

10. Privacy Model

10.1 Contextual Identity

Identity in Aura is contextual and relational. Joining a home reveals a home-scoped identity. Leaving a home causes that contextual identity to disappear. Profile data shared inside a context stays local to that context.

Disclosure is consensual. The device issues a join request. Home governance approves using local policy. The authority accepts the capability bundle. This sequence ensures all participation is explicit.

11. V1 Constraints

For the initial release, the model is simplified with three constraints.

Each user is a member of exactly one home. This eliminates multi-membership complexity and allows core infrastructure to stabilize.

Each home has a maximum of 8 participants. This human-scale limit enables strong community bonds and manageable governance.

Each home may join a maximum of 4 neighborhoods. This limits 1-hop graph complexity and effect delegation routing overhead.

12. Infrastructure Roles

Homes and neighborhoods provide infrastructure services beyond social organization. The aura-social crate implements neighborhood facts, materialized views, and neighborhood-derived candidate production. Final route or retrieval choice belongs to the runtime.

12.1 Neighborhood Plane Responsibilities

The Neighborhood Plane provides broad locality-scoped candidate pools:

  • neighborhood-derived Establish and Move candidates
  • neighborhood-only Hold candidates for availability and deferred delivery
  • locality-scoped storage and relay budgeting
  • governance and moderation state for homes and neighborhoods

These outputs are permit and candidate inputs, not route commitments. For Hold, the neighborhood scope defines the whole admissible holder set; the runtime chooses a bounded rotating subset of holders inside that scope and keeps retention treatment uniform across deposits.

12.2 Web of Trust Plane Responsibilities

The Web of Trust Plane provides trust evidence:

  • bilateral friendship state as relational-context facts
  • bounded introduction evidence for introduced candidates
  • permit input for Establish and Move
  • bootstrap and accountability weight for trusted providers

Direct friendship is authoritative shared state. Friend-of-friend is local derivation or bounded introduction evidence. It is not canonical shared graph state.

Trust evidence may affect provider admission, weighting, and accountability preference. It must not create friend-shaped, FoF-shaped, guardian-shaped, or neighborhood-shaped route schemas. The provider should observe only the generic Establish, Move, or Hold service action.

12.3 Plane Fusion

The runtime fuses neighborhood candidates, web-of-trust evidence, descriptor views, and health data into provider selection:

Input                                            Owned by                   Purpose
Neighborhood facts and locality classification   aura-social                neighborhood-scoped permit and candidate production
Friend lifecycle and introduction evidence       aura-relational            trust evidence provenance
Descriptor snapshots                             aura-agent runtime cache   connectivity and service advertisements
Final provider selection                         aura-agent                 runtime-local permit view and route choice

This split prevents social-role labels from becoming wire-visible service classes.

Neighborhood-only and WoT-assisted candidate production must therefore emit the same Establish and Move descriptor, path, and envelope shapes. The only allowed differences are trust-evidence provenance and runtime-local weighting or selection state. Shared schemas must not grow neighborhood-, friend-, or FoF-specific variants.

12.4 Relay Selection

The SocialTopology provides neighborhood-derived relay candidate generation:

#![allow(unused)]
fn main() {
let topology = SocialTopology::new(local_authority, home, neighborhoods);
let candidates = topology.build_relay_candidates(&destination, |peer| is_reachable(peer));
}

Reachability checks filter unreachable peers. Candidate provenance may record same-home, neighborhood-hop, or guardian evidence. The runtime may use that provenance during local permit evaluation, but aura-social does not own the final route decision.

See Also

Database Architecture describes fact storage and queries. Transport and Information Flow covers AMP messaging. Authorization describes capability evaluation. Rendezvous Architecture describes rendezvous advertisement and selection boundaries.

Distributed Maintenance Architecture

This document describes distributed maintenance in Aura. It explains snapshots, garbage collection, cache invalidation, OTA upgrades, admin replacement, epoch handling, and backup procedures. All maintenance operations align with the authority and relational context model. All maintenance operations insert facts into appropriate journals. All replicas converge through join-semilattice rules.

1. Maintenance Facts

Maintenance uses facts stored in an authority journal. Facts represent monotone knowledge. Maintenance logic evaluates local predicates over accumulated facts. These predicates implement constraints such as GC eligibility or upgrade readiness. The authoritative schema lives in crates/aura-maintenance/src/facts.rs.

#![allow(unused)]
fn main() {
pub enum MaintenanceFact {
    SnapshotProposed(SnapshotProposed),
    SnapshotCompleted(SnapshotCompleted),
    CacheInvalidated(CacheInvalidated),
    AdminReplacement(AdminReplacement),
    ReleaseDistribution(ReleaseDistributionFact),
    ReleasePolicy(ReleasePolicyFact),
    UpgradeExecution(UpgradeExecutionFact),
}
}

This fact model defines snapshot, cache, release-distribution, policy-publication, and scoped upgrade-execution events. Each fact is immutable and merges by set union. Devices reduce maintenance facts with deterministic rules.

2. Snapshots and Garbage Collection

Snapshots bound storage size. A snapshot proposal announces a target epoch and a digest of the journal prefix. Devices verify the digest. If valid, they contribute signatures. A threshold signature completes the snapshot.

Snapshot completion inserts a SnapshotCompleted fact. Devices prune facts whose epochs fall below the snapshot epoch. Devices prune blobs whose retractions precede the snapshot. This pruning does not affect correctness because the snapshot represents a complete prefix.

DKG transcript blobs follow the same garbage collection fence: once a snapshot is finalized, transcripts with epochs older than the snapshot retention window may be deleted. This keeps long-lived key ceremonies from accumulating unbounded storage while preserving the ability to replay from the latest snapshot.
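The pruning fence described above reduces to a simple epoch comparison. This is a sketch under the assumption that Epoch is an ordered counter; the helper name is illustrative.

```rust
/// Illustrative GC fence: a fact (or DKG transcript blob) becomes
/// prunable once a completed snapshot covers its epoch.
fn prunable(fact_epoch: u64, snapshot_epoch: u64) -> bool {
    fact_epoch < snapshot_epoch
}
```

Pruning is safe precisely because the snapshot commits a complete journal prefix: every pruned fact is already reflected in the snapshot's reduced state.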

#![allow(unused)]
fn main() {
pub struct Snapshot {
    pub epoch: Epoch,
    pub commitment: TreeHash32,
    pub roster: Vec<LeafId>,
    pub policies: BTreeMap<NodeIndex, Policy>,
    pub state_cid: Option<TreeHash32>,
    pub timestamp: u64,
    pub version: u8,
}
}

This structure defines the snapshot type from aura_core::tree. Devices fetch the blob when restoring state. Devices hydrate journal state and replay the tail of post-snapshot facts.

3. Cache Invalidation

State mutations publish CacheInvalidated facts. A cache invalidation fact contains cache keys and an epoch floor. Devices maintain local actor-owned maps from keys to epoch floors. A cache entry is valid only when the current epoch exceeds its floor.

Cache invalidation is local. No CRDT cache is replicated. Devices compute validity using meet predicates on epoch constraints. Service caches such as rendezvous descriptor registries remain runtime-local mutable state in aura-agent. Facts invalidate them, but facts do not materialize them.

#![allow(unused)]
fn main() {
pub struct CacheKey(pub String);
}

This structure identifies a cached entry. Devices invalidate cached data when they observe newer invalidation facts.
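The local epoch-floor map might look like the following sketch. The struct and method names are assumptions; only the rule from the text is load-bearing: an entry is valid only when the current epoch exceeds its floor, and floors are monotone (a later invalidation can only raise them).

```rust
use std::collections::BTreeMap;

/// Sketch of a device-local, actor-owned map from cache keys to epoch floors.
struct CacheFloors(BTreeMap<String, u64>);

impl CacheFloors {
    /// Record an invalidation fact: keep the highest floor seen per key.
    fn invalidate(&mut self, key: &str, floor: u64) {
        let entry = self.0.entry(key.to_string()).or_insert(0);
        if floor > *entry {
            *entry = floor;
        }
    }

    /// An entry is valid only when the current epoch exceeds its floor.
    fn is_valid(&self, key: &str, current_epoch: u64) -> bool {
        match self.0.get(key) {
            Some(floor) => current_epoch > *floor,
            None => true, // never invalidated
        }
    }
}
```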

3.1 Hold Garbage Collection

Hold GC uses the same epoch, invalidation, and storage-pressure vocabulary as journal GC. It does not use the same authority model. Journal GC prunes authoritative shared state after threshold-backed snapshot evidence, while Hold GC prunes opaque custody objects through local provider policy.

Held objects are scoped to the epoch in which they were deposited. Epoch rotation makes prior-epoch objects GC-eligible unless a retrieval capability explicitly spans the new epoch. Retrieval-capability expiration and storage pressure can also make a held object eligible for local eviction.

Eviction priority is epoch and then age. It must not vary by social distance, friendship, home membership, or introduction provenance. Uniform treatment prevents retention behavior from becoming a side channel under onion routing.
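The uniform eviction order can be sketched as a sort key. The struct and field names are illustrative; the key deliberately contains only epoch and age, with no social provenance, matching the side-channel argument above.

```rust
use std::cmp::Reverse;

/// Illustrative held-object record: only GC-relevant fields.
#[derive(Debug, PartialEq, Eq, Clone)]
struct HeldObject {
    deposit_epoch: u64,
    age_ticks: u64,
}

/// Eviction priority is epoch, then age: oldest epoch first,
/// and within an epoch, the oldest object first.
fn eviction_order(objects: &mut Vec<HeldObject>) {
    objects.sort_by_key(|o| (o.deposit_epoch, Reverse(o.age_ticks)));
}
```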

4. OTA Upgrades

OTA in Aura separates two concerns:

  • global and eventual release distribution
  • local or scope-bound staging, activation, cutover, and rollback

Aura does not model "the whole network is now in cutover" as a valid primitive. Release propagation is multi-directional and eventual. Hard cutover is meaningful only inside a scope that actually has agreement or a legitimate fence.

4.1 Release Identity and Provenance

#![allow(unused)]
fn main() {
pub struct AuraReleaseProvenance {
    pub source_repo_url: String,
    pub source_bundle_hash: Hash32,
    pub build_recipe_hash: Hash32,
    pub output_hash: Hash32,
    pub nix_flake_hash: Hash32,
    pub nix_flake_lock_hash: Hash32,
}

pub struct AuraReleaseManifest {
    pub series_id: AuraReleaseSeriesId,
    pub release_id: AuraReleaseId,
    pub version: SemanticVersion,
    pub provenance: AuraReleaseProvenance,
    pub artifacts: Vec<AuraArtifactDescriptor>,
    pub compatibility: AuraCompatibilityManifest,
    pub suggested_activation_time_unix_ms: Option<u64>,
}
}

AuraReleaseId is derived from the release series and the full provenance. source_repo_url participates in that derivation, so the declared upstream repository location is part of canonical release identity. Builder authorities may publish deterministic build certificates over the same provenance. TEE attestation is optional hardening, not the source of release identity.

4.2 Policy Surfaces

OTA policy is not one switch. Aura distinguishes:

  • discovery policy: what release authorities, builders, and contexts a device is willing to learn from
  • sharing policy: what manifests, artifacts, certificates, or recommendations it is willing to forward or pin
  • activation policy: what trust, compatibility, health, approval, and fence conditions must hold before local activation

Discovering a release does not imply forwarding it. Forwarding it does not imply activating it.

4.3 Activation Scopes and State

Activation is modeled per scope, not globally.

#![allow(unused)]
fn main() {
pub enum AuraActivationScope {
    DeviceLocal { device_id: DeviceId },
    AuthorityLocal { authority_id: AuthorityId },
    RelationalContext { context_id: ContextId },
    ManagedQuorum {
        context_id: ContextId,
        participants: BTreeSet<AuthorityId>,
    },
}

pub enum ReleaseResidency {
    LegacyOnly,
    Coexisting,
    TargetOnly,
}

pub enum TransitionState {
    Idle,
    AwaitingCutover,
    CuttingOver,
    RollingBack,
}
}

ReleaseResidency describes which release set may currently run in the scope. TransitionState describes whether the scope is stable, waiting on evidence, actively switching, or rolling back.

4.4 Cutover and Rollback

Scoped activation uses journal facts plus local policy evaluation. A scope may move toward cutover only when the relevant evidence is present:

  • manifest and certificate verification
  • compatibility classification
  • staged artifacts
  • local trust policy satisfaction
  • optional local-policy respect for suggested_activation_time_unix_ms
  • threshold approval, if that scope actually supports threshold approval
  • epoch fence, if that scope actually owns the relevant fence
  • health gate checks

Hard-fork behavior is explicit. After local cutover, incompatible new sessions are rejected. In-flight incompatible sessions must drain, abort, or delegate according to policy. If post-cutover validation fails, rollback is deterministic and recorded in UpgradeExecutionFact.

Managed quorum cutover requires explicit approval from the participant set bound into AuraActivationScope::ManagedQuorum. Staged revoked releases are canceled before cutover. Active revoked releases follow the local rollback preference. Automatic rollback queues the revert path immediately. Manual rollback leaves the scope failed until an operator approves rollback.
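A local activation gate over the evidence list above might be sketched as follows. The struct, its boolean flags, and the function are assumptions: each flag stands in for a full verification step (signature checks, compatibility classification, health probes), and threshold approval is required only when the scope actually supports it.

```rust
/// Hypothetical evidence summary for one activation scope.
struct CutoverEvidence {
    manifest_verified: bool,   // manifest and certificate verification
    compatible: bool,          // compatibility classification
    artifacts_staged: bool,    // staged artifacts present
    trust_policy_ok: bool,     // local trust policy satisfied
    threshold_approved: bool,  // meaningful only for quorum-capable scopes
    health_ok: bool,           // health gate checks
}

/// A scope may move toward cutover only when all relevant evidence holds.
fn may_cutover(e: &CutoverEvidence, scope_supports_threshold: bool) -> bool {
    e.manifest_verified
        && e.compatible
        && e.artifacts_staged
        && e.trust_policy_ok
        && (!scope_supports_threshold || e.threshold_approved)
        && e.health_ok
}
```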

4.5 Updater / Launcher Boundary

Aura does not rely on in-place self-replacement of the running runtime. Layer 6 owns an updater/launcher control plane that:

  • stages manifests, artifacts, and certificates
  • emits explicit activate/rollback commands
  • records scoped upgrade state
  • restores the previous release deterministically when rollback is required

5. Admin Replacement

Admin replacement uses a maintenance fact. The fact records the old admin, new admin, and activation epoch. Devices use this fact to ignore operations from retired administrators.

#![allow(unused)]
fn main() {
pub struct AdminReplacement {
    pub authority_id: AuthorityId,
    pub old_admin: AuthorityId,
    pub new_admin: AuthorityId,
    pub activation_epoch: Epoch,
}
}

This structure defines an admin replacement. Devices enforce this rule locally. The replacement fact is monotone and resolves disputes using journal evidence.

6. Epoch Handling

Maintenance logic uses identity epochs for consistency. A maintenance session uses a tuple containing the identity epoch and snapshot epoch. A session aborts if the identity epoch advances. Devices retry under the new epoch.

Snapshot completion sets the snapshot epoch equal to the identity epoch. Garbage collection rules use the snapshot epoch to prune data safely. Upgrade fences use the same epoch model to enforce activation.

#![allow(unused)]
fn main() {
// Defined in aura-maintenance:

pub struct MaintenanceEpoch {
    pub identity_epoch: Epoch,
    pub snapshot_epoch: Epoch,
}
}

This structure captures epoch state for maintenance workflows. Devices use this structure for guard checks.
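The abort rule can be sketched as a guard predicate. The plain u64 epochs and function name are illustrative; the rule itself comes from the text: a session captured at one identity epoch is invalid as soon as that epoch advances, and the device retries under the new epoch.

```rust
/// Illustrative epoch tuple captured at maintenance-session start.
struct MaintenanceEpoch {
    identity_epoch: u64,
    snapshot_epoch: u64,
}

/// A session stays valid only while the identity epoch has not advanced.
fn session_still_valid(at_start: &MaintenanceEpoch, current_identity_epoch: u64) -> bool {
    current_identity_epoch == at_start.identity_epoch
}
```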

7. Backup and Restore

Backup uses the latest snapshot and recent journal facts. Devices export an encrypted archive containing the snapshot blob and journal tail. Restore verifies the snapshot signature, hydrates state, and replays the journal tail.

Backups use existing storage and verification effects. No separate protocol exists. Backup correctness follows from snapshot correctness.

8. Automatic Synchronization

Automatic synchronization implements periodic journal replication between devices. The synchronization service coordinates peer discovery, session management, and fact exchange. All synchronization uses the journal primitives described in Journal.

8.1 Peer Discovery and Selection

Devices discover sync peers through the rendezvous system described in Rendezvous Architecture. The peer manager consumes runtime-owned rendezvous descriptor snapshots. The peer manager maintains metadata for each discovered peer. This metadata includes connection state, trust level, sync success rate, and active session count.

#![allow(unused)]
fn main() {
pub struct PeerMetadata {
    pub device_id: DeviceId,
    pub status: PeerStatus,
    pub discovered_at: PhysicalTime,
    pub last_status_change: PhysicalTime,
    pub successful_syncs: u64,
    pub failed_syncs: u64,
    pub average_latency_ms: u64,
    pub last_seen: PhysicalTime,
    pub last_successful_sync: PhysicalTime,
    pub trust_level: u8,
    pub has_sync_capability: bool,
    pub active_sessions: usize,
}
}

This structure tracks peer state for selection decisions. All timestamp fields use PhysicalTime from the unified time system. The peer manager calculates a score for each peer using weighted factors. Trust level contributes 50 percent. Success rate contributes 30 percent. Load factor contributes 20 percent. Higher scores indicate better candidates for synchronization.

Devices select peers when their score exceeds a threshold. Devices limit concurrent sessions per peer. This prevents resource exhaustion. Devices skip peers that have reached their session limit.
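The weighted score can be sketched as follows. The 50/30/20 weights come from the text; the normalization choices (trust out of 255, a neutral 0.5 success rate for unseen peers, load as remaining session headroom) and the struct are assumptions.

```rust
/// Trimmed peer view: only the fields the score consumes.
struct PeerStats {
    trust_level: u8,
    successful_syncs: u64,
    failed_syncs: u64,
    active_sessions: usize,
    max_sessions: usize,
}

/// Weighted peer score in [0, 1]: trust 50%, success rate 30%, load 20%.
fn peer_score(p: &PeerStats) -> f64 {
    let trust = p.trust_level as f64 / 255.0;
    let total = p.successful_syncs + p.failed_syncs;
    let success = if total == 0 {
        0.5 // no history yet: assume a neutral success rate
    } else {
        p.successful_syncs as f64 / total as f64
    };
    let load = 1.0 - (p.active_sessions as f64 / p.max_sessions as f64).min(1.0);
    0.5 * trust + 0.3 * success + 0.2 * load
}
```

Higher scores mark better sync candidates; a peer at its session limit loses the full load component regardless of trust.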

8.2 Session Management

The session manager tracks active synchronization sessions. Each session has a unique identifier and references a peer device. Sessions enforce rate limits and concurrency bounds.

#![allow(unused)]
fn main() {
pub struct SessionManager<T> {
    sessions: HashMap<SessionId, SessionState<T>>,
    config: SessionConfig,
    metrics: Option<MetricsCollector>,
    last_cleanup: PhysicalTime,
    session_counter: u64,
}
}

This structure maintains session state. Sessions are indexed by SessionId rather than DeviceId. Configuration is provided via SessionConfig. All timestamp fields use PhysicalTime from the unified time system. Devices close sessions after fact exchange completes. Devices abort sessions when the identity epoch advances. Session cleanup releases resources for new synchronization rounds.

8.3 Rate Limiting and Metrics

The synchronization service enforces rate limits per peer and globally. Rate limiting prevents network saturation. Metrics track sync latency, throughput, and error rates.

Devices record metrics for each sync operation. These metrics include fact count, byte count, and duration. Devices aggregate metrics to monitor service health. Degraded peers receive lower priority in future rounds.

8.4 Integration with Journal Effects

Automatic synchronization uses JournalEffects to read and write facts. The service queries local journals for recent facts. The service sends these facts to peers. Peers merge incoming facts using join-semilattice rules.

All fact validation rules apply during automatic sync. Devices reject invalid facts. Devices do not roll back valid facts already merged. This maintains journal monotonicity.

9. Migration Infrastructure

The migration runtime in aura-agent/src/runtime/migration.rs orchestrates data migrations between protocol versions.

9.1 Migration Trait

#![allow(unused)]
fn main() {
#[async_trait]
pub trait Migration: Send + Sync {
    fn source_version(&self) -> SemanticVersion;
    fn target_version(&self) -> SemanticVersion;
    fn name(&self) -> &str;
    async fn validate(&self, ctx: &MigrationContext) -> Result<(), MigrationError>;
    async fn execute(&self, ctx: &MigrationContext) -> Result<(), MigrationError>;
    async fn rollback(&self, ctx: &MigrationContext) -> Result<bool, MigrationError> {
        Ok(false) // Default: rollback not supported
    }
}
}

Each migration specifies source and target versions, a name for logging, validate/execute methods, and an optional rollback method. The default rollback implementation returns Ok(false) to indicate rollback is not supported.

9.2 Coordinator API

Method                         Purpose
needs_migration(from)          Check if upgrade is needed
get_migration_path(from, to)   Find ordered migration sequence
migrate(from, to)              Execute migrations with validation
validate_migration(from, to)   Dry-run validation only

9.3 Migration Guarantees

Migrations are ordered by target version. Each migration runs at most once (idempotent via version tracking). Failed migrations leave the system in a consistent state. Progress is recorded in the journal for auditability.
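Path construction and ordering might be sketched as below. This is an assumption-laden simplification: versions are bare (major, minor, patch) tuples, the step struct is illustrative, and the real coordinator additionally validates that consecutive steps chain.

```rust
/// Illustrative migration step with plain version tuples.
#[derive(Debug, Clone, PartialEq)]
struct MigrationStep {
    source: (u32, u32, u32),
    target: (u32, u32, u32),
}

/// Select steps lying within [from, to] and order them by target version.
fn migration_path(
    mut available: Vec<MigrationStep>,
    from: (u32, u32, u32),
    to: (u32, u32, u32),
) -> Vec<MigrationStep> {
    available.retain(|m| m.source >= from && m.target <= to);
    available.sort_by_key(|m| m.target); // "ordered by target version"
    available
}
```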

10. Evolution

Maintenance evolves in phases. Current OTA work focuses on release identity/provenance, scoped activation, deterministic rollback, and Aura-native distribution. Future phases may add richer replicated cache CRDTs, stronger builder attestation, staged rollout tooling, and automatic snapshot triggers.

Future phases build on the same journal schema. Maintenance semantics remain compatible with older releases.

11. Summary

Distributed maintenance uses journal facts to coordinate snapshots, cache invalidation, release distribution, scoped upgrades, and admin replacement. All operations use join-semilattice semantics. All reductions are deterministic. Devices prune storage only after observing snapshot completion. OTA release propagation is eventual, while activation is always local or scope-bound. The system remains consistent across offline and online operation.

User Interface

This document describes the aura-terminal user interface layer. It covers the non-interactive CLI commands and the iocraft-based TUI. It also describes how both frontends share aura-app through AppCore and the reactive signal system.

Demo mode is compiled only with --features development.

Goals and constraints

The CLI and the TUI are thin frontends over AppCore. They should not become alternate application runtimes. They should also avoid owning long-lived domain state.

Both frontends must respect the guard chain, journaling, and effect system boundaries described in Aura System Architecture. The CLI optimizes for scriptability and stable output. The TUI optimizes for deterministic navigation with reactive domain data.

Concepts

  • Command. A user-invoked operation such as aura status or aura chat send.
  • Handler. A CLI implementation function that uses HandlerContext and returns CliOutput.
  • Screen. A routed view that renders domain data and local UI state.
  • Modal. A blocking overlay that captures focus. Modals are queued and only one can be visible at a time.
  • Toast. A transient notification. Toasts are queued and only one can be visible at a time.
  • Signal. A reactive stream of domain values from aura-app.
  • Intent. A journaled application command dispatched through AppCore.dispatch (legacy in the TUI). Most TUI actions are runtime-backed workflows via IoContext.
  • Role/access labels. UI copy should use canonical terms from Theoretical Model: Member, Participant, Moderator, and access levels Full/Partial/Limited.

Running

Aura provides CLI, TUI, and demo execution modes. See Getting Started Guide for running instructions and configuration options.

Architecture overview

The CLI and the TUI share the same backend boundary. Both construct an AppCore value and use it as the primary interface to domain workflows and views. Both also rely on aura-agent for effect handlers and runtime services.

The user interface split is:

  • crates/aura-app provides portable domain logic, reactive state, and signals through AppCore.
  • crates/aura-terminal/src/cli/ defines bpaf parsers and CLI argument types.
  • crates/aura-terminal/src/handlers/ implements CLI commands and shared terminal glue.
  • crates/aura-terminal/src/tui/ implements iocraft UI code and deterministic navigation.

Relationship to aura-app

AppCore is the shared boundary for both frontends. It owns reactive state and provides stable APIs for dispatching intents and reading derived views. Frontends should use the aura_app::ui facade (especially aura_app::ui::signals and aura_app::ui::workflows) as the public API surface.

The frontends use AppCore in two ways:

  • Trigger work by calling AppCore.dispatch(intent) or by calling effect-backed handlers that ultimately produce journaled facts.
  • Read state by reading views or by subscribing to signals for push-based updates.

This split keeps domain semantics centralized. It also makes it possible to reuse the same workflows in multiple user interfaces.

Time system in UI

UI code must never read OS clocks (for example, SystemTime::now() or Instant::now()). All wall-clock needs must flow through algebraic effects (PhysicalTimeEffects via the handler/effect system). Demo mode and relative-time UI (e.g., “Synced Xm ago”) must be driven by runtime time so simulations remain deterministic.

Aura time domains are: PhysicalClock (wall time), LogicalClock (causality), OrderClock (privacy-preserving ordering), and Range (validity windows). When attested time is required, use ProvenancedTime/TimeComparison rather than embedding OS timestamps in UI state.

Ordering across domains must be explicit: use TimeStamp::compare(policy) (never compare raw ms) when you need deterministic ordering across mixed time domains.
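The injected-clock rule can be sketched with a hypothetical trait standing in for the PhysicalTimeEffects boundary. UI code formats a relative label from the injected time source, never from the OS clock, so simulated runs stay deterministic:

```rust
/// Hypothetical stand-in for the PhysicalTimeEffects handler boundary.
trait PhysicalTime {
    fn now_ms(&self) -> u64;
}

/// Deterministic clock for simulation and tests.
struct FixedClock(u64);
impl PhysicalTime for FixedClock {
    fn now_ms(&self) -> u64 {
        self.0
    }
}

/// Format a relative "Synced Xm ago" label from injected time rather
/// than SystemTime::now(), keeping simulated runs reproducible.
fn synced_label(clock: &dyn PhysicalTime, synced_at_ms: u64) -> String {
    let minutes = clock.now_ms().saturating_sub(synced_at_ms) / 60_000;
    format!("Synced {minutes}m ago")
}

fn main() {
    let clock = FixedClock(10 * 60_000);
    assert_eq!(synced_label(&clock, 3 * 60_000), "Synced 7m ago");
}
```

Swapping `FixedClock` for the runtime handler changes the output without changing the formatting logic, which is the point of the effect boundary.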

Shared infrastructure in aura-terminal

CLI commands are parsed in crates/aura-terminal/src/cli/commands.rs and routed through crates/aura-terminal/src/main.rs. Implementations live under crates/aura-terminal/src/handlers/ and use HandlerContext from crates/aura-terminal/src/handlers/handler_context.rs. Handler functions typically return CliOutput for testable rendering.

The TUI launcher lives in crates/aura-terminal/src/handlers/tui.rs. It sets up tracing and constructs the IoContext and callback registry. The fullscreen stdio policy is defined in crates/aura-terminal/src/handlers/tui_stdio.rs.

CLI execution model

The CLI is request/response: each command parses arguments, runs one handler, and exits. Long-running commands such as daemon modes should still use the same effect boundaries.

flowchart TD
  A[CLI args] --> P[bpaf parser]
  P --> H[CliHandler]
  H --> HC[HandlerContext]
  HC --> E[Effects and services]
  E --> AC[AppCore state]
  AC --> O[CliOutput render]

This diagram shows the main CLI path from parsing to rendering. Some handlers also read derived state from AppCore after effects complete.

Reactive data model

The reactive system follows a fact-based architecture where typed facts are the source of truth for UI state.

Signals as single source of truth

ReactiveEffects signals (CHAT_SIGNAL, CONTACTS_SIGNAL, RECOVERY_SIGNAL, etc.) are the canonical source for all UI state. They are updated by the ReactiveScheduler processing typed facts from the journal.

flowchart LR
  F[Typed Facts] -->|commit| J[Journal]
  J -->|publish| S[ReactiveScheduler]
  S -->|emit| Sig[Signals]
  Sig --> TUI[TUI Screens]
  Sig --> CLI[CLI Commands]

The data flow is:

  1. Typed facts (aura_journal::fact::Fact) are committed to the journal
  2. ReactiveScheduler processes committed facts via registered ReactiveView implementations
  3. SignalViews (e.g., ContactsSignalView, ChatSignalView) update their internal state and emit snapshots to signals
  4. UI components subscribe to signals and render the current state

This architecture ensures a single source of truth and eliminates dual-write bugs. Code that needs to update UI state must commit facts (production) or emit directly to signals (demo/test).
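The fact-to-signal flow can be sketched with simplified local types; the real `Fact` and signal-view types live in aura_journal and aura-app:

```rust
/// Simplified stand-ins for journal facts and a signal view's state.
#[derive(Clone)]
enum Fact {
    ContactAdded(String),
    ContactRemoved(String),
}

#[derive(Default)]
struct ContactsState {
    contacts: Vec<String>,
}

/// A SignalView-style reduction: committed facts fold into view state,
/// and the resulting snapshot is what would be emitted on the signal.
fn reduce(mut state: ContactsState, fact: &Fact) -> ContactsState {
    match fact {
        Fact::ContactAdded(name) => state.contacts.push(name.clone()),
        Fact::ContactRemoved(name) => state.contacts.retain(|c| c != name),
    }
    state
}

fn snapshot(facts: &[Fact]) -> ContactsState {
    facts.iter().fold(ContactsState::default(), reduce)
}

fn main() {
    let facts = vec![
        Fact::ContactAdded("alice".into()),
        Fact::ContactAdded("carol".into()),
        Fact::ContactRemoved("alice".into()),
    ];
    assert_eq!(snapshot(&facts).contacts, vec!["carol".to_string()]);
}
```

Because the view is a pure fold over committed facts, two devices that converge on the same fact set render the same UI state.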

ViewState is internal

AppCore contains an internal ViewState used for legacy compatibility and non-signal use cases. ViewState changes do not propagate to signals. The signal forwarding infrastructure was removed in favor of scheduler-driven updates.

For compile-time safety, there are no public methods on AppCore to mutate ViewState for UI-affecting state. Code that needs to update what the UI displays must:

  1. Production: Commit facts via RuntimeBridge.commit_relational_facts(). Facts flow through the scheduler to signals.
  2. Demo/Test: Emit directly to signals via ReactiveEffects::emit(). This is explicit and type-safe.

This design prevents the "dual-write" bug class where code updates ViewState expecting UI changes, but signals remain unchanged.

The CLI usually reads state at a point in time. It can still use signals for watch-like commands or daemon commands. When a command needs continuous updates, it should subscribe to the relevant signals and render incremental output.

Reading and subscribing to signals

Signals are accessed through AppCore's ReactiveEffects implementation. Read the current value with read() and subscribe for updates with subscribe().

#![allow(unused)]
fn main() {
// Read current state from signal
let contacts = {
    let core = app_core.read().await;
    core.read(&*CONTACTS_SIGNAL).await.unwrap_or_default()
};

// Subscribe for ongoing updates
let mut stream = {
    let core = app_core.read().await;
    core.subscribe(&*CONTACTS_SIGNAL)
};

while let Ok(state) = stream.recv().await {
    render_contacts(&state);
}
}

For initial render, read the current signal value first to avoid a blank frame. Then subscribe for updates. This pattern is used heavily by TUI screens.

Subscriptions and ownership

Long-lived subscriptions that drive global TUI elements live in crates/aura-terminal/src/tui/screens/app/subscriptions.rs. Screen-local subscriptions should live with the screen module.

Subscriptions should be owned by the component that renders the data. A subscription should not mutate TuiState unless it is updating navigation, focus, or overlay state.

Connection status (peer count)

The footer "connected peers" count is a UI convenience signal. It must represent how many of your contacts are online, not a seeded or configured peer list.

  • Source: CONNECTION_STATUS_SIGNAL (emitted by aura_app::ui::workflows::system::refresh_account()).
  • Contact set: read from CONTACTS_SIGNAL (signal truth), not from ViewState snapshots.
  • Online check: RuntimeBridge::is_peer_online(contact_id) (best-effort). Demo uses a shared in-memory transport. Production can use real transport channel health.

Deterministic UI model

The TUI separates domain state from UI state. Domain state is push-based and comes from aura-app signals. UI state is deterministic and lives in TuiState.

Navigation, focus, input buffers, modal queues, and toast queues are updated by a pure transition function. The entry point is crates/aura-terminal/src/tui/state/mod.rs. The runtime executes TuiCommand values in crates/aura-terminal/src/tui/runtime.rs.
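The shape of the pure transition function can be sketched with illustrative names; the real `TuiState` and `TuiCommand` definitions differ:

```rust
/// Simplified sketch of the deterministic UI model: a pure transition
/// function maps (state, event) to (state, commands). Side effects are
/// returned as data for the runtime to execute, never performed here.
#[derive(Clone, Debug, PartialEq)]
enum Screen {
    Contacts,
    Chat,
}

#[derive(Clone, Debug, PartialEq)]
struct UiState {
    screen: Screen,
}

enum Event {
    NavigateTo(Screen),
    SendMessage(String),
}

#[derive(Debug, PartialEq)]
enum Command {
    Dispatch(String),
}

fn transition(state: UiState, event: Event) -> (UiState, Vec<Command>) {
    match event {
        Event::NavigateTo(screen) => (UiState { screen }, vec![]),
        Event::SendMessage(text) => (state, vec![Command::Dispatch(text)]),
    }
}

fn main() {
    let start = UiState { screen: Screen::Contacts };
    let (next, cmds) = transition(start, Event::NavigateTo(Screen::Chat));
    assert_eq!(next.screen, Screen::Chat);
    assert!(cmds.is_empty());
}
```

Because `transition` is pure, navigation and overlay behavior can be unit-tested without a terminal.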

Dispatch bridge

The TUI dispatch path uses IoContext in crates/aura-terminal/src/tui/context/io_context.rs.

Today the TUI uses a runtime-backed dispatch model: IoContext routes EffectCommand to DispatchHelper and OperationalHandler. Most domain-affecting behavior occurs through aura-app workflows that call RuntimeBridge (which commits facts and drives signals).

flowchart TD
  UI[User input] --> SM[TuiState transition]
  SM -->|TuiCommand::Dispatch| IO[IoContext dispatch]
  IO --> DH[DispatchHelper]
  DH --> OP[OperationalHandler]
  OP --> W[Workflows / RuntimeBridge]
  W --> J[Commit facts]
  J --> S[Emit signals]
  S --> UI2[Screens subscribe]

This diagram shows the primary TUI dispatch path. Operational commands may also emit operational signals such as SYNC_STATUS_SIGNAL, CONNECTION_STATUS_SIGNAL, and ERROR_SIGNAL.

Screens, modals, and callbacks

The root iocraft component is in crates/aura-terminal/src/tui/screens/app/shell.rs. Global modals live in crates/aura-terminal/src/tui/screens/app/modal_overlays.rs. Long-lived signal subscriptions for the shell live in crates/aura-terminal/src/tui/screens/app/subscriptions.rs.

Modals and toasts are routed through explicit queues in TuiState. The modal enum is QueuedModal in crates/aura-terminal/src/tui/state/modal_queue.rs. Avoid per-modal visible flags.
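The queued-modal rule can be sketched with a minimal queue; only the front entry is ever visible, so per-modal boolean flags are unnecessary. Names here are illustrative, not the real QueuedModal variants:

```rust
use std::collections::VecDeque;

/// Sketch of the modal queue: modals wait in FIFO order and only the
/// front entry is visible, replacing per-modal `visible` flags.
#[derive(Debug, PartialEq)]
enum Modal {
    InviteCode,
    ConfirmDelete,
}

#[derive(Default)]
struct ModalQueue {
    queue: VecDeque<Modal>,
}

impl ModalQueue {
    fn push(&mut self, modal: Modal) {
        self.queue.push_back(modal);
    }

    /// The single visible modal, if any.
    fn visible(&self) -> Option<&Modal> {
        self.queue.front()
    }

    /// Dismissing reveals the next queued modal automatically.
    fn dismiss(&mut self) {
        self.queue.pop_front();
    }
}

fn main() {
    let mut modals = ModalQueue::default();
    modals.push(Modal::InviteCode);
    modals.push(Modal::ConfirmDelete);
    assert_eq!(modals.visible(), Some(&Modal::InviteCode));
    modals.dismiss();
    assert_eq!(modals.visible(), Some(&Modal::ConfirmDelete));
}
```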

Invitation codes are managed from the Contacts workflow (modals), not via a dedicated routed Invitations screen.

Callbacks are registered in crates/aura-terminal/src/tui/callbacks/. Asynchronous results are surfaced through UiUpdate in crates/aura-terminal/src/tui/updates.rs. Prefer subscribing to domain signals when a signal already exists.

Fullscreen I/O policy

Writing to stderr while iocraft is in fullscreen can corrupt the terminal buffer. The TUI redirects stderr away from the terminal while fullscreen is active. Tracing is written to a log file.

The policy is enforced with type-level stdio tokens in crates/aura-terminal/src/handlers/tui_stdio.rs. The token used before fullscreen is consumed while iocraft is running. This prevents accidental println! and eprintln! calls in the fullscreen scope.

This policy aligns with Privacy and Information Flow and Effect System.

Errors and user feedback

Domain and dispatch failures are emitted through aura_app::ui::signals::ERROR_SIGNAL. The app shell subscribes to this signal and renders errors as queued toasts. When the account setup modal is active, errors are routed into the modal instead of creating a toast.

UI-only failures use UiUpdate::OperationFailed. This is used primarily for account file operations that occur before AppCore dispatch.

CLI commands should return errors through TerminalResult and render them through CliOutput. Avoid printing error text directly from deep helper functions. Prefer returning structured error types.

Invariants and common pitfalls

The state machine owns navigation, focus, and overlay visibility. Screen components should render TuiState and should not mutate it directly. They should send events and let the state machine decide transitions.

The domain owns the reactive state. Avoid caching domain data in TuiState. Prefer subscribing to aura-app signals and deriving view props inside the screen component.

Single source of truth invariants

  • Signals are the source of truth for UI state, not ViewState.
  • Facts drive signals in production. Commit facts via RuntimeBridge.
  • Direct emission is only for demo/test scenarios via ReactiveEffects::emit().
  • No ViewState mutation for UI state. AppCore has no public methods to mutate ViewState for UI-affecting state.

Common pitfalls

  • Calling println! and eprintln! while fullscreen is active
  • Storing domain state in TuiState instead of subscribing to signals
  • Adding per-modal visible flags instead of using QueuedModal and the modal queue
  • Using UiUpdate as a general event bus instead of subscribing to signals
  • Expecting ViewState changes to appear in the UI. ViewState does not propagate to signals.
  • Emitting directly to domain signals in production code. Use fact commits instead.

Testing strategy

The CLI should be tested with handler unit tests and structured output assertions. Prefer pure formatting helpers and CliOutput snapshots over stdout capture.
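The recommended style can be sketched with a hypothetical formatting helper: a pure function returns a string that tests assert on directly, instead of capturing stdout. The real CliOutput type is richer; this shows only the principle:

```rust
/// Hypothetical pure formatting helper: rendering is a function of its
/// inputs, so tests assert on the returned string, not captured stdout.
fn render_status(account: &str, peers: usize) -> String {
    format!("account: {account}\nconnected peers: {peers}")
}

fn main() {
    let out = render_status("bob", 2);
    assert!(out.contains("connected peers: 2"));
    assert_eq!(out.lines().count(), 2);
}
```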

The deterministic boundary for the TUI is the state machine. Prefer unit tests that call transition() directly for navigation and modal behavior. For headless terminal event tests, use TuiRuntime<T> from crates/aura-terminal/src/tui/runtime.rs with a mock TerminalEffects handler.

Code map

crates/aura-terminal/src/
  main.rs
  cli/
    commands.rs
  handlers/
    mod.rs
    handler_context.rs
    tui.rs
    tui_stdio.rs
  tui/
    context/
    screens/
      app/
        shell.rs
        modal_overlays.rs
        subscriptions.rs
    state/
    runtime.rs
    hooks.rs
    effects/
    components/

This map shows the primary module boundaries for the CLI and the TUI. CLI logic should live under handlers/ and cli/. TUI view logic should live under tui/.

Demo mode

Demo mode is under crates/aura-terminal/src/demo/. It compiles only with --features development. Production builds should not require demo-only types or props.

Demo architecture

Demo mode uses the same fact-based pipeline as production where possible:

  • Guardian bindings: Committed as RelationalFact::GuardianBinding facts through RuntimeBridge.commit_relational_facts(). These flow through the scheduler to update CONTACTS_SIGNAL.
  • Chat messages: Emitted directly to CHAT_SIGNAL via ReactiveEffects::emit(). Sealed message facts would require cryptographic infrastructure not available in demo.
  • Recovery approvals: Emitted directly to RECOVERY_SIGNAL. Production would use consensus-based RecoveryGrant facts.

The DemoSignalCoordinator in crates/aura-terminal/src/demo/signal_coordinator.rs handles bidirectional event routing between the TUI and simulated agents (Alice and Carol).

Demo shortcuts

Demo mode supports convenience shortcuts. Invite code entry supports Ctrl+a and Ctrl+l when demo codes are present.

Recovery scenario walkthrough

The CLI can run a complete guardian recovery demo through the simulator. This scenario shows Bob onboarding with guardians, losing his device, and recovering with help from Alice and Carol.

Run the recovery demo from the repository root:

cargo run -p aura-terminal -- scenarios run --directory scenarios/integration --pattern cli_recovery_demo

This command uses the CLI scenario runner plus the simulator to execute the guardian setup and recovery choreography. Logs are written to the scenario runner's output log bundle.

The scenario executes in eight phases:

  1. alice_carol_setup creates Alice and Carol authorities using Ed25519 single-signer mode for initial key generation.
  2. bob_onboarding creates Bob authority (Ed25519 initially), sends guardian requests, and configures threshold 2 (switching to FROST).
  3. group_chat_setup has Alice create a group chat and invite Bob and Carol. Context keys are derived for the chat.
  4. group_messaging sends normal chat messages among all three participants. History is persisted.
  5. bob_account_loss simulates total device loss for Bob. He cannot access his authority.
  6. recovery_initiation has Bob initiate recovery. Alice and Carol validate and approve the request. Guardian approval threshold (2) is met.
  7. account_restoration runs threshold key recovery. Bob's chat history is synchronized back.
  8. post_recovery_messaging has Bob send messages again and see full history. The group remains functional.

New accounts use standard Ed25519 signatures because FROST requires at least 2 signers. Once Bob adds guardians and configures threshold 2, subsequent signing operations use the full FROST threshold protocol. See Cryptographic Architecture for details on signing modes.

Testing

Run tests inside the development shell.

Standard

just test-crate aura-terminal

This runs the aura-terminal test suite in the standard project workflow.

Offline

For offline testing, use the workspace offline mode.

CARGO_NET_OFFLINE=true cargo test -p aura-terminal --tests --offline

This runs the full aura-terminal test suite without network access.

Message status indicators

Chat messages display delivery status and finalization indicators in the message bubble header. Status is derived from the unified consistency metadata types in aura_core::domain.

Status indicator legend

Symbol  Meaning              Color      Animation   Source
────────────────────────────────────────────────────────────────
  ◐     Syncing/Sending      Blue       Pulsing     Propagation::Local
  ◌     Pending              Gray       None        Agreement::Provisional
  ✓     Sent                 Green      None        Propagation::Complete
  ✓✓    Delivered            Green      None        Acknowledgment.acked_by includes recipient
  ✓✓    Read                 Blue       None        ChatFact::MessageRead
  ⚠     Unconfirmed          Yellow     None        Agreement::SoftSafe (pending A3)
  ✗     Failed               Red        None        Propagation::Failed
  ◆     Finalized            Muted      None        Agreement::Finalized

Delivery status lifecycle

Messages progress through the following states:

Status     Icon        Meaning
──────────────────────────────────────────────────────────────
Sending    ◐           Message being transmitted (Propagation::Local)
Sent       ✓           Synced to all known peers (Propagation::Complete)
Delivered  ✓✓          Recipient acked via transport protocol
Read       ✓✓ (blue)   Recipient viewed message (ChatFact::MessageRead)
Failed     ✗           Sync failed (Propagation::Failed with retry)

Delivery status is derived from OptimisticStatus consistency metadata:

#![allow(unused)]
fn main() {
use aura_core::domain::{Propagation, Acknowledgment, OptimisticStatus};
use aura_core::types::identifiers::AuthorityId;

fn delivery_icon(status: &OptimisticStatus, expected_peers: &[AuthorityId]) -> &'static str {
    match &status.propagation {
        Propagation::Local => "◐",
        Propagation::Syncing { .. } => "◐",
        Propagation::Failed { .. } => "✗",
        Propagation::Complete => {
            // Check if all expected peers have acked
            let delivered = status.acknowledgment
                .as_ref()
                .map(|ack| expected_peers.iter().all(|p| ack.contains(p)))
                .unwrap_or(false);
            if delivered { "✓✓" } else { "✓" }
        }
    }
}
}

The transport layer implements ack tracking via ack_tracked on facts. When ack_tracked = true, recipients send FactAck responses that are recorded in the journal's ack table. Read receipts are semantic (user viewed) and use ChatFact::MessageRead.

Agreement and finalization

Agreement level affects display:

Agreement          Display    Meaning
─────────────────────────────────────────────────────────
Provisional (A1)   Normal     Usable but may change
SoftSafe (A2)      ⚠ badge    Coordinator-safe with convergence cert
Finalized (A3)     ◆ badge    Consensus-finalized, durable

The finalization indicator (◆) appears when a message achieves A3 consensus (2f+1 witnesses). This indicates the message is durably committed and cannot be rolled back.

Implementation notes

Status indicators are rendered in MessageBubble (crates/aura-terminal/src/tui/components/message_bubble.rs). The consistency metadata flows from ChatState through the CHAT_SIGNAL to the TUI. The ChatState includes OptimisticStatus for each message, which contains:

  • agreement: Current A1/A2/A3 level
  • propagation: Sync status (Local, Syncing, Complete, Failed)
  • acknowledgment: Which peers have acked (for delivery tracking)

See Operation Categories for the full consistency metadata type definitions and AMP Protocol for the acknowledgment flow.


Test Infrastructure Reference

This document describes the architecture of aura-testkit, the test infrastructure crate that provides fixtures, mock handlers, and verification utilities for testing Aura protocols.

Overview

The aura-testkit crate occupies Layer 8 in the Aura architecture. It provides reusable test infrastructure without containing production code. All test utilities follow effect system guidelines to ensure deterministic execution.

The crate serves three purposes. It provides stateful effect handlers for controllable test environments. It offers fixture builders for consistent test setup. It includes verification utilities for property testing and differential testing.

Stateful Effect Handlers

Stateful effect handlers maintain internal state across calls. They enable deterministic testing by controlling time, randomness, and storage. These handlers implement the same traits as production handlers but store state for inspection and manipulation.

Handler Categories

The stateful_effects module provides handlers for each effect category.

Handler               Effect Trait          Purpose
────────────────────────────────────────────────────────────────
SimulatedTimeHandler  PhysicalTimeEffects   Controllable simulated time
MockRandomHandler     RandomCoreEffects     Seeded deterministic randomness
MemoryStorageHandler  StorageEffects        In-memory storage with inspection
MockJournalHandler    JournalEffects        Journal with fact tracking
MockCryptoHandler     CryptoCoreEffects     Crypto with key inspection
MockConsoleHandler    ConsoleEffects        Captured console output

Time Handler

The SimulatedTimeHandler provides controllable time for tests.

#![allow(unused)]
fn main() {
use aura_testkit::stateful_effects::SimulatedTimeHandler;
use aura_core::effects::PhysicalTimeEffects;

let time = SimulatedTimeHandler::new();
let now = time.physical_time().await?;
time.advance_time(5000);
let later = time.physical_time().await?;
}

This handler starts at the current system time by default. Use SimulatedTimeHandler::new_at_epoch() for tests starting at Unix epoch, or SimulatedTimeHandler::new_with_time(start_ms) for a specific start time. Tests can verify time-dependent behavior without wall-clock delays.

Random Handler

The MockRandomHandler provides seeded randomness for reproducible tests.

#![allow(unused)]
fn main() {
use aura_testkit::stateful_effects::MockRandomHandler;

let random = MockRandomHandler::with_seed(42);
let bytes = random.random_bytes(32).await;
}

Given the same seed, this handler produces identical sequences across runs. This enables deterministic property testing and failure reproduction.
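The principle can be illustrated with a tiny xorshift generator; this is not the MockRandomHandler implementation, only a demonstration that a seed fully determines the sequence:

```rust
/// Minimal seeded PRNG (xorshift64) illustrating seed determinism.
/// Not cryptographically secure and not the real handler's algorithm.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

fn main() {
    let mut a = XorShift64(42);
    let mut b = XorShift64(42);
    // Identical seeds reproduce identical sequences across runs.
    for _ in 0..8 {
        assert_eq!(a.next_u64(), b.next_u64());
    }
}
```

A failing property test therefore only needs to log its seed to be replayed exactly.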

Fixture System

The fixture system provides consistent test environment setup. Fixtures encapsulate common configuration patterns and reduce boilerplate.

TestFixture

The TestFixture type provides a complete test environment.

#![allow(unused)]
fn main() {
use aura_testkit::infrastructure::harness::{TestFixture, TestConfig};

let fixture = TestFixture::new().await?;
let device_id = fixture.device_id();
let context = fixture.context();
}

A fixture creates deterministic identifiers, initializes effect handlers, and provides access to test context. The default configuration suits most unit tests.

TestConfig

Custom configurations enable specialized test scenarios.

#![allow(unused)]
fn main() {
use std::time::Duration;
use aura_testkit::infrastructure::harness::{TestFixture, TestConfig};

let config = TestConfig {
    name: "threshold_test".to_string(),
    deterministic_time: true,
    capture_effects: true,
    timeout: Some(Duration::from_secs(60)),
};
let fixture = TestFixture::with_config(config).await?;
}

The deterministic_time flag enables the SimulatedTimeHandler. The capture_effects flag records effect calls for later inspection.

Builder Utilities

Builder functions create test data with deterministic inputs. They live in the builders module.

Account Builders

#![allow(unused)]
fn main() {
use aura_testkit::builders::test_account_with_seed;

let account = test_account_with_seed(42).await;
}

This creates an account with deterministic keys derived from the seed. Multiple calls with the same seed produce identical accounts.

Key Builders

#![allow(unused)]
fn main() {
use aura_testkit::builders::test_key_pair;

let (signing_key, verifying_key) = test_key_pair(1337);
}

Key pairs derive from the provided seed. This enables testing signature verification with known keys.

Identifier Generation

Tests must use deterministic identifiers to ensure reproducibility.

#![allow(unused)]
fn main() {
use aura_core::types::identifiers::AuthorityId;

let auth1 = AuthorityId::from_entropy([1u8; 32]);
let auth2 = AuthorityId::from_entropy([2u8; 32]);
}

Never use Uuid::new_v4() or similar entropy-consuming methods in tests. Incrementing byte patterns create distinct but reproducible identifiers.

Verification Utilities

The verification module provides utilities for property testing and differential testing.

Proptest Strategies

The strategies module defines proptest strategies for Aura types.

#![allow(unused)]
fn main() {
use aura_testkit::verification::strategies::{arb_device_id, arb_account_id, arb_key_pair};
use proptest::prelude::*;

proptest! {
    #[test]
    fn device_id_deterministic(id in arb_device_id()) {
        assert_ne!(id.to_string(), "");
    }

    #[test]
    fn key_pair_valid((sk, vk) in arb_key_pair()) {
        assert_eq!(sk.verifying_key(), vk);
    }
}
}

Available strategies include arb_device_id, arb_account_id, arb_session_id, and arb_key_pair. These generate valid, deterministic instances for property testing.

Lean Oracle

The lean_oracle module provides integration with Lean theorem proofs.

#![allow(unused)]
fn main() {
use aura_testkit::verification::lean_oracle::LeanOracle;

let oracle = LeanOracle::new()?;
let result = oracle.verify_journal_merge(&state1, &state2)?;
}

The oracle invokes compiled Lean code to verify properties. This enables differential testing against proven implementations.

Capability Soundness

The capability_soundness module provides formal verification for capability system properties.

#![allow(unused)]
fn main() {
use aura_testkit::verification::capability_soundness::{
    CapabilitySoundnessVerifier, SoundnessProperty, CapabilityState
};

let mut verifier = CapabilitySoundnessVerifier::with_defaults();
let result = verifier.verify_property(
    SoundnessProperty::NonInterference,
    initial_state
).await?;
assert!(result.holds);
}

The verifier checks five soundness properties: NonInterference, Monotonicity, TemporalConsistency, ContextIsolation, and AuthorizationSoundness. Use verify_all_properties to check all properties at once.

Consensus Testing

The consensus module provides infrastructure for consensus protocol testing.

ITF Loader

The itf_loader module loads ITF traces for replay testing.

#![allow(unused)]
fn main() {
use aura_testkit::consensus::itf_loader::ITFLoader;

let trace = ITFLoader::load("artifacts/traces/consensus_happy_path.itf.json")?;
for state in trace.states {
    // Verify state against implementation
}
}

ITF traces come from Quint model checking. The loader parses them into Rust types for conformance testing.

Reference Implementation

The reference module provides a minimal consensus implementation for differential testing.

#![allow(unused)]
fn main() {
use aura_testkit::consensus::reference::ReferenceConsensus;

let reference = ReferenceConsensus::new(config);
let expected = reference.process_vote(vote)?;
let actual = production_consensus.process_vote(vote)?;
assert_eq!(expected.outcome, actual.outcome);
}

The reference implementation prioritizes clarity over performance. It serves as a specification against which production code is tested.

Mock Runtime Bridge

The MockRuntimeBridge simulates the runtime environment for TUI testing.

#![allow(unused)]
fn main() {
use aura_testkit::mock_runtime_bridge::MockRuntimeBridge;

let bridge = MockRuntimeBridge::new();
bridge.inject_chat_update(chat_state);
bridge.inject_contact_update(contacts);
}

This bridge injects signals that would normally come from the reactive pipeline. It enables testing TUI state machines without a full runtime.

Conformance Framework

The conformance module provides artifact validation for native/WASM parity testing.

Artifact Format

The AuraConformanceArtifactV1 captures execution state for comparison. The type is defined in aura_core::conformance:

#![allow(unused)]
fn main() {
use aura_core::{AuraConformanceArtifactV1, AuraConformanceRunMetadataV1, ConformanceSurfaceName};

let mut artifact = AuraConformanceArtifactV1::new(AuraConformanceRunMetadataV1 {
    target: "native".to_string(),
    profile: "native_coop".to_string(),
    scenario: "test_scenario".to_string(),
    seed: Some(42),
    commit: None,
    async_host_transcript_entries: None,
    async_host_transcript_digest_hex: None,
});

// Insert required surfaces
artifact.insert_surface(ConformanceSurfaceName::Observable, observable_surface);
artifact.insert_surface(ConformanceSurfaceName::SchedulerStep, scheduler_surface);
artifact.insert_surface(ConformanceSurfaceName::Effect, effect_surface);

artifact.validate_required_surfaces()?;
}

Every conformance artifact must capture three surfaces:

Surface        Purpose                   Content
──────────────────────────────────────────────────────────────
Observable     Protocol-visible outputs  Normalized message contents
SchedulerStep  Logical progression       Step index, session state, role progression
Effect         Effect envelope trace     Sequence of effect calls with arguments

Missing surfaces cause validation failure.

Metadata aids debugging but does not affect comparison:

#![allow(unused)]
fn main() {
pub struct AuraConformanceRunMetadataV1 {
    pub target: String,
    pub profile: String,
    pub scenario: String,
    pub seed: Option<u64>,
    pub commit: Option<String>,
    pub async_host_transcript_entries: Option<usize>,
    pub async_host_transcript_digest_hex: Option<String>,
}
}

Effect Envelope Classification

Each effect kind has a comparison class that determines how differences are evaluated:

Effect Kind      Class        Comparison Rule
───────────────────────────────────────────────────────
send_decision    commutative  Order-insensitive under normalization
invoke_step      commutative  Scheduler interleavings normalized
handle_recv      strict       Byte-exact match required
handle_choose    strict       Branch choice must match
handle_acquire   strict       Guard semantics must match
handle_release   strict       Guard semantics must match
topology_event   algebraic    Reduced via topology-normal form

The strict class requires exact matches. The commutative class normalizes order before comparison. The algebraic class applies domain-specific reduction before comparison.
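The strict and commutative rules can be sketched as follows. The real comparison operates on effect envelopes; plain strings stand in here:

```rust
/// Sketch of strict vs. commutative conformance comparison.
fn strict_eq(a: &[String], b: &[String]) -> bool {
    a == b // byte-exact, order-sensitive
}

fn commutative_eq(a: &[String], b: &[String]) -> bool {
    // Normalize order before comparing, so scheduler interleavings
    // that reorder commutative effects still conform.
    let mut a = a.to_vec();
    let mut b = b.to_vec();
    a.sort();
    b.sort();
    a == b
}

fn main() {
    let native = vec!["send:alice".to_string(), "send:carol".to_string()];
    let wasm = vec!["send:carol".to_string(), "send:alice".to_string()];
    // Reordered commutative effects pass the commutative rule only.
    assert!(!strict_eq(&native, &wasm));
    assert!(commutative_eq(&native, &wasm));
}
```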

New effect kinds must be classified before use:

#![allow(unused)]
fn main() {
use aura_core::conformance::{ComparisonClass, AURA_EFFECT_ENVELOPE_CLASSIFICATIONS};

AURA_EFFECT_ENVELOPE_CLASSIFICATIONS.insert(
    "new_effect_kind",
    ComparisonClass::Strict,
);

aura_core::assert_effect_kinds_classified(&effect_trace)?;
}

Unclassified effect kinds cause conformance checks to fail.

Module Structure

aura-testkit/
├── src/
│   ├── builders/           # Test data builders
│   ├── configuration/      # Test configuration
│   ├── consensus/          # Consensus testing utilities
│   ├── conformance.rs      # Conformance artifact support
│   ├── differential.rs     # Differential testing
│   ├── fixtures/           # Reusable test fixtures
│   ├── foundation.rs       # Core test utilities
│   ├── handlers/           # Mock effect handlers
│   ├── infrastructure/     # Test harness infrastructure
│   ├── mock_effects.rs     # Simple mock implementations
│   ├── stateful_effects/   # Stateful effect handlers
│   └── verification/       # Property testing utilities
├── tests/                  # Integration tests
└── benches/                # Performance benchmarks

See Testing Guide for how to write tests using this infrastructure. See Effect System for effect trait definitions.

Simulation Infrastructure Reference

This document describes the architecture of aura-simulator, the simulation crate that provides deterministic protocol testing through effect handler composition, fault injection, and scenario execution.

Overview

The aura-simulator crate occupies Layer 6 in the Aura architecture. It enables testing distributed protocols under controlled conditions. The simulator uses a handler-based architecture rather than a monolithic simulation engine.

The crate provides four capabilities. It offers specialized effect handlers for simulation control. It includes a middleware system for fault injection. It supports TOML-based scenario definitions. It integrates with Quint for model-based testing.

Simulation Modes

The simulator provides two execution modes. TOML scenarios are human-written declarative test scripts that specify operations, assertions, and fault injection. Quint actions are model-generated traces from formal specifications that exercise state-space coverage.

Simulation is an alternate runtime substrate for testing and verification. It does not replace the harness, which executes real frontends. Shared semantic contracts live in aura-app and are consumed by both simulation and harness execution.

Handler-Based Architecture

The simulator composes effect handlers rather than wrapping them in a central engine. Each simulated participant uses its own handler instances. This approach aligns with Aura's stateless effect architecture.

graph TD
    A[Protocol Code] --> B[Effect Traits]
    B --> C[SimulationTimeHandler]
    B --> D[SimulationFaultHandler]
    B --> E[Other Handlers]
    C --> F[Simulated State]
    D --> F
    E --> F

Handlers implement effect traits from aura-core. Protocol code calls effect methods without knowing whether handlers are production or simulation instances.

Simulation Handlers

SimulationTimeHandler

This handler provides deterministic time control.

#![allow(unused)]
fn main() {
use aura_simulator::handlers::SimulationTimeHandler;
use aura_core::effects::PhysicalTimeEffects;
use std::time::Duration;

let mut time = SimulationTimeHandler::new();
time.jump_to_time(Duration::from_secs(10));
let now = time.physical_time().await?;
}

Time starts at zero and advances only through explicit jump_to_time calls or sleep_ms invocations. The sleep_ms method returns immediately after advancing simulated time by the scaled duration. This enables testing timeout behavior without wall-clock delays. Use set_acceleration to adjust time scaling.
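The time model can be pictured as a virtual clock (a minimal sketch under the semantics described above, not the SimulationTimeHandler implementation; `SimClock` and its fields are hypothetical):

```rust
use std::time::Duration;

/// Minimal sketch of a deterministic simulated clock.
struct SimClock {
    now: Duration,     // virtual time; starts at zero
    acceleration: u32, // scaling factor applied to sleeps
}

impl SimClock {
    fn new() -> Self {
        Self { now: Duration::ZERO, acceleration: 1 }
    }

    /// Jump directly to an absolute virtual time.
    fn jump_to_time(&mut self, t: Duration) {
        self.now = t;
    }

    /// "Sleep" by advancing virtual time by the scaled duration;
    /// returns immediately, so timeout paths run without wall-clock delays.
    fn sleep_ms(&mut self, ms: u64) {
        self.now += Duration::from_millis(ms) * self.acceleration;
    }

    fn physical_time(&self) -> Duration {
        self.now
    }
}
```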

SimulationFaultHandler

This handler injects faults into protocol execution.

#![allow(unused)]
fn main() {
use aura_simulator::handlers::SimulationFaultHandler;
use aura_core::{AuraFault, AuraFaultKind, FaultEdge};
use std::time::Duration;

let faults = SimulationFaultHandler::new(42); // seed for determinism

// Inject a message delay fault
let delay_fault = AuraFault {
    fault: AuraFaultKind::MessageDelay { delay_ms: 200 },
    edge: FaultEdge::new("*", "*"),
};
faults.inject_fault(delay_fault, Some(Duration::from_secs(60)))?;

// Inject a message drop fault
let drop_fault = AuraFault {
    fault: AuraFaultKind::MessageDrop { probability: 0.1 },
    edge: FaultEdge::new("*", "*"),
};
faults.inject_fault(drop_fault, None)?; // permanent
}

Fault types include MessageDelay, MessageDrop, MessageCorruption, NodeCrash, NetworkPartition, FlowBudgetExhaustion, and JournalCorruption. Faults can be temporary (with duration) or permanent. The handler implements ChaosEffects for async fault injection.
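Determinism comes from the seed: two handlers constructed with the same seed make identical fault decisions, which is what makes failing runs replayable. A sketch of a seeded drop decision (`SeededRng` and `should_drop` are hypothetical, not the handler internals):

```rust
/// Tiny deterministic PRNG (xorshift-style), seeded the way the fault handler is.
struct SeededRng(u64);

impl SeededRng {
    fn new(seed: u64) -> Self {
        Self(seed.max(1)) // avoid the degenerate all-zero state
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Uniform value in [0, 1).
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Decide whether a message is dropped under a MessageDrop-style fault.
fn should_drop(rng: &mut SeededRng, probability: f64) -> bool {
    rng.next_f64() < probability
}
```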

SimulationScenarioHandler

This handler manages scenario-driven testing.

#![allow(unused)]
fn main() {
use aura_simulator::handlers::{
    SimulationScenarioHandler,
    ScenarioDefinition,
    TriggerCondition,
    InjectionAction,
};
use std::time::Duration;

let mut scenarios = SimulationScenarioHandler::new();
scenarios.add_scenario(ScenarioDefinition {
    name: "partition".to_string(),
    trigger: TriggerCondition::AfterTime(Duration::from_secs(5)),
    action: InjectionAction::PartitionNetwork {
        group_a: vec![device1, device2],
        group_b: vec![device3],
    },
});
}

Scenarios define triggered actions based on time or protocol state. They enable testing recovery from transient failures.

SimulationEffectComposer

This type composes handlers into complete simulation environments.

#![allow(unused)]
fn main() {
use aura_simulator::handlers::SimulationEffectComposer;
use aura_core::DeviceId;

let device_id = DeviceId::new_from_entropy([1u8; 32]);
let composer = SimulationEffectComposer::for_testing(device_id).await?;
let env = composer
    .with_time_control()
    .with_fault_injection()
    .build()?;
}

The composer provides a builder pattern for handler configuration. It produces an effect system instance suitable for simulation.

TOML Scenario System

TOML scenarios provide human-readable integration tests with fault injection.

File Format

Scenario files live in the scenarios/ directory.

[metadata]
name = "dkd_basic_derivation"
description = "Basic P2P deterministic key derivation"
version = "1.0"

[[phases]]
name = "setup"
actions = [
    { type = "create_participant", id = "alice" },
    { type = "create_participant", id = "bob" },
]

[[phases]]
name = "derivation"
actions = [
    { type = "run_choreography", choreography = "p2p_dkd", participants = ["alice", "bob"] },
]

[[phases]]
name = "verification"
actions = [
    { type = "verify_property", property = "derived_keys_match" },
]

[[properties]]
name = "derived_keys_match"
property_type = "safety"
expression = "alice.derived_key == bob.derived_key"

Each scenario has metadata, ordered phases, and property definitions. Phases contain action sequences that execute in order.

Action Types

| Action | Parameters | Description |
|---|---|---|
| create_participant | id | Create a simulated participant |
| run_choreography | choreography, participants | Execute a choreographic protocol |
| verify_property | property | Check a named property |
| simulate_data_loss | participant, percentage | Delete random stored data |
| apply_network_condition | condition, duration | Apply network fault |
| advance_time | duration | Advance simulated time |

Execution

The SimulationScenarioHandler executes TOML scenarios.

#![allow(unused)]
fn main() {
use aura_simulator::handlers::SimulationScenarioHandler;

let handler = SimulationScenarioHandler::new();
let result = handler.execute_file("scenarios/core_protocols/dkd_basic.toml").await?;
assert!(result.all_properties_passed());
}

Execution proceeds phase by phase. Failures stop execution and report the failing action.
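The fail-fast semantics can be sketched as follows (`Phase`, `execute`, and `run_action` are hypothetical stand-ins for the real TOML action interpreter):

```rust
/// Hypothetical sketch of fail-fast, phase-ordered scenario execution.
struct Phase {
    name: &'static str,
    actions: Vec<&'static str>,
}

/// Run phases in order; stop at the first failing action and report it.
fn execute(
    phases: &[Phase],
    run_action: impl Fn(&str) -> bool,
) -> Result<(), String> {
    for phase in phases {
        for action in phase.actions.iter().copied() {
            if !run_action(action) {
                return Err(format!(
                    "phase '{}' failed at action '{}'",
                    phase.name, action
                ));
            }
        }
    }
    Ok(())
}
```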

Configuration System

The simulator uses configuration types from aura_simulator::types.

SimulatorConfig

#![allow(unused)]
fn main() {
use aura_simulator::types::{SimulatorConfig, NetworkConfig};
use aura_core::DeviceId;

let config = SimulatorConfig {
    seed: 42,
    deterministic: true,
    time_scale: 1.0,
    network: Some(NetworkConfig {
        latency: std::time::Duration::from_millis(50),
        packet_loss_rate: 0.02,
        bandwidth_bps: Some(1_000_000),
    }),
    ..Default::default()
};
}

NetworkConfig

Network configuration controls simulated network conditions.

| Field | Type | Description |
|---|---|---|
| latency | Duration | Base network latency |
| packet_loss_rate | f64 | Probability of dropping messages |
| bandwidth_bps | Option<u64> | Bytes per second limit |

SimulatorContext

The SimulatorContext tracks execution state during simulation runs.

#![allow(unused)]
fn main() {
use aura_simulator::types::SimulatorContext;

let context = SimulatorContext::new("scenario_id".into(), "run_123".into())
    .with_seed(42)
    .with_participants(3, 2)
    .with_debug(true);

println!("Tick: {}", context.tick);
println!("Timestamp: {:?}", context.timestamp);
}

The context advances through advance_tick and advance_time methods.

Async Host Boundary

The AsyncSimulatorHostBridge provides an async request/resume interface for Telltale integration.

Design

#![allow(unused)]
fn main() {
use aura_simulator::{AsyncHostRequest, AsyncSimulatorHostBridge};

let mut host = AsyncSimulatorHostBridge::new(42);
host.submit(AsyncHostRequest::VerifyAllProperties);
let entry = host.resume_next().await?;
}

The bridge maintains deterministic ordering through FIFO processing and monotone sequence IDs.

Determinism Constraints

The async host boundary enforces several constraints. Requests process in submission order. Each request receives a unique monotone sequence ID. No wall-clock time affects host decisions. Transcript entries enable replay comparison.
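Together these constraints amount to a FIFO queue with monotone sequence numbering and a transcript. A minimal sketch (`HostBridge` here is hypothetical, not the AsyncSimulatorHostBridge API):

```rust
use std::collections::VecDeque;

/// Sketch of a deterministic request/resume boundary.
struct HostBridge<Req> {
    queue: VecDeque<Req>,
    next_sequence: u64,
    transcript: Vec<(u64, String)>,
}

impl<Req: std::fmt::Debug> HostBridge<Req> {
    fn new() -> Self {
        Self { queue: VecDeque::new(), next_sequence: 0, transcript: Vec::new() }
    }

    /// Requests are queued FIFO; submission order is processing order.
    fn submit(&mut self, req: Req) {
        self.queue.push_back(req);
    }

    /// Process the oldest request, assign it a monotone sequence ID,
    /// and record a transcript entry for replay comparison.
    fn resume_next(&mut self) -> Option<(u64, Req)> {
        let req = self.queue.pop_front()?;
        let seq = self.next_sequence;
        self.next_sequence += 1;
        self.transcript.push((seq, format!("{req:?}")));
        Some((seq, req))
    }
}
```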

Transcript Artifacts

Adaptive Privacy Phase 6 Evidence

Phase 6 adaptive-privacy validation uses one deterministic artifact lane instead of ad hoc local sweeps. Run:

just ci-adaptive-privacy-tuning

That lane writes the canonical archive under artifacts/adaptive-privacy/phase6/ with:

  • tuning_report.json for the provisional-vs-fixed policy comparison
  • matrix_results.json for the canonical Phase 6 validation matrix
  • control-plane/ telltale-backed parity reports for anonymous path establishment and reply-block accountability
  • parity/report.json for the generic telltale parity lane used by the same archive contract

These artifacts are the evidence source for tuned adaptive-privacy constants. They are observational outputs, not new semantic truth.

#![allow(unused)]
fn main() {
use aura_simulator::AsyncHostTranscriptEntry;

let entry = host.resume_next().await?;
assert_eq!(entry.sequence, 0);
assert!(entry.request.is_verify_properties());
}

Transcript entries record request/response pairs. They enable sync-versus-async host parity testing.

Factory Abstraction

The SimulationEnvironmentFactory trait decouples simulation from effect system internals.

#![allow(unused)]
fn main() {
use aura_core::effects::{SimulationEnvironmentFactory, SimulationEnvironmentConfig};

let config = SimulationEnvironmentConfig {
    seed: 42,
    authority_id,
    device_id: Some(device_id),
    test_mode: true,
};
let effects = factory.create_simulation_environment(config).await?;
}

This abstraction enables stable simulation code across effect system changes. Only the factory implementation requires updates when internals change.

Quint Integration

The quint module provides integration with Quint formal specifications. See Formal Verification Reference for complete details.

Telltale Parity Integration

The simulator exposes telltale parity as an artifact-level boundary. The boundary lives in aura_simulator::telltale_parity. Default simulator execution does not run the protocol machine directly.

Entry Points

Use TelltaleParityInput and TelltaleParityRunner when both baseline and candidate artifacts are already in memory. Use run_telltale_parity_file_lane with TelltaleParityFileRun for file-driven CI workflows.

The supported file lanes accept baseline and candidate artifact paths, a comparison profile, an output report path, and upstream Telltale run sidecars. They emit one stable JSON report artifact whose semantic surface is rooted in the upstream run context rather than Aura-local fallback summaries.

Protocol-critical control-plane lifecycles should use the dedicated telltale control-plane lanes instead of reimplementing the same ownership/timeout/replay lifecycle in Aura-local simulator scenario state. Use run_telltale_control_plane_file_lane with TelltaleControlPlaneFileRun for:

  • anonymous_path_establish
  • reply_block_accountability

Those simulator lanes correspond to the current adaptive-privacy control-plane protocol inventory in aura-agent:

  • "AnonymousPathEstablishProtocol"
  • "MoveReceiptReplyBlockProtocol"
  • "HoldDepositReplyBlockProtocol"
  • "HoldRetrievalReplyBlockProtocol"
  • "HoldAuditReplyBlockProtocol"

Bootstrap and stale-node re-entry are not part of these lanes because they remain runtime-local bootstrap/hint logic rather than canonical multi-party control protocols.

Canonical Surface Mapping

Telltale parity lanes use one canonical mapping:

| Telltale Event Family | Aura Surface | Normalization |
|---|---|---|
| observable | observable | identity |
| scheduler_step | scheduler_step | tick normalization |
| effect | effect | envelope classification |

Reports use schema aura.telltale-parity.report.v1.

For Telltale 11-backed lanes, Aura can also invoke an upstream simulator runner first and attach the candidate run sidecar automatically. Use run_telltale_parity_with_runner(...) or run_telltale_control_plane_with_runner(...) and configure the runner command with AURA_TELLTALE_SIMULATOR_RUNNER when the executable is not on the shell search path. The supported report lanes still expect a baseline upstream run sidecar so the report can compare theorem, scheduler, environment, and reconfiguration context across both sides.

Expected Outputs

The simulator telltale parity lane writes:

artifacts/telltale-parity/report.json

The report includes:

  • comparison classification (strict or envelope_bounded)
  • first mismatch surface
  • first mismatch step index
  • full differential comparison payload
  • semantic summary derived from authoritative upstream Telltale 11 run context
  • upstream Telltale 11 run context for both sides, with optional decision and sweep attachments when those sidecars are provided

Environment Bridge

Aura now exposes a small environment bridge for simulator-local state that is being migrated toward Telltale 11 terminology. The current migrated slice is:

  • adaptive-privacy movement profiles -> mobility profiles
  • sync opportunities -> link-admission observations
  • provider saturation -> node-capability observations

Scenario runs now persist that bridge surface as first-class artifacts under the simulator artifact root:

<artifacts_dir>/scenario-runs/<scenario_slug>-seed-<seed>/environment_snapshot.json
<artifacts_dir>/scenario-runs/<scenario_slug>-seed-<seed>/environment_trace.json
<artifacts_dir>/scenario-runs/<scenario_slug>-seed-<seed>/environment_overlay.json

Read the written paths from SimulationResults::environment_artifacts. The bridge is the single Aura-owned place where migrated mobility, link-admission, and node-capability decisions are materialized; handler state retains its richer Aura-local bookkeeping on top of it. The optional environment_overlay.json supplement carries Aura-only dimensions such as provider heterogeneity, admission pressure, topology churn, and interference patterns without expanding the core bridge vocabulary.

Experiment Surfaces

Aura now layers its comparative experiment lanes on top of the published Telltale simulator harness and sweep machinery instead of adding a second local manifest format.

Use aura_simulator::run_adaptive_privacy_policy_sweep(...) together with aura_simulator::archive_from_sweep(...) to produce AuraSweepArchiveV1 artifacts for adaptive-privacy comparisons while preserving the shared Telltale sweep manifest shape internally. Use aura_simulator::compare_policy_sweeps(...) to emit AuraPolicyDiffReportV1, which compares two sweep runs with the shared Telltale diff logic.

For theorem-aware failure evidence, use aura_simulator::counterexample_from_parity_report(...) or aura_simulator::counterexample_from_control_plane_report(...). These wrappers emit AuraCounterexampleReportV1, preserving the shared Telltale counterexample witness internally and annotating whether the mismatch is schedule noise or safety-visible divergence.

For reusable study bundles, use aura_simulator::run_suite_catalog(...) and aura_simulator::compare_suite_catalogs(...). Aura owns the suite catalog choices, while the shared Telltale harness remains the execution engine and AuraSuiteTournamentReportV1 preserves the shared sweep-derived comparison shape internally.

ITF Trace Format

ITF (Informal Trace Format) traces come from Quint model checking. Each trace captures a sequence of states and transitions.

{
  "#meta": {
    "format": "ITF",
    "source": "quint",
    "version": "1.0"
  },
  "vars": ["phase", "participants", "messages"],
  "states": [
    {
      "#meta": { "index": 0 },
      "phase": "Setup",
      "participants": [],
      "messages": []
    },
    {
      "#meta": { "index": 1, "action": "addParticipant" },
      "phase": "Setup",
      "participants": ["alice"],
      "messages": []
    }
  ]
}

Each state represents a model state. Transitions between states correspond to actions.

ITF traces capture non-deterministic choices for replay:

{
  "#meta": {
    "index": 3,
    "action": "selectLeader",
    "nondet_picks": { "leader": "bob" }
  }
}

The nondet_picks field records choices made by Quint. Replay uses these values to seed RandomEffects.
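The replay discipline is: consume a recorded pick when the trace has one, otherwise fall back to the model's default. A sketch of that seeding rule (`choose` and `NondetPicks` are hypothetical names; the real loader feeds RandomEffects):

```rust
use std::collections::HashMap;

/// Recorded non-deterministic choices from one ITF state's #meta block.
type NondetPicks = HashMap<String, String>;

/// Replay a recorded pick when present; only fall back to a default
/// when the trace recorded nothing for this choice point.
fn choose(picks: &NondetPicks, name: &str, fallback: &str) -> String {
    picks
        .get(name)
        .cloned()
        .unwrap_or_else(|| fallback.to_string())
}
```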

ITFLoader

#![allow(unused)]
fn main() {
use aura_simulator::quint::itf_loader::ITFLoader;

let trace = ITFLoader::load("trace.itf.json")?;
for (i, state) in trace.states.iter().enumerate() {
    let action = state.meta.action.as_deref();
    let picks = &state.meta.nondet_picks;
}
}

The loader validates trace format and extracts typed state.

GenerativeSimulator

#![allow(unused)]
fn main() {
use aura_simulator::quint::generative_simulator::GenerativeSimulator;

let simulator = GenerativeSimulator::new(config)?;
let result = simulator.replay_trace(&trace).await?;
}

The generative simulator replays ITF traces through real effect handlers.

Module Structure

aura-simulator/
├── src/
│   ├── handlers/           # Simulation effect handlers
│   │   ├── time_control.rs
│   │   ├── fault_simulation.rs
│   │   ├── scenario.rs
│   │   └── effect_composer.rs
│   ├── middleware/         # Effect interception
│   ├── quint/              # Quint integration
│   │   ├── itf_loader.rs
│   │   ├── action_registry.rs
│   │   ├── state_mapper.rs
│   │   └── generative_simulator.rs
│   ├── scenarios/          # Scenario execution
│   ├── async_host.rs       # Async host boundary
│   └── testkit_bridge.rs   # Testkit integration
├── tests/                  # Integration tests
└── examples/               # Usage examples

See Simulation Guide for how to write simulations. See Testing Guide for conformance testing. See Formal Verification Reference for Quint integration details.

Formal Verification Reference

This document describes the formal verification infrastructure that provides mathematical guarantees for Aura protocols through Quint model checking, Lean theorem proving, and Telltale session type verification.

Overview

Aura uses three complementary verification systems. Quint provides executable state machine specifications with model checking. Lean provides mathematical theorem proofs. Telltale provides session type guarantees for choreographic protocols.

The systems form a trust chain. Quint specifications define correct behavior. Lean proofs verify mathematical properties. Telltale ensures protocol implementations match session type specifications.

Verification Boundary

Aura separates domain proof ownership from runtime parity checks.

| Verification Surface | Primary Tools | Guarantee Class | Ownership |
|---|---|---|---|
| Consensus and CRDT domain properties | Quint + Lean | model and theorem correctness | verification/quint/ and verification/lean/ |
| Runtime execution conformance | Telltale parity + conformance artifacts | implementation parity under declared envelopes | aura-agent, aura-simulator, aura-testkit |
| Bridge consistency | aura-quint bridge pipeline | cross-validation between model checks and certificates | aura-quint |

Telltale runtime parity does not replace domain theorem work. It validates runtime behavior against admitted profiles and artifact envelopes.

Assurance Summary

This architecture provides five assurance classes.

  1. Boundary assurance. Domain theorem claims and runtime parity claims are separated. This reduces proof-surface ambiguity.

  2. Runtime parity assurance. Telltale parity lanes compare runtime artifacts with deterministic profiles. This provides replayable evidence for conformance under declared envelopes.

  3. Bridge consistency assurance. Bridge pipelines check model-check outcomes against certificate outcomes. This detects drift between proof artifacts and executable checks.

  4. CI gate assurance. Parity and bridge lanes run as CI gates. This prevents silent regression of conformance checks.

  5. Coverage drift assurance. Coverage documentation is validated against repository state by script checks. This prevents long-term drift between claims and implementation.

Limits remain explicit. Parity success is not a replacement for new Quint or Lean domain proofs. Parity checks are coverage-bounded by scenarios, seeds, and artifact surfaces.

Quint Architecture

Quint specifications live in verification/quint/. They define protocol state machines and verify properties through model checking with Apalache.

Directory Structure

verification/quint/
├── core.qnt               # Shared runtime utilities
├── authorization.qnt      # Guard chain security
├── recovery.qnt           # Guardian recovery
├── consensus/             # Consensus protocol specs
│   ├── core.qnt
│   ├── liveness.qnt
│   └── adversary.qnt
├── journal/               # Journal CRDT specs
│   ├── core.qnt
│   ├── counter.qnt
│   └── anti_entropy.qnt
├── keys/                  # Key management specs
│   └── dkg.qnt
├── sessions/              # Session management specs
│   ├── core.qnt
│   └── groups.qnt
├── harness/               # Simulator harnesses
├── tui/                   # TUI state machine
└── traces/                # Generated ITF traces

Each specification focuses on a single protocol or subsystem.

Specification Pattern

Specifications follow a consistent structure.

module protocol_example {
    // Type definitions
    type Phase = Setup | Active | Completed | Failed
    type State = { phase: Phase, data: Data }

    // State variables
    var state: State

    // Initial state
    action init = {
        state' = { phase: Setup, data: emptyData }
    }

    // State transitions
    action transition(input: Input): bool = all {
        state.phase != Completed,
        state.phase != Failed,
        state' = computeNextState(state, input)
    }

    // Invariants
    val safetyInvariant = state.phase != Failed or hasRecoveryPath(state)
}

Actions define state transitions. Invariants define properties that must hold in all reachable states.
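For intuition, the same pattern can be mirrored in Rust: guarded transitions plus an invariant checked in every reachable state. This is a hand-written sketch of what the model checker does exhaustively, not code generated from the spec:

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum Phase {
    Setup,
    Active,
    Completed,
    Failed,
}

/// Guarded transition: terminal phases admit no further steps.
fn transition(phase: Phase, ok: bool) -> Option<Phase> {
    match phase {
        Phase::Completed | Phase::Failed => None, // precondition fails
        Phase::Setup => Some(Phase::Active),
        Phase::Active => Some(if ok { Phase::Completed } else { Phase::Failed }),
    }
}

/// Enumerate every state reachable from `init` and check the invariant
/// in each, mimicking exhaustive model checking on this tiny state space.
fn invariant_holds(init: Phase, inv: impl Fn(Phase) -> bool) -> bool {
    let mut frontier = vec![init];
    let mut seen = vec![init];
    while let Some(s) = frontier.pop() {
        if !inv(s) {
            return false;
        }
        for input in [true, false] {
            if let Some(next) = transition(s, input) {
                if !seen.contains(&next) {
                    seen.push(next);
                    frontier.push(next);
                }
            }
        }
    }
    true
}
```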

Harness Modules

Harness modules provide standardized entry points for simulation.

module harness_example {
    import protocol_example.*

    action register(id: Id): bool = init
    action step(input: Input): bool = transition(input)
    action complete(): bool = state.phase == Completed
}

Harnesses enable Quint simulation and ITF trace generation.

Available Specifications

| Specification | Purpose | Key Invariants |
|---|---|---|
| consensus/core.qnt | Fast-path consensus | UniqueCommitPerInstance, CommitRequiresThreshold |
| consensus/liveness.qnt | Liveness properties | ProgressUnderSynchrony, RetryBound |
| consensus/adversary.qnt | Byzantine tolerance | ByzantineThreshold, EquivocationDetected |
| journal/core.qnt | Journal CRDT | NonceUnique, FactsOrdered |
| journal/anti_entropy.qnt | Sync protocol | FactsMonotonic, EventualConvergence |
| authorization.qnt | Guard chain | NoCapabilityWidening, ChargeBeforeSend |

Lean Architecture

Lean proofs live in verification/lean/. They provide mathematical verification of safety properties.

Directory Structure

verification/lean/
├── lakefile.lean          # Build configuration
├── Aura/
│   ├── Assumptions.lean   # Cryptographic axioms
│   ├── Types.lean         # Core type definitions
│   ├── Types/
│   │   ├── ByteArray32.lean
│   │   └── OrderTime.lean
│   ├── Proofs/
│   │   ├── Consensus/
│   │   │   ├── Agreement.lean
│   │   │   ├── Validity.lean
│   │   │   ├── Equivocation.lean
│   │   │   ├── Liveness.lean
│   │   │   ├── Evidence.lean
│   │   │   ├── Adversary.lean
│   │   │   └── Frost.lean
│   │   ├── Journal.lean
│   │   ├── FlowBudget.lean
│   │   ├── GuardChain.lean
│   │   ├── KeyDerivation.lean
│   │   ├── TimeSystem.lean
│   │   └── ContextIsolation.lean
│   └── Runner.lean        # CLI for differential testing

Axioms

Cryptographic assumptions appear as axioms in Assumptions.lean.

axiom frost_threshold_unforgeability :
  ∀ (k n : Nat) (shares : List Share),
    k ≤ shares.length →
    shares.length ≤ n →
    validShares shares →
    unforgeable (aggregate shares)

axiom hash_collision_resistance :
  ∀ (a b : ByteArray), hash a = hash b → a = b

Proofs that depend on these assumptions are sound under standard cryptographic hardness assumptions.

The consensus proofs also depend on domain-level axioms for signature binding. These axioms establish that valid signatures bind to unique results. See verification/lean/Aura/Assumptions.lean for the full axiom reduction analysis.

Claims Bundles

Related theorems group into claims bundles.

structure ValidityClaims where
  commit_has_threshold : ∀ c, isCommit c → hasThreshold c
  validity : ∀ c, isCommit c → validPrestate c
  distinct_signers : ∀ c, isCommit c → distinctSigners c.shares

def validityClaims : ValidityClaims := {
  commit_has_threshold := Validity.commit_has_threshold
  validity := Validity.validity
  distinct_signers := Validity.distinct_signers
}

Bundles provide easy access to related proofs.
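Accessing a single theorem through a bundle then looks like this (illustrative Lean, assuming the definitions above):

```lean
-- Project one proof out of the bundle and reuse it directly.
example : ∀ c, isCommit c → hasThreshold c :=
  validityClaims.commit_has_threshold
```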

Proof Status

| Module | Status | Notes |
|---|---|---|
| Validity | Complete | All theorems proven |
| Equivocation | Complete | Detection soundness/completeness |
| Evidence | Complete | CRDT properties |
| Frost | Complete | Aggregation properties |
| Agreement | Uses axiom | Depends on FROST uniqueness |
| Liveness | Axioms | Timing assumptions |
| Journal | Complete | CRDT semilattice properties |

Canonical Lean API Types

Legacy Lean verification compatibility types have been removed from aura_testkit::verification and aura_testkit::verification::lean_oracle. Use the canonical full-fidelity types and methods.

Type Mapping

| Legacy Type | Canonical Type |
|---|---|
| Fact | LeanFact |
| ComparePolicy | LeanComparePolicy |
| TimeStamp | LeanCompareTimeStamp (compare payloads) or LeanTimeStamp (journal facts) |
| Ordering | LeanTimestampOrdering |
| FlowChargeInput/FlowChargeResult | LeanFlowChargeInput/LeanFlowChargeResult |
| TimestampCompareInput/TimestampCompareResult | LeanTimestampCompareInput/LeanTimestampCompareResult |

Method Mapping

| Legacy Method | Canonical Method |
|---|---|
| verify_merge | verify_journal_merge |
| verify_reduce | verify_journal_reduce |
| verify_charge | verify_flow_charge |
| verify_compare | verify_timestamp_compare |

aura-quint Crate

The aura-quint crate provides Rust integration with Quint specifications.

QuintRunner

The runner executes Quint verification and parses results.

#![allow(unused)]
fn main() {
use aura_quint::runner::{QuintRunner, RunnerConfig};
use aura_quint::PropertySpec;
use std::time::Duration;

let config = RunnerConfig {
    default_timeout: Duration::from_secs(60),
    max_steps: 1000,
    generate_counterexamples: true,
    ..Default::default()
};
let mut runner = QuintRunner::with_config(config)?;
let spec = PropertySpec::invariant("UniqueCommitPerInstance");
let result = runner.verify_property(&spec).await?;
}

The runner provides verify_property for invariant checking and simulate for trace-based testing. It caches results and can generate counterexamples.

Property Evaluator

The evaluator checks properties against Rust state.

#![allow(unused)]
fn main() {
use aura_quint::evaluator::PropertyEvaluator;

let evaluator = PropertyEvaluator::new();
let result = evaluator.evaluate("chargeBeforeSend", &state)?;
}

Properties translate from Quint syntax to Rust predicates.

Property Categories

The evaluator classifies properties by keyword patterns.

| Category | Keywords | Examples |
|---|---|---|
| Authorization | grant, permit, guard | guardChainOrder |
| Budget | budget, charge, spent | chargeBeforeSend |
| Integrity | attenuation, signature | attenuationOnlyNarrows |
| Liveness | eventually, progress | eventualConvergence |
| Safety | never, always, invariant | uniqueCommit |
Categories help organize verification coverage reports.
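The first-match keyword scan behind this classification can be sketched as follows (`categorize` is a hypothetical stand-in for the evaluator's actual rules in aura-quint):

```rust
/// Sketch of keyword-based property categorization: the first category
/// whose keyword appears in the property name wins.
fn categorize(property: &str) -> &'static str {
    let lower = property.to_lowercase();
    let rules: [(&str, &[&str]); 5] = [
        ("Authorization", &["grant", "permit", "guard"]),
        ("Budget", &["budget", "charge", "spent"]),
        ("Integrity", &["attenuation", "signature"]),
        ("Liveness", &["eventually", "progress"]),
        ("Safety", &["never", "always", "invariant"]),
    ];
    for (category, keywords) in rules {
        if keywords.iter().any(|k| lower.contains(k)) {
            return category;
        }
    }
    "Uncategorized"
}
```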

Telltale Bridge Data Contract

aura-quint defines a versioned interchange schema for bridge workflows.

| Type | Purpose |
|---|---|
| BridgeBundleV1 | Top-level bundle with schema_version = "aura.telltale-bridge.v1" |
| SessionTypeInterchangeV1 | Session graph exchange |
| PropertyInterchangeV1 | Quint, Telltale, and Lean property exchange |
| ProofCertificateV1 | Proof or model-check evidence |
Use this schema as the canonical data contract when exporting Quint sessions to Telltale formats or importing Telltale and Lean properties into Quint harnesses.

Quint-Lean Correspondence

This section maps Quint model invariants to Lean theorem proofs, providing traceability between model checking and formal proofs.

Types Correspondence

| Quint Type | Lean Type | Rust Type |
|---|---|---|
| ConsensusId | Aura.Domain.Consensus.Types.ConsensusId | consensus::types::ConsensusId |
| ResultId | Aura.Domain.Consensus.Types.ResultId | consensus::types::ResultId |
| PrestateHash | Aura.Domain.Consensus.Types.PrestateHash | consensus::types::PrestateHash |
| AuthorityId | Aura.Domain.Consensus.Types.AuthorityId | core::AuthorityId |
| ShareData | Aura.Domain.Consensus.Types.ShareData | consensus::types::SignatureShare |
| ThresholdSignature | Aura.Domain.Consensus.Types.ThresholdSignature | consensus::types::ThresholdSignature |
| CommitFact | Aura.Domain.Consensus.Types.CommitFact | consensus::types::CommitFact |
| WitnessVote | Aura.Domain.Consensus.Types.WitnessVote | consensus::types::WitnessVote |
| Evidence | Aura.Domain.Consensus.Types.Evidence | consensus::types::Evidence |

Invariant-Theorem Correspondence

Agreement Properties

| Quint Invariant | Lean Theorem | Status |
|---|---|---|
| InvariantUniqueCommitPerInstance | Aura.Proofs.Consensus.Agreement.agreement | proven |
| InvariantUniqueCommitPerInstance | Aura.Proofs.Consensus.Agreement.unique_commit | proven |
| - | Aura.Proofs.Consensus.Agreement.commit_determinism | proven |

Validity Properties

| Quint Invariant | Lean Theorem | Status |
|---|---|---|
| InvariantCommitRequiresThreshold | Aura.Proofs.Consensus.Validity.commit_has_threshold | proven |
| InvariantSignatureBindsToCommitFact | Aura.Proofs.Consensus.Validity.validity | proven |
| - | Aura.Proofs.Consensus.Validity.distinct_signers | proven |
| - | Aura.Proofs.Consensus.Validity.prestate_binding_unique | proven |
| - | Aura.Proofs.Consensus.Validity.honest_participation | proven |
| - | Aura.Proofs.Consensus.Validity.threshold_unforgeability | axiom |

FROST Integration Properties

| Quint Invariant | Lean Theorem | Status |
|---|---|---|
| InvariantSignatureThreshold | Aura.Proofs.Consensus.Frost.aggregation_threshold | proven |
| - | Aura.Proofs.Consensus.Frost.share_session_consistency | proven |
| - | Aura.Proofs.Consensus.Frost.share_result_consistency | proven |
| - | Aura.Proofs.Consensus.Frost.distinct_signers | proven |
| - | Aura.Proofs.Consensus.Frost.share_binding | proven |

Evidence CRDT Properties

| Quint Invariant | Lean Theorem | Status |
|---|---|---|
| - | Aura.Proofs.Consensus.Evidence.merge_comm_votes | proven |
| - | Aura.Proofs.Consensus.Evidence.merge_assoc_votes | proven |
| - | Aura.Proofs.Consensus.Evidence.merge_idem | proven |
| - | Aura.Proofs.Consensus.Evidence.merge_preserves_commit | proven |
| - | Aura.Proofs.Consensus.Evidence.commit_monotonic | proven |

Equivocation Detection Properties

| Quint Invariant | Lean Theorem | Status |
|---|---|---|
| InvariantEquivocationDetected | Aura.Proofs.Consensus.Equivocation.detection_soundness | proven |
| InvariantEquivocationDetected | Aura.Proofs.Consensus.Equivocation.detection_completeness | proven |
| InvariantEquivocatorsExcluded | Aura.Proofs.Consensus.Equivocation.exclusion_correctness | proven |
| InvariantHonestMajorityCanCommit | Aura.Proofs.Consensus.Equivocation.honest_never_detected | proven |
| - | Aura.Proofs.Consensus.Equivocation.verified_proof_sound | proven |

Byzantine Tolerance (Adversary Module)

| Quint Invariant | Lean Theorem | Status |
|---|---|---|
| InvariantByzantineThreshold | Aura.Proofs.Consensus.Adversary.adversaryClaims.byzantine_cannot_forge | claim |
| InvariantEquivocationDetected | Aura.Proofs.Consensus.Adversary.adversaryClaims.equivocation_detectable | claim |
| InvariantHonestMajorityCanCommit | Aura.Proofs.Consensus.Adversary.adversaryClaims.honest_majority_sufficient | claim |
| InvariantEquivocatorsExcluded | Aura.Proofs.Consensus.Adversary.adversaryClaims.equivocators_excluded | claim |
| InvariantCompromisedNoncesExcluded | - | Quint only |

Liveness Properties

| Quint Property | Lean Support | Notes |
|---|---|---|
| InvariantProgressUnderSynchrony | Aura.Proofs.Consensus.Liveness.livenessClaims.terminationUnderSynchrony | axiom |
| InvariantByzantineTolerance | byzantine_threshold | axiom |
| FastPathProgressCheck | Aura.Proofs.Consensus.Liveness.livenessClaims.fastPathBound | axiom |
| SlowPathProgressCheck | Aura.Proofs.Consensus.Liveness.livenessClaims.fallbackBound | axiom |
| NoDeadlock | Aura.Proofs.Consensus.Liveness.livenessClaims.noDeadlock | axiom |
| InvariantRetryBound | - | Quint model checking only |

Module Correspondence

| Lean Module | Quint File | What It Proves |
|---|---|---|
| Proofs.ContextIsolation | authorization.qnt, leakage.qnt | Context separation and bridge authorization |
| Proofs.Consensus.Agreement | consensus/core.qnt | Agreement safety (unique commits) |
| Proofs.Consensus.Evidence | consensus/core.qnt | CRDT semilattice properties |
| Proofs.Consensus.Frost | consensus/frost.qnt | Threshold signature correctness |
| Proofs.Consensus.Liveness | consensus/liveness.qnt | Synchrony model axioms |
| Proofs.Consensus.Adversary | consensus/adversary.qnt | Byzantine tolerance bounds |
| Proofs.Consensus.Equivocation | consensus/adversary.qnt | Detection soundness/completeness |

Quint Integration in aura-simulator

The simulator provides deeper Quint integration for model-based testing.

ITFLoader

#![allow(unused)]
fn main() {
use aura_simulator::quint::itf_loader::ITFLoader;

let trace = ITFLoader::load("trace.itf.json")?;
}

The loader parses ITF traces into typed Rust structures.

QuintMappable Trait

Types that map between Quint and Rust implement QuintMappable.

#![allow(unused)]
fn main() {
use aura_core::effects::quint::QuintMappable;

impl QuintMappable for ConsensusState {
    fn from_quint(value: &QuintValue) -> Result<Self> {
        // Parse Quint JSON into Rust type
    }

    fn to_quint(&self) -> QuintValue {
        // Convert Rust type to Quint JSON
    }
}
}

This trait enables bidirectional state mapping.

ActionRegistry

The registry maps Quint action names to Rust handlers.

#![allow(unused)]
fn main() {
use aura_simulator::quint::action_registry::{ActionRegistry, ActionHandler};

let mut registry = ActionRegistry::new();
registry.register("initContext", Box::new(InitContextHandler));
registry.register("submitVote", Box::new(SubmitVoteHandler));

let result = registry.execute("initContext", &params, &effects).await?;
}

Handlers implement Quint actions using real effect handlers.

StateMapper

The mapper converts between Aura and Quint state representations.

#![allow(unused)]
fn main() {
use aura_simulator::quint::state_mapper::StateMapper;

let mapper = StateMapper::default();
let quint_state = mapper.aura_to_quint(&aura_state)?;
let updated_aura = mapper.quint_to_aura(&quint_state)?;
}

Bidirectional mapping enables state synchronization during trace replay.

GenerativeSimulator

The simulator replays ITF traces through real effect handlers.

#![allow(unused)]
fn main() {
use aura_simulator::quint::generative_simulator::{
    GenerativeSimulator,
    GenerativeSimConfig,
};

let config = GenerativeSimConfig {
    max_steps: 1000,
    check_invariants_every: 10,
    seed: Some(42),
};
let simulator = GenerativeSimulator::new(config)?;
let result = simulator.replay_trace(&trace).await?;
}

Replay validates that implementations match Quint specifications.

Telltale Formal Guarantees

Telltale provides session type verification for choreographic protocols.

Session Type Projections

Choreographies project to local session types for each participant.

#![allow(unused)]
fn main() {
#[choreography]
async fn two_party_exchange<A, B>(
    #[role] alice: A,
    #[role] bob: B,
) {
    alice.send(bob, message)?;
    let response = bob.recv(alice)?;
}
}

The macro generates session types that ensure protocol compliance.

Leakage Tracking

The LeakageTracker monitors information flow during protocol execution.

#![allow(unused)]
fn main() {
use aura_mpst::LeakageTracker;

let tracker = LeakageTracker::new(budget);
tracker.record_send(recipient, message_size)?;
let remaining = tracker.remaining_budget();
}

Choreography annotations specify leakage costs. The tracker enforces budgets at runtime.

Guard Annotations

Guards integrate with session types through annotations.

#![allow(unused)]
fn main() {
#[guard_capability("send_message")]
#[flow_cost(100)]
#[journal_facts("MessageSent")]
async fn send_step() {
    // Implementation
}
}

Annotations generate guard chain invocations. The Telltale compiler verifies annotation consistency.

User Flow Harness

This document defines the harness contract for parity-critical user flows. It supplements Testing Guide. Crate placement follows Project Structure.

1. Purpose

aura-harness is the multi-instance orchestration crate for end-to-end Aura validation. It starts local, browser, and SSH-backed instances. It runs shared scenarios against real frontends instead of renderer-specific scripts.

The default correctness lane is the real runtime with real TUI or web surfaces. The harness is not a replacement for the simulator, Quint, or unit-level validation. Those systems provide supporting evidence and alternate execution environments.

2. Execution Lanes

The harness defines two execution lanes:

  • Shared semantic lane: submits typed intent commands and waits on typed semantic contracts.
  • Frontend-conformance lane: validates renderer-specific wiring such as PTY keys, DOM selectors, focus movement, and control bindings.

See Testing Guide for lane selection and execution details.

3. Scenario Sources

The scenario source of truth is aura-app::scenario_contract. Scenario taxonomy:

  • Shared semantic scenarios: typed semantic steps, no execution_mode declaration, no renderer-specific mechanics.
  • Frontend-conformance scenarios: typed UiAction mechanics (key presses, text input, modal dismissal), must declare execution_mode explicitly.
  • Compatibility fixtures: quarantined renderer-mechanic coverage only, must declare execution_mode = "compatibility" or execution_mode = "agent".

Inventoried scenarios are classified in scenarios/harness_inventory.toml as shared, TUI conformance, or web conformance.

See Testing Guide for scenario authoring and governance.

4. Backend Model

The harness defines the following backend interfaces:

  • InstanceBackend: lifecycle, health checks, snapshots, log tails, and basic input.
  • RawUiBackend: renderer-driven actions for conformance coverage.
  • SharedSemanticBackend: shared_projection(), submit_semantic_command(), and projection-event waits. Implemented by LocalPtyBackend and PlaywrightBrowserBackend.
  • SshTunnelBackend: orchestration-only (SSH security defaults and tunnel setup).

See Testing Guide for backend implementation.

5. Observation Model

UiSnapshot is the authoritative observation surface for parity-critical flows. Observation contracts:

  • Snapshots carry ProjectionRevision, quiescence state, selections, lists, operations, toasts, and runtime events.
  • Parity-critical waits bind to typed contracts (readiness, visibility, events, quiescence, operation handles, strictly newer projections). Raw text matching and DOM scraping are diagnostics only.
  • Observation paths are side-effect free. Reads do not repair state or retry hidden actions.

See Testing Guide for observation patterns.

6. Semantic Command Plane

Shared commands are typed IntentAction requests (account creation, device enrollment, contact invitations, channel membership, chat sends). Contracts:

  • Each command returns a typed response with submission metadata and an optional operation handle.
  • Post-action waits require a strictly newer authoritative projection or another declared barrier.
  • Unsupported semantic commands fail closed. No silent fallback to renderer-specific behavior.

See Testing Guide for semantic command usage.

7. Scenario Execution

ScenarioExecutor enforces per-step budgets and an optional global budget. Contracts:

  • Shared scenarios must declare convergence barriers before the next typed intent when the flow requires one.
  • The executor records canonical trace events, state transitions, and step metrics.
  • Frontend-conformance scenarios produce traces and diagnostics but are not the primary parity oracle for shared business flows.

See Testing Guide for scenario execution details.

8. Determinism And Replay

Determinism contracts:

  • Configuration validation is config-first: invalid inputs fail before execution starts.
  • Determinism derives from build_seed_bundle() (run seed, scenario seed, fault seed, per-instance seeds). Event streams use monotonically increasing identifiers.
  • Replay bundles store run config, tool API version, action log, routing metadata, and seed bundle. Deterministic shared flows preserve semantic trace shape under identical inputs.

See Testing Guide for determinism and replay details.

9. Runtime Substrate

The harness supports real and simulator runtime substrates. The real substrate is the default lane. The simulator substrate is an alternate deterministic runtime controller for fault injection and transcript capture.

Simulator substrate runs currently support local instances only. Browser instances are not allowed in simulator mode. Shared user-flow correctness still belongs to the real frontend lane even when simulator support is enabled for controlled experiments.

10. Governance And Policy

Harness governance is typed first. aura-harness exposes governance checks for shared scenario contracts, scenario-shape enforcement, barrier legality, user-flow coverage, UI parity metadata, and wrapper integrity.

The main repository policy entry points are Aura's toolkit/xtask checks exposed through just ci-shared-flow-policy, just ci-user-flow-policy, and just ci-harness-matrix-inventory.

Harness mode may add instrumentation and render-stability hooks. It must not change parity-critical business semantics. Allowlisted exceptions must carry owner, justification, and design-note metadata.

11. Boundaries

aura-harness is tooling. It is not the authority for domain semantics, effect traits, or protocol safety rules. Those contracts remain owned by aura-core, aura-app, and the other runtime and specification crates.

The harness drives instances through process boundaries and typed tool surfaces. It must not mutate protocol state out of band. Shared UX identifiers, parity metadata, and observation shapes remain owned by aura-app::ui_contract.

Ownership Model

This document defines Aura's ownership model for authority, mutation, async lifecycle, and terminal failure behavior across the workspace.

It complements System Architecture for the high-level system view. See Effect System for effect boundaries. See Runtime for lifecycle and supervision. See Project Structure for layer placement.

Overview

Aura uses four ownership categories: Pure, MoveOwned, ActorOwned, and Observed. Every parity-critical subsystem, operation, and mutation surface must fit one of these categories. Bugs of the form "multiple layers own the same truth" are architectural violations.

Categories

Pure

Pure code is deterministic and effect-free. Use it for reducers, validators, state machines, fact interpretation, and typed contracts. Pure code may not own long-lived mutable async state, publish semantic lifecycle, or rely on ambient authority.

MoveOwned

MoveOwned code represents exclusive authority through consumed values. Use it for operation handles, owner tokens, delegation records, session handoff objects, and stale-owner invalidation boundaries.

Ownership transfer must consume a handoff object or owner token. Stale holders must become invalid by construction. Direct owner-field rewrites are forbidden where a transfer object is required.

ActorOwned

ActorOwned code owns long-lived mutable async state under one live task. Use it for runtime services, supervisors, maintenance loops, readiness coordinators, lifecycle coordinators, and command ingress loops.

There must be exactly one live owner for the mutable state domain. Mutation happens through typed ingress, not shared mutable access. Long-lived background work must be supervised. Owner death must lead to explicit terminal state, failure, or shutdown.

In practice, Aura's production ActorOwned runtime path is aura-agent:

  • bounded ingress is declared with aura-core::BoundedActorIngress
  • long-lived service ownership is internal to runtime service modules
  • shared actor handles/mailboxes are crate-private runtime internals, not a public API for higher layers
  • raw spawn lives only inside the sanctioned supervision implementation
  • public runtime facades must consume shared runtime-owned supervisors and ceremony runners. They may not allocate private ownership roots as a convenience constructor.
  • service health must reflect degraded obligation progress explicitly. "Task exists" is not a sufficient health contract when required maintenance work is failing.

Observed

Observed code reads and presents authoritative state but does not own it. Use it for projections, UI rendering, harness reads, diagnostics, and reporting.

Observed code may submit typed commands to owner surfaces. It may not author semantic lifecycle or readiness truth. It may not repair ownership mistakes by mutating product state.

Observed/reactive code also may not synthesize canonical entity metadata from weaker signals. If a channel, invitation, or similar parity-critical entity requires canonical name/context materialization, one explicit owned path must materialize it end to end. Membership events, UI projections, or view-local fallbacks may enrich an already-materialized entity, but they may not create or repair the canonical entity shape.

Capability-Gated Authority

The ownership model builds on Aura's existing capability system. Parity-critical mutation and publication should be capability-gated.

Semantic lifecycle publication requires an appropriate capability. Readiness publication requires a coordinator-owned capability. Ownership transfer requires a transfer capability or sanctioned handoff token. Actor ingress that mutates owned state requires the actor's command boundary.

The goal is to make incorrect authority structurally hard to express. Code should not be able to publish semantic truth merely because it can call a helper.

The same fail-closed rule applies to authoritative signal reads in semantic owner code. Missing or unavailable authoritative state must surface as explicit failure or degraded state, not Default::default() business truth.

Usage Examples

When To Use Pure

Use Pure when the code interprets values rather than owning authority or async lifecycle.

#![allow(unused)]
fn main() {
pub fn reduce_membership(
    current: MembershipState,
    fact: MembershipFact,
) -> MembershipState {
    match fact {
        MembershipFact::Joined { member } => current.with_member(member),
        MembershipFact::Left { member } => current.without_member(member),
    }
}
}

This stays Pure because it consumes and returns values, owns no long-lived state, and does not publish lifecycle directly.

When To Use MoveOwned

Use MoveOwned when stale access must become invalid after handoff.

#![allow(unused)]
fn main() {
use aura_core::{
    issue_owner_token, OwnershipTransferCapability,
};

let capability = OwnershipTransferCapability::new("ownership:transfer");
let token = issue_owner_token(&capability, "invite-op-7", "channel:alpha");
let transfer = token.handoff("invite-coordinator");
}

The original token is consumed by handoff. Trying to act through the old owner is a compile-time error.

Typed ownership capabilities from the same wrapper family can also be issued onto Aura's existing Biscuit path via ownership_capability_token_request_for(...), without first down-converting them to raw CapabilityKey values. Lower layers should not expose parallel raw ownership-capability request helpers once a typed wrapper family exists.

When To Use Capability Tokens

Use capability wrappers whenever parity-critical code needs authority to author semantic truth.

#![allow(unused)]
fn main() {
use aura_core::{
    issue_operation_context, AuthorizedProgressPublication,
    AuthorizedReadinessPublication, LifecyclePublicationCapability,
    OperationContextCapability, OperationTimeoutBudget, OwnedShutdownToken,
    OwnerEpoch, PublicationSequence, ReadinessPublicationCapability,
    TraceContext,
};

let context_capability = OperationContextCapability::new("semantic:context");
let lifecycle_capability = LifecyclePublicationCapability::new("semantic:lifecycle");

let mut ctx = issue_operation_context(
    &context_capability,
    "send_message",
    "send_message-7",
    OwnerEpoch::new(0),
    PublicationSequence::new(0),
    OperationTimeoutBudget::deferred_local_policy(),
    OwnedShutdownToken::detached(),
    TraceContext::detached(),
);

let update: AuthorizedProgressPublication<_, _, _, _> =
    ctx.publish_progress(&lifecycle_capability, "waiting");
let terminal = ctx
    .begin_terminal::<(), &'static str>(&lifecycle_capability)
    .fail("timeout");
}

Context minting and publication both require capability-shaped inputs. Arbitrary helper code cannot fabricate an owner context or publish lifecycle by accident.

The same rule applies to readiness and actor-ingress mutation. Higher layers should prefer AuthorizedReadinessPublication<T> and AuthorizedActorIngressMutation<T> over raw capability arguments when they need to move parity-critical authority across API boundaries.

When To Use ActorOwned

Use ActorOwned when one live task must own mutable async state and terminal responsibility.

#![allow(unused)]
fn main() {
struct ChannelInviteCoordinator {
    pending: HashMap<InviteId, InviteState>,
    rx: mpsc::Receiver<InviteCommand>,
}

impl ChannelInviteCoordinator {
    async fn run(mut self) {
        while let Some(command) = self.rx.recv().await {
            self.apply(command).await;
        }
    }
}
}

There is one live owner of pending. Mutation happens through typed ingress. Owner drop is a lifecycle event that must be surfaced explicitly.

Selection Heuristics

Choose Pure first if the logic can be expressed as value-in/value-out. Choose MoveOwned when the hard problem is exclusive authority or stale-holder invalidation. Choose ActorOwned when the hard problem is long-lived mutable async state under one live owner. Choose Observed only for read-only presentation or diagnostics.

Anti-patterns to avoid:

  • a shell callback publishing semantic success (should be Observed)
  • shared mutable Arc<Mutex<_>> state spread across tasks (should be ActorOwned)
  • rewriting an owner field in place after delegation (should be MoveOwned)
  • reducers that call time/network/storage directly (no longer Pure)
  • reactive/view code inventing channel names from raw ids or membership events instead of consuming an owned canonical materialization path
  • runtime or workflow code mining pending invitations, optimistic sketches, or cross-context routing cache entries to repair a missing authoritative binding/context after handoff

Contributor Requirement

New or materially changed parity-critical modules must declare their ownership category in the crate ARCHITECTURE.md.

That declaration must name:

  • the ownership category (Pure, MoveOwned, ActorOwned, or Observed)
  • the authoritative owner for terminal lifecycle if the surface is async
  • the capability-gated mutation/publication points
  • the local timeout/backoff owner if deadlines or retries are involved

The ownership declaration is part of the change, not optional follow-up documentation.

Terminality

Every parity-critical operation must have typed terminal behavior. Direct boundaries use Result<T, E>. Long-running operations use typed lifecycle phases: Submitted, zero or more intermediate phases, then Succeeded, Failed(E), or Cancelled.

Runtime-owned bridge APIs must follow the same rule. If a public runtime call can distinguish no progress, started, already running, processed, degraded, or mutated, it must return a typed outcome instead of collapsing those states into Result<(), E>.

Every submitted operation must reach a terminal state. Terminal states may not regress. Owner drop must publish failure or cancellation explicitly.

Terminality alone is not strong enough. Aura also requires owner-internal liveness: a legal owner may not contain unbounded internal work that can keep an operation in OperationState::Submitting forever. If the owner can hang indefinitely while still technically being the "right" owner, the architecture is incomplete.

Timeout-triggered returns do not relax this rule. A timeout may fail an operation, but it may not silently convert an ambiguous owner-internal state into nominal success. Typed NoProgress or Degraded outcomes are preferable to unit success when runtime-owned work can observe those distinctions.

Semantic Owner Protocol

Parity-critical semantic operations must follow one protocol from submission to terminal publication.

  1. A frontend or harness may create a local submission record.
  2. If app/runtime workflow ownership is required, the frontend/harness must hand ownership off immediately before the first awaited workflow step.
  3. After handoff, only the canonical owner may publish non-local lifecycle.
  4. The canonical owner must publish a terminal state before any best-effort repair, warm-up, or post-success reconciliation that is allowed to fail.
  5. Best-effort follow-up work must never be required for the operation to stop being OperationState::Submitting.

This forbids the bug shape where:

  • the callback is the "temporary" owner
  • the app workflow is the "real" owner
  • both are structurally legal
  • but the callback keeps local OperationState::Submitting state alive while the workflow has already reached terminal publication or is blocked in best-effort work

The handoff boundary must therefore be before awaited workflow execution, not after it.

Macro declaration rule:

  • #[aura_macros::semantic_owner(owner = "...", terminal = "...", category = "move_owned")] is the sanctioned declaration surface for move-owned semantic workflow boundaries
  • semantic owners must also declare:
    • postcondition = "..." for the authoritative state guaranteed by success
    • depends_on = "a,b,..." for prerequisite readiness edges
    • child_ops = "a,b,..." for sanctioned semantically required child work
    • proof = Type whenever success is only valid if a typed postcondition witness has been established
  • #[aura_macros::capability_boundary(category = "capability_gated", capability = "...")] is the sanctioned declaration surface for capability-bearing mint/publication helpers
  • #[aura_macros::actor_owned(owner = "...", domain = "...", gate = "...", command = Type, capacity = N, category = "actor_owned")] is the sanctioned declaration surface for long-lived actor-owned async domains
  • #[aura_macros::ownership_lifecycle(initial = "...", ordered = "...", terminals = "...")] is the sanctioned declaration surface for small parity-critical lifecycle enums
  • #[aura_macros::authoritative_source(kind = "...")] is the sanctioned declaration surface for helpers that mint or read authoritative semantic truth. Valid kinds are runtime, signal, app_core, and proof_issuer.
  • #[aura_macros::strong_reference(domain = "...")] is the sanctioned declaration surface for canonical strong-reference carriers. Valid domains are channel, invitation, ceremony, home, and home_scope.
  • #[aura_macros::weak_identifier(domain = "...")] is the sanctioned declaration surface for weak identifier carriers that must not be upgraded into strong bindings without an explicit owner path

Reactive Contract

Parity-critical reactive consumers must rely on one explicit subscription contract.

  • subscription to an unregistered signal is a typed failure, not an empty or inert stream
  • there is no implicit registration wait for parity-critical consumers
  • if a subscriber lags behind the broadcast buffer, the handler logs the lag and resumes from a newer snapshot
  • parity-critical owners may not infer replay or lossless history from the reactive layer unless an explicit replay contract exists

This means reactive delivery is a transport for authoritative snapshots, not an alternate owner of semantic truth. Owner code must tolerate "newer snapshot after lag" semantics without silently treating a missed update as "no change."

Enforcement rule:

  • ownership declarations, strong-reference markers, and authoritative-source markers should be enforced first by proc-macro validation, Rust-native lints, and compile-fail tests
  • shell scripts should remain only for integration checks or governance rules that are not realistically provable in types or Rust-native syntax analysis

Reactive/view consumers also may not fabricate canonical entities from partial facts. For example, a membership fact may update membership for a known channel, but it may not create a channel with channel_id.to_string() as a fallback name. Canonical entity materialization must come from one owned path that already carries the authoritative metadata.

Owner Body Rules

Once a function is designated as a semantic owner, its body is constrained more strictly than ordinary async code.

Allowed:

  • bounded awaits through approved timeout-budget helpers
  • retries through approved retry-policy helpers
  • publication through capability-gated lifecycle/readiness helpers
  • explicit handoff to another sanctioned owner

Forbidden:

  • raw open-ended .await on runtime/effect calls
  • awaiting best-effort network or transport side effects before terminal publication
  • retaining a frontend-local owner while awaiting an app-owned workflow
  • detached work that still owns terminal responsibility
  • direct spawn from a semantic owner except through an explicitly declared child-operation surface
  • silently discarding parity-critical results or errors
  • ad hoc local retries, sleeps, or polling loops

If a semantic owner needs long-lived convergence, that convergence must be owned by a dedicated ActorOwned coordinator and expressed as typed readiness or typed terminal lifecycle, not as an unbounded await hidden inside a helper.

Typed Success Proofs

Declared postconditions are not documentation-only. For parity-critical operation families, Succeeded should be tied to an opaque typed proof surface whenever the authoritative postcondition is stronger than "the function returned successfully".

The required pattern is:

  • capability-gated code performs the authoritative mutation, readiness check, or materialization step
  • that sanctioned helper mints an opaque proof witness for the declared postcondition, such as a channel-membership-ready proof
  • the semantic owner publishes terminal success by consuming that proof through the canonical success path

This is intentionally different from a capability token:

  • a capability answers who is allowed to act
  • a proof answers what has become true

Proofs must therefore be minted by capability-gated code, but the proof itself must not be the authority token.

The canonical direction is:

  • #[aura_macros::semantic_owner(..., postcondition = "...", proof = Type)]
  • owner success goes through publish_success_with(proof) or the equivalent canonical proof-bearing success helper
  • plain publish_phase(Succeeded) is forbidden for proof-bound owners

Proof constructors stay private. External code must not be able to forge a proof witness, and the compile-fail suites should prove that boundary.

Best-Effort Separation

Aura distinguishes terminally required work from best-effort work.

Terminally required work:

  • determines whether the operation is Succeeded, Failed, or Cancelled
  • may block terminal publication only through bounded waits owned by the canonical semantic owner

Best-effort work:

  • may improve projection quality, connectivity, warming, discovery, or local convenience
  • must run only after terminal publication, or under a different owner with its own explicit lifecycle
  • must not prevent the submitted operation from leaving OperationState::Submitting
  • must not directly publish parity-critical lifecycle or readiness
  • must not directly perform parity-critical mutation such as committing facts, materializing authoritative state, registering required ownership, or other work that a later parity-critical operation depends on
  • must not use the best_effort_* naming surface unless it actually obeys the best-effort contract above. Aura treats that prefix as a reserved ownership boundary and lints it accordingly even when the helper omits an explicit #[best_effort_boundary].

If a step mutates authoritative state required by a later semantic operation, it is not best-effort. It belongs either:

  • inside the canonical semantic owner before Succeeded, or
  • inside a distinct owned child operation with its own explicit lifecycle and dependency edge

This rule is stronger than "use timeouts". A bounded best-effort step is still architecturally wrong if it owns the primary operation's terminal state.
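
The separation can be sketched as follows: the terminal state derives only from the required work, and the best-effort step runs after terminal publication and may fail freely. Function names are illustrative:

```rust
#[derive(Debug, PartialEq)]
enum Terminal {
    Succeeded,
    Failed(String),
}

// Stand-in for the terminally required mutation.
fn required_work() -> Result<(), String> {
    Ok(())
}

// Stand-in for warming, discovery, or projection improvement.
fn best_effort_warmup() -> Result<(), String> {
    Err("peer unreachable".to_string())
}

fn run_operation() -> Terminal {
    // Terminal publication comes first, from required work only.
    let terminal = match required_work() {
        Ok(()) => Terminal::Succeeded,
        Err(e) => Terminal::Failed(e),
    };
    // Best-effort work runs after terminal publication; its failure is
    // logged (here: discarded) and never flows into the terminal state.
    let _ = best_effort_warmup();
    terminal
}

fn main() {
    // The warmup failed, but the operation still settles as Succeeded.
    assert_eq!(run_operation(), Terminal::Succeeded);
}
```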

Correct-By-Construction Requirements

For parity-critical operation families, "correct by construction" means:

  • submission uses one canonical typed owner wrapper
  • owner handoff uses one canonical consumed transfer API
  • terminal publication uses one capability-gated API family
  • success implies one declared authoritative postcondition
  • proof-bound success consumes one opaque typed proof minted by sanctioned capability-gated code
  • bounded awaits use one approved timeout-budget helper family
  • retries use one approved retry-policy helper family
  • best-effort work uses one explicit helper family that cannot publish or delay primary terminal state
  • semantic owners do not spawn except through declared child-operation APIs
  • parity-critical results are not ignored or downgraded to logging-only paths

Enforcement Ratchet

Aura treats ownership enforcement as a ratchet, not a static checklist.

The desired order of strength is:

  1. private constructors and opaque types
  2. capability-gated APIs
  3. canonical owner wrappers/macros
  4. AST-backed lints
  5. compile-fail tests
  6. invariant and concurrency tests
  7. thin CI shell wrappers that call those stronger checks

Shell scripts remain useful as workflow glue, but they are the weakest layer. If a policy matters for parity-critical correctness, the long-term goal is to encode it in types, macros, or AST-backed analysis rather than rely on grep.

Required Invariants For Parity-Critical Operations

Every parity-critical operation family should have invariant tests for all of the following:

  • owner drop forces Failed or Cancelled
  • terminal state cannot regress
  • stale owner or stale handle cannot advance state
  • canonical owner publishes terminal state within a bounded budget
  • Succeeded implies the semantic owner's declared postcondition holds
  • proof-bound Succeeded consumes the correct typed witness for that declared postcondition
  • best-effort failure cannot block terminal publication
  • no later parity-critical operation can depend on hidden best-effort work to make a successful operation "actually true"
  • frontend-local submission state cannot mask authoritative terminal state after handoff
  • older authoritative instances cannot overwrite newer local submissions

Frontend Handoff Rule

Layer 7 frontends are primarily Observed, but they are allowed to own a very small local submission window. That window is subject to a strict rule:

  • if the frontend owns terminal publication, it must settle locally
  • if the app/runtime owns terminal publication, the frontend must relinquish local ownership before awaiting the app/runtime workflow

There is no supported middle state where the frontend keeps a local submitting record "just in case" while the canonical workflow runs elsewhere.

Time And Ownership

Timeouts and backoffs are part of ownership, not incidental implementation detail.

For every parity-critical async path, the architecture must identify:

  • who owns the deadline
  • who owns retry policy
  • whether the wait is terminally required or best-effort
  • what happens when the budget is exhausted

Wall-clock time remains a local choice in Aura's time model, but timeout policy ownership is not a local choice. A path that can wait forever without a declared owner and terminal consequence is an ownership violation.

Layer Guidance

Layer 1 (aura-core) defines the shared ownership vocabulary: primitives, typed lifecycle helpers, and capability boundaries.

Layer 2 (domain crates) defaults to Pure. Use MoveOwned only when transfer semantics are part of the domain itself. Domain crates should not silently grow runtime-style async ownership.

Layer 3 (implementation crates) defaults to stateless handlers. Avoid long-lived mutable ownership except for narrow adapter internals.

Layer 4 (orchestration) uses MoveOwned for delegation, session ownership, and handoff. Use ActorOwned for long-lived orchestration coordinators only.

Layer 5 (feature crates) should have single-owner semantic lifecycle. Wrappers, views, and shells must not co-author stronger semantics than canonical workflows.

Layer 6 (runtime) is the primary ActorOwned layer. Runtime services, caches, maintenance loops, and supervisors should be actor-owned. Ownership transfer still uses MoveOwned surfaces.

Layer 7 (interface) is primarily Observed. Frontends may provide command ingress mechanics but do not own parity-critical semantic truth.

Layer 8 (testing) may simulate actors and capabilities. Parity-critical lanes must observe and submit through the same ownership boundaries as production code.

Workspace Ownership Inventory

This inventory covers every Rust crate under crates/. It is the workspace-level baseline. Detailed per-module inventories belong in crate ARCHITECTURE.md files.

Layer Summary

  • Layer 1 (aura-core) is primarily Pure and defines the canonical ActorOwned, MoveOwned, and capability-gated vocabulary. It does not own long-lived mutable runtime state.
  • Layer 2 crates stay primarily Pure. They may expose MoveOwned records when transfer semantics are part of the domain model, but they do not grow runtime-style actor ownership.
  • Layer 3 crates stay Pure and infrastructural. They implement handlers and composition without becoming semantic owners of parity-critical lifecycle.
  • Layer 4 crates commonly mix ActorOwned coordinators and MoveOwned transport/protocol surfaces. Coordination publication and ingress remain capability-gated.
  • Layer 5 crates mix Pure, MoveOwned, and narrow ActorOwned protocol coordinators. Ceremony, invitation, recovery, sync, and rendezvous flows use typed handles, typed terminal states, and coordinator-owned publication.
  • Layer 6 splits strictly:
    • aura-agent is the production ActorOwned runtime and the only sanctioned production structured-concurrency path
    • aura-app is primarily Pure plus MoveOwned and owns authoritative semantic lifecycle/readiness publication for shared semantic flows
    • aura-simulator is ActorOwned for simulation coordination and Observed for test-facing exports
  • Layer 7 crates are strict consumers:
    • aura-terminal and aura-web are Observed with narrow ingress/bridge ownership only
    • aura-ui is Observed
    • frontend-local parity-critical lifecycle ownership is not allowed outside the sanctioned local-terminal/handoff boundary
  • Layer 8 crates are primarily Observed. Test-only actor helpers are allowed where they mirror production owner boundaries rather than inventing a separate semantic model.

Each crate ARCHITECTURE.md must classify its parity-critical modules, identify actor-owned domains, name consumed move-owned surfaces, and list the capability-gated mutation/publication points it exposes.

Enforcement

The ownership model is enforced in layers.

Types and private constructors provide the first line of defense. Capability-gated mutation and publication APIs form the second layer. Canonical owner wrappers and macros provide the third. AST-backed checks, compile-fail tests, and invariant tests follow, with thin scripts/check/*.sh and just ci-* wrappers completing CI enforcement.

Enforcement split:

  • Types, private constructors, sealed traits, and consumed ownership wrappers are the primary defense.
  • Ownership and lifecycle declaration macros in aura-macros force explicit boundary classification.
  • trybuild compile-fail suites in aura-core and aura-app prove that forbidden ownership/publication patterns do not compile.
  • Parity-critical APIs must require the strongest available typed input. Once authoritative context exists, raw-id helper calls, resolve_* re-resolution, and *_or_fallback repair paths are ownership violations.
  • Rust-native lint binaries in aura-macros provide syntax-level fences for:
    • proof-bound semantic owners using plain Succeeded publication instead of proof-bearing success
    • semantic owners publishing success and then launching a detached continuation
    • semantic owners spawning outside explicit child-operation ownership
    • silent discard of parity-critical results and errors
    • best-effort boundaries performing direct parity-critical mutation or publication
    • helpers named best_effort_* that are best-effort in name only rather than real best-effort boundaries
    • raw spawn / raw task-handle escape hatches
    • frontend semantic handoff bypasses
    • authoritative-ref downgrade via raw-id re-resolution inside authoritative-only workflow slices
    • raw timeout/time-domain usage in protected modules
  • Thin shell wrappers under scripts/check/ remain only as CI glue or where the invariant is inherently integration-level rather than compile-time.
  • Governance-only checks stay clearly separate from code-correctness enforcement:
    • just ci-ownership-categories
    • just ci-harness-actor-vs-move-ownership
    • Aura toolkit/xtask user-flow guidance sync check
  • Runtime/integration checks remain appropriate for properties such as runtime shutdown ordering and instrumentation schema discipline because those are orchestration-level invariants, not just API-shape rules.

Primary enforcement belongs in typed ownership primitives and proc-macro declarations. For the frontend/harness stack this means:

  • HarnessUiOperationHandle and UiOperationHandle are constructor/accessor surfaces, not public-field records
  • readiness refresh helpers stay private to aura-app::workflows
  • the local-terminal and workflow-handoff owner wrappers are the sanctioned frontend submission windows
  • UiTaskOwner and WebTaskOwner are the only sanctioned frontend task-ownership surfaces

Shell checks such as semantic-owner bounded-await and frontend/harness boundary wrappers are secondary fences for narrow escape hatches, not the source of truth for semantic correctness.

Phase-6 rollup rule:

  1. extend typed ownership primitives or proc-macro declarations first
  2. add or update compile-fail coverage for the newly-closed misuse shape
  3. place syntax-owned enforcement in aura-macros lint binaries and run it through just lint-arch-syntax or just ci-ownership-policy
  4. keep shell backstops only for integration-governance checks that cannot be proved in types or Rust-native syntax analysis
  5. delete the legacy helper, compatibility wrapper, migration shim, or stale test that the stronger contract replaced in the same milestone

Do not keep dormant fallback modules, compatibility constructors, or "temporary" legacy upgrade paths after the strong contract exists. If a rule can be enforced by types, macros, compile-fail suites, or Rust-native lints, that enforcement is primary and the shell layer stays thin.

For proof-bearing postconditions specifically, the desired enforcement order is:

  1. private proof constructors
  2. capability-gated proof minting helpers
  3. #[semantic_owner(..., proof = Type)]
  4. compile-fail tests proving proofs cannot be forged or minted from the wrong module
  5. AST-backed linting that rejects plain Succeeded publication in proof-bound owners
  6. invariant tests proving the proof-minting helper is semantically honest

Scripts alone are not sufficient. The API must make the wrong pattern hard or impossible first.
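
Steps 1 and 2 of that order can be sketched in a few lines. The module and type names here are illustrative, not the canonical Aura API; the point is that the witness has no public constructor:

```rust
// Sketch of the private-proof pattern. Names are hypothetical.
mod recovery {
    // The proof witness has no public constructor: code outside this
    // module cannot forge it.
    pub struct RecoveryCommitted {
        _private: (),
    }

    // Capability-gated minting helper: the only way to obtain the witness.
    // A real version would check the declared postcondition, not a bool.
    pub fn commit_recovery(postcondition_held: bool) -> Option<RecoveryCommitted> {
        if postcondition_held {
            Some(RecoveryCommitted { _private: () })
        } else {
            None
        }
    }

    // Proof-bound success: publishing Succeeded consumes the typed witness,
    // so plain success publication without minting does not compile.
    pub enum Terminal {
        Succeeded(RecoveryCommitted),
        Failed,
    }
}

fn main() {
    // `recovery::RecoveryCommitted { _private: () }` would not compile here:
    // the field is private, so success must route through the minting helper.
    let witness = recovery::commit_recovery(true).expect("postcondition held");
    let terminal = recovery::Terminal::Succeeded(witness);
    assert!(matches!(terminal, recovery::Terminal::Succeeded(_)));
}
```

The remaining steps (macro declarations, compile-fail suites, lints, invariant tests) then defend this shape rather than substitute for it.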

Service-Family Enforcement Map

For new Establish, Move, or Hold surfaces, keep the enforcement split explicit:

  • type-enforced:
    • runtime-local policy stays inside runtime-owned services
    • strongest typed reference continues after authoritative context exists
    • capability-gated mutation/publication boundaries
  • proc-macro declaration-enforced:
    • #[service_surface(...)]
    • #[actor_owned(...)], #[semantic_owner(...)], or the narrower ownership declaration that matches the boundary
  • compile-fail-enforced:
    • service-surface misuse and private-constructor misuse in trybuild suites
    • stale-owner / wrong-capability / wrong-layer misuse when the API shape can reject it
  • lint-enforced:
    • aura-macros ownership and architecture lint binaries run through just ci-ownership-policy and just lint-arch-syntax
  • script-enforced:
    • thin repo-wide integration/governance gates such as just ci-service-surface-policy, just ci-service-registry-ownership, just ci-harness-ownership-policy, and just ci-adaptive-privacy-tuning

The default contributor path for service-family boundary work is:

  1. just lint-arch-syntax
  2. just ci-ownership-policy
  3. just check-arch
  4. just ci-adaptive-privacy-tuning when the change affects adaptive privacy policy, simulator evidence, or control-plane parity artifacts

Same-Change Checklist

Any new service-family composition or service-surface type must include, in the same change:

  1. the declaration macros for the boundary and owner category
  2. typed/runtime-local state placement that keeps shared truth separate from local policy
  3. compile-fail or invariant coverage for the strongest misuse shape the API can reject
  4. lint or script wiring for the remaining syntax/integration boundary
  5. doc updates in the affected crate ARCHITECTURE.md and any authoritative guide that owns the contract
  6. removal of the superseded helper, adapter, allowlist, compatibility branch, or migration note

If item 6 cannot be completed immediately, record the owner and explicit removal condition in the active migration inventory rather than leaving a silent compatibility path behind.

Review Checklist

When adding or modifying a parity-critical path, ask these questions:

  • What category is this module or subsystem?
  • Who is the single live owner of mutable async state?
  • How is authority transferred?
  • What capability authorizes mutation or publication?
  • What is the typed terminal success/failure contract?
  • What authoritative postcondition does Succeeded actually guarantee?
  • Does success require a typed proof witness, and where is that proof minted?
  • Where does local submission ownership end and canonical workflow ownership begin?
  • Once authoritative context exists, which strong typed reference carries it through the rest of the flow?
  • Could any later helper silently downgrade from that strong reference back to raw-id lookup, fallback, or re-resolution?
  • Which awaits are terminally required, and which are best-effort only?
  • Is any later parity-critical step relying on hidden best-effort follow-up?
  • What bounded budget owns each required wait and retry?

If these answers are unclear, the design is not complete enough.

Network Anonymity

This document defines Aura's network privacy and network anonymity model. It specifies the route-layer cryptographic objects, bootstrap re-entry records, hop processing rules, and reply-block semantics used by adaptive privacy routing.

This document complements Transport and Information Flow, Rendezvous Architecture, Relational Contexts, and Social Architecture. Those documents define adjacent-peer transport, context-scoped discovery, relational trust facts, and social provisioning. This document defines the anonymous path layer that sits above those surfaces.

1. Scope

Aura provides two distinct network protection layers:

  1. Link encryption protects one adjacent transport hop.
  2. Path encryption protects the anonymous multi-hop route object carried across several adjacent hops.

Link encryption and path encryption solve different problems. Link encryption hides packet contents from the local network and from passive observers on one transport edge. Path encryption hides deeper route structure and reply-path structure from intermediate forwarding hops.

This document does not define application payload encryption. Application and context semantics remain protected by the existing Aura context and channel model.

2. Direct-Channel Baseline

Aura already uses adjacent-peer secure channels for direct transport. The current baseline is the Noise IKpsk2 25519 ChaChaPoly BLAKE2s pattern through NoiseEffects with the snow implementation.

The direct-channel baseline remains authoritative for adjacent-peer secure channels. Anonymous path routing does not replace that layer. Each adjacent hop on an anonymous route still runs over the existing link-protected channel model from Transport and Information Flow.

3. Network Privacy Goals

Aura's route layer has the following goals:

  • prevent an intermediate hop from reading deeper route state
  • prevent an intermediate hop from deriving the final destination from its local peel result alone
  • allow bounded stale-node re-entry without a singleton bootstrap service
  • keep bootstrap hints signed, expiring, replay-bounded, and scope-limited
  • keep reply paths typed and accountability-aware

Aura's route layer has the following non-goals:

  • defeat a global passive observer
  • create a globally enumerable neighborhood graph
  • create a canonical shared Web-of-Trust topology map

4. Route-Layer Construction

Aura adopts a route-layer construction based on Curve25519, Aura's centralized KDF, and ChaCha20-Poly1305.

The route-layer construction uses the following rules:

  1. Each anonymous path setup flow creates a fresh route identifier and fresh ephemeral route secret material.
  2. Each hop derives a forward hop key stream and a backward hop key stream from the route secret material through Aura's centralized KDF.
  3. Each hop encrypts or decrypts only its own layer with ChaCha20-Poly1305.
  4. Each hop receives enough authenticated metadata to identify the next processing action, but not enough to reconstruct deeper route state.

Aura uses SURB-like reply blocks as Aura-native typed objects with explicit expiry, scope, and accountability semantics.
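
Rule 2 can be sketched as label-separated derivation. The function below uses std's DefaultHasher purely as a stand-in for Aura's centralized KDF; it is NOT a cryptographic KDF, and the names are illustrative:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in for Aura's centralized KDF. DefaultHasher is NOT a real KDF;
// it only illustrates the derivation shape: route secret + direction
// label + hop position in, per-hop key material out.
fn kdf(route_secret: &[u8], direction: &str, hop: u8) -> u64 {
    let mut h = DefaultHasher::new();
    route_secret.hash(&mut h);
    direction.hash(&mut h);
    hop.hash(&mut h);
    h.finish()
}

fn main() {
    // Rule 1: fresh ephemeral secret material per path setup.
    let route_secret = b"fresh-ephemeral-route-secret";

    // Rule 2: each hop gets a forward stream and a backward stream.
    let fwd0 = kdf(route_secret, "forward", 0);
    let bwd0 = kdf(route_secret, "backward", 0);
    let fwd1 = kdf(route_secret, "forward", 1);

    // Distinct labels and hop positions yield distinct key material, so a
    // hop holding its own keys cannot reconstruct a deeper hop's layer
    // (rule 4). Each hop then peels only its own layer (rule 3).
    assert_ne!(fwd0, bwd0);
    assert_ne!(fwd0, fwd1);
}
```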

4.1 Deployment Model

The service family model is always active. Establish, Move, and Hold are the normal service vocabulary for path creation, opaque movement, and custody.

LocalRoutingProfile::passthrough() is the pre-privacy routing baseline. It uses mixing depth 0, delay 0, cover rate 0, and path diversity 1. Hold remains active under passthrough because it is an availability service, not a routing-profile knob.

Production privacy uses encrypted path setup and encrypted MoveEnvelope processing with one fixed adaptive policy. Users do not tune this policy. Development and simulation may sweep policy constants, but production nodes use the evidence-backed constants shipped with the build.

The current fixed policy uses path-diversity floor 2, cover floor 2 packets per second, delay gain denominator 3, neighborhood hold retention window 120s, and retrieval-capability rotation beginning 10s before expiry.
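
Collected in one place, those constants look like the following sketch. The struct and field names are illustrative, not the shipped type; the values are the ones stated above:

```rust
// The fixed adaptive policy constants from the text, as a sketch type.
// Struct and field names are hypothetical.
struct AdaptivePrivacyPolicy {
    path_diversity_floor: u32,
    cover_floor_pps: u32,        // cover floor, packets per second
    delay_gain_denominator: u32,
    hold_retention_secs: u64,    // neighborhood hold retention window
    rotation_lead_secs: u64,     // start retrieval-capability rotation early
}

// Evidence-backed constants shipped with the build; production users do
// not tune these.
const FIXED_POLICY: AdaptivePrivacyPolicy = AdaptivePrivacyPolicy {
    path_diversity_floor: 2,
    cover_floor_pps: 2,
    delay_gain_denominator: 3,
    hold_retention_secs: 120,
    rotation_lead_secs: 10,
};

fn main() {
    // Rotation begins this many seconds before capability expiry.
    let expiry = 1_000u64;
    let rotation_start = expiry - FIXED_POLICY.rotation_lead_secs;
    assert_eq!(rotation_start, 990);
    assert_eq!(FIXED_POLICY.path_diversity_floor, 2);
}
```

Development and simulation sweeps would vary these fields; production nodes read them as build constants.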

5. Link Encryption And Path Encryption

Link encryption uses the adjacent-peer secure channel. It protects one hop and one transport edge.

Path encryption uses the route-layer construction from section 4. It protects the route object carried inside the link-protected channel. Intermediate hops remove one route layer and learn only the immediate forwarding decision and the local accountability material for that hop.

The two layers must remain distinct:

  • path-layer keys must not be reused as adjacent-peer channel keys
  • adjacent-peer channel state must not be treated as route-hop state
  • application or context semantic keys must not be treated as route-layer keys
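
Newtype wrappers are the natural way to keep these key families distinct at compile time. A minimal sketch, with illustrative names and placeholder bodies:

```rust
// Distinct newtypes keep the two layers apart at compile time.
// Names are illustrative, not the canonical Aura types.
struct LinkChannelKey([u8; 32]); // adjacent-peer channel keys only
struct PathLayerKey([u8; 32]);   // route-layer peel keys only

fn seal_link_frame(_key: &LinkChannelKey, frame: &[u8]) -> Vec<u8> {
    // Placeholder body: real sealing lives behind the Noise channel.
    frame.to_vec()
}

fn peel_route_layer(_key: &PathLayerKey, layer: &[u8]) -> Vec<u8> {
    // Placeholder body: real peeling uses ChaCha20-Poly1305.
    layer.to_vec()
}

fn main() {
    let link = LinkChannelKey([0u8; 32]);
    let path = PathLayerKey([1u8; 32]);
    let framed = seal_link_frame(&link, b"envelope");
    let peeled = peel_route_layer(&path, &framed);
    // `seal_link_frame(&path, ...)` would not compile: the type system
    // rejects reusing a path-layer key as a channel key, and vice versa.
    assert_eq!(peeled, b"envelope");
}
```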

6. Route-Layer Objects

The route layer uses the following typed objects:

  • BootstrapContactHint
  • NeighborhoodReentryHint
  • bounded bootstrap introduction records
  • EstablishedPath
  • MoveEnvelope
  • typed reply blocks

BootstrapContactHint records a remembered direct contact or prior provider that may help stale-node re-entry. It carries a scope, expiry, freshness data, and signed contact material. It does not represent canonical route truth.

NeighborhoodReentryHint records a board-published re-entry surface. It carries a neighborhood-scoped publication, expiry, replay bound, and signed route-surface material. It does not expose a globally enumerable topology map.

Bounded bootstrap introductions carry explicit introducer identity, introduced authority, scope, expiry, maximum remaining depth, and fan-out limits. They are trust and bootstrap evidence. They are not canonical shared route tiers.

EstablishedPath is the reusable route object consumed by Move. MoveEnvelope is the shared movement envelope family. Encrypted peel processing is the production route-layer behavior and preserves the accountable movement boundary.

7. Bootstrap and Re-Entry Surfaces

Aura supports stale-node re-entry through several overlapping bootstrap surfaces:

  1. remembered direct contacts and prior providers
  2. neighborhood discovery boards
  3. bounded Web-of-Trust bootstrap introductions
  4. rotating bootstrap relays or bridge providers

These surfaces are ordered inputs, not canonical shared route truth. Runtime selection may widen from one surface to the next when prior attempts fail or when freshness decays.

Aura explicitly rejects the following bootstrap designs:

  • a singleton bootstrap authority
  • a globally enumerable neighborhood adjacency map
  • a canonical shared friends-of-friends graph

8. Neighborhood Discovery Boards

Neighborhood discovery boards publish signed, expiring, scope-limited re-entry hints. A board publication must include:

  • the publishing authority
  • the scoped neighborhood or re-entry domain
  • an expiry time
  • a replay-bounded publication identifier
  • the advertised route-layer or move-surface public material

Board contents are advisory. Runtime caches may merge them. Runtime caches must not elevate them into canonical route truth. Runtime caches must not expose a stable global graph projection derived from board contents.

9. Bounded Bootstrap Introductions

Bootstrap introductions are Web-of-Trust evidence used for stale-node re-entry. Each introduction must include:

  • introducer authority
  • introduced authority
  • scope
  • expiry
  • maximum remaining depth
  • maximum fan-out

Introductions are valid only within their declared bounds. Runtime policy may consume them as discovery and permit input. Runtime policy must not publish a canonical shared introduction tier or transitive trust graph.
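
A runtime consuming an introduction would check all declared bounds before using it. The record and helper below are illustrative sketches of that check, not the canonical types:

```rust
// Sketch of a bounded bootstrap introduction and its validity check.
// Field and function names are hypothetical.
struct BootstrapIntroduction {
    introducer: &'static str,
    introduced: &'static str,
    scope: &'static str,
    expires_at: u64,        // expiry, in the consumer's clock domain
    max_remaining_depth: u8,
    max_fan_out: u8,
}

// Valid only within the declared bounds: unexpired, in scope, with
// remaining depth and fan-out budget.
fn usable(intro: &BootstrapIntroduction, now: u64, scope: &str, fan_out_so_far: u8) -> bool {
    now < intro.expires_at
        && intro.scope == scope
        && intro.max_remaining_depth > 0
        && fan_out_so_far < intro.max_fan_out
}

fn main() {
    let intro = BootstrapIntroduction {
        introducer: "authority-a",
        introduced: "authority-b",
        scope: "neighborhood-7",
        expires_at: 500,
        max_remaining_depth: 2,
        max_fan_out: 3,
    };
    assert!(usable(&intro, 100, "neighborhood-7", 0));
    assert!(!usable(&intro, 600, "neighborhood-7", 0)); // expired
    assert!(!usable(&intro, 100, "neighborhood-9", 0)); // wrong scope
    assert!(!usable(&intro, 100, "neighborhood-7", 3)); // fan-out exhausted
}
```

Nothing here publishes the introduction onward; consuming it as local discovery input is the only sanctioned use.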

10. Hop Processing Rules

Every forwarding hop performs the following route-layer steps:

  1. authenticate and decrypt the local hop layer
  2. verify route identifier, hop position, expiry, and replay bound
  3. derive the local forward or backward hop key stream
  4. recover the next forwarding instruction or reply instruction
  5. emit local accountability state and continue on the adjacent secure channel

A hop may learn:

  • that it is on the route
  • the previous hop on the adjacent edge
  • the next hop on the adjacent edge
  • local replay and expiry state

A hop may not learn:

  • the full route
  • deeper hop keys
  • the final destination unless it is the exit hop
  • the full reply path unless it is processing its own reply layer
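
The may-learn list can be enforced by making the peel result carry only that information. A sketch with illustrative names:

```rust
// What a hop may learn, encoded as the only data the peel returns.
// Names are illustrative, not the canonical Aura types.
enum NextAction {
    Forward { next_hop: u64 }, // the adjacent next hop only
    DeliverAsExit,             // destination visible only at the exit hop
    ProcessOwnReplyLayer,      // reply path visible only for the hop's layer
}

struct HopPeelResult {
    action: NextAction,
    replay_ok: bool, // local replay and expiry state
    // Deliberately absent: the full route, deeper hop keys, the final
    // destination, and the full reply path.
}

fn main() {
    let peeled = HopPeelResult {
        action: NextAction::Forward { next_hop: 9 },
        replay_ok: true,
    };
    assert!(peeled.replay_ok);
    match peeled.action {
        NextAction::Forward { next_hop } => assert_eq!(next_hop, 9),
        _ => unreachable!(),
    }
}
```

Because deeper route state is never a field of the result, a forwarding hop has no API through which to ask for it.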

11. Typed Reply Blocks

Aura uses typed reply blocks for backward anonymous delivery. A reply block is an Aura-native object with:

  • scope
  • route binding
  • expiry
  • replay bound
  • backward hop material
  • accountability linkage

Reply blocks are not borrowed Tor SURB packets. They are typed Aura objects that integrate with Aura movement, accountability, and retrieval-capability rotation rules.

Reply blocks must remain distinct from:

  • application message payloads
  • adjacent-peer secure channel state
  • bootstrap trust records

11.1 Movement Scheduling

The runtime schedules protected movement through shared classes rather than separate transport families. Sync-blended traffic may wait for anti-entropy windows. Bounded-deadline replies carry accountability or control traffic with shorter deadlines. Synthetic cover fills the remaining cover floor.

Application traffic and sync-blended retrieval reduce the synthetic-cover gap. Accountability replies are measured separately in the current deployment model. They do not reduce the first-deployment synthetic cover floor.
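
That accounting can be sketched as a small function. The names are assumed for illustration; the behavior mirrors the rule above, where accountability replies are counted separately and do not reduce the floor:

```rust
// Sketch of first-deployment cover-floor accounting. Names are
// hypothetical; rates are packets per second.
fn synthetic_cover_needed(
    cover_floor_pps: f64,
    app_pps: f64,
    sync_blended_pps: f64,
    accountability_pps: f64,
) -> f64 {
    // Application and sync-blended traffic reduce the synthetic gap;
    // accountability replies are measured separately and do not.
    let _ = accountability_pps;
    (cover_floor_pps - app_pps - sync_blended_pps).max(0.0)
}

fn main() {
    // Fixed cover floor of 2 packets per second (section 4.1).
    // Accountability traffic, however heavy, does not close the gap.
    assert_eq!(synthetic_cover_needed(2.0, 0.5, 0.5, 5.0), 1.0);
    // Enough real traffic: no synthetic cover required.
    assert_eq!(synthetic_cover_needed(2.0, 1.5, 1.0, 0.0), 0.0);
}
```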

12. Per-Boundary Leakage

Aura tracks privacy leakage by boundary. The route-layer design assumes leakage cannot be eliminated completely. The design instead constrains what each boundary can learn.

The main boundaries are:

  • external observer of adjacent link traffic
  • intermediate forwarding hop
  • compromised subset of route hops
  • stale-node bootstrap observer

An external observer may see timing, packet count, and adjacent link endpoints. An intermediate hop may see its adjacent predecessor and successor and its local peel result. A stale-node bootstrap observer may see limited board, introduction, or bridge use, but it must not recover a canonical shared topology map from that data alone.

13. Adversary Assumptions

Aura assumes the following adversary model:

  • local passive observers exist
  • some forwarding hops may be compromised
  • bootstrap boards and bridge providers may be observed
  • stale-node re-entry may happen after long offline gaps and physical movement

Aura does not assume a global passive adversary can be defeated. Aura does not assume that service relationships reveal nothing. Aura aims to reduce graph leakage and route leakage under partitioned, socially rooted operation.

14. Construction Rationale

Aura keeps adjacent-peer Noise channels because they already fit the transport boundary and context model. Aura adds a route-layer construction because adjacent-peer channels alone do not hide deeper route structure from intermediate forwarding hops.

Aura chooses Curve25519, a centralized KDF surface, and ChaCha20-Poly1305 because the construction is simple, auditable, and fits Aura's typed route-layer needs. The route layer needs explicit forward and backward hop streams, typed replay bounds, and typed reply blocks. A compact Aura-native construction is easier to align with these requirements than importing a foreign packet format.

15. Required Implementation Boundaries

The implementation must satisfy the following boundaries:

  1. aura-effects owns hop crypto primitives
  2. shared record types for bootstrap hints and re-entry hints remain authoritative typed objects
  3. runtime-owned caches merge bootstrap records locally and expire them locally
  4. MoveEnvelope remains the shared accountable movement boundary
  5. transparent_onion remains a debug and simulation tool only

Production paths use encrypted peel processing. Transparent setup and header inspection objects remain quarantined behind the explicit transparent_onion feature surface and must fail closed in release production builds.

16. Summary

Aura uses existing Noise-based adjacent-peer channels for link encryption and a separate Aura-native route-layer construction for anonymous path encryption. Bootstrap and re-entry use overlapping signed and expiring surfaces instead of a singleton service. Reply blocks remain typed Aura objects. Runtime selection stays local and does not promote bootstrap provenance into canonical shared route truth.

Hello World Guide

This guide gets you running with Aura in 15 minutes. You will build a simple ping-pong protocol, deploy it locally, and interact with it using the CLI.

Setup

Aura uses Nix for reproducible builds. Install Nix with flakes support.

Enter the development environment:

nix develop

This command activates all required tools and dependencies. The environment includes Rust, development tools, and build scripts.

Build the project:

just build

The build compiles all Aura components and generates the CLI binary. This takes a few minutes on the first run.

Creating an Agent

Aura provides platform-specific builder presets for creating agents. The CLI preset is the simplest path for terminal applications.

#![allow(unused)]
fn main() {
use aura_agent::AgentBuilder;

// CLI preset - simplest path for terminal applications
let agent = AgentBuilder::cli()
    .data_dir("~/.aura")
    .testing_mode()
    .build()
    .await?;
}

The CLI preset provides sensible defaults for command-line tools. It uses file-based storage, real cryptographic operations, and TCP transport.

For custom environments that need explicit control over effect handlers, use AgentBuilder::custom() with typestate enforcement. This requires providing all five core effects (crypto, storage, time, random, console) before build() is available.

Platform-specific presets are available for iOS (AgentBuilder::ios()), Android (AgentBuilder::android()), and Web/WASM (AgentBuilder::web()). These require feature flags to enable. See Effects and Handlers Guide for detailed builder examples.

See Project Structure for details on the 8-layer architecture and effect handler organization.

Ownership Declaration Before You Add New Parity-Critical Code

Before adding a new parity-critical module or workflow, declare its ownership category in the crate ARCHITECTURE.md.

Use Pure for reducers, validators, and typed contracts. Use MoveOwned for handles, owner tokens, and ownership transfer. Use ActorOwned for long-lived mutable async state and coordinators. Use Observed for rendering, harness reads, and diagnostics.

Also declare which capability gates parity-critical mutation and publication, which module owns terminal lifecycle, and which timeout and backoff policy the owner consumes. If those points are not explicit, the new module is not ready to land.

Hello World Protocol

Create a simple ping-pong choreography. This protocol demonstrates basic message exchange between two devices.

#![allow(unused)]
fn main() {
use aura_macros::tell;
use aura_core::effects::{ConsoleEffects, NetworkEffects, TimeEffects};
use aura_core::time::PhysicalTime;
use serde::{Serialize, Deserialize};

/// Sealed supertrait for ping-pong effects
pub trait PingPongEffects: ConsoleEffects + NetworkEffects + TimeEffects {}
impl<T> PingPongEffects for T where T: ConsoleEffects + NetworkEffects + TimeEffects {}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Ping {
    pub message: String,
    pub timestamp: PhysicalTime,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Pong {
    pub response: String,
    pub timestamp: PhysicalTime,
}

tell! {
    #[namespace = "hello_world"]
    protocol HelloWorld {
        roles: Alice, Bob;

        Alice[guard_capability = "hello_world:send_ping", flow_cost = 10]
        -> Bob: SendPing(Ping);

        Bob[guard_capability = "hello_world:send_pong", flow_cost = 10, journal_facts = "pong_sent"]
        -> Alice: SendPong(Pong);
    }
}
}

The choreography defines a global protocol. Alice sends a ping to Bob. Bob responds with a pong. Guard capabilities control access and flow costs manage rate limiting. Outside the choreography DSL boundary, first-party Rust code should use typed capability families or capability_name! rather than hand-written capability strings.

Implement the Alice session:

#![allow(unused)]
fn main() {
pub async fn execute_alice_session<E: PingPongEffects>(
    effects: &E,
    ping_message: String,
    bob_device: aura_core::DeviceId,
) -> Result<Pong, HelloWorldError> {
    let ping = Ping {
        message: ping_message,
        timestamp: effects.current_timestamp().await,
    };

    let ping_bytes = serde_json::to_vec(&ping)?;
    effects.send_to_peer(bob_device.into(), ping_bytes).await?;

    let (_peer_id, pong_bytes) = effects.receive().await?;
    let pong: Pong = serde_json::from_slice(&pong_bytes)?;

    Ok(pong)
}
}

Alice serializes the ping message and sends it to Bob. She then waits for Bob's response and deserializes the pong message. See Effect System for details on effect-based execution.
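
The test later in this guide calls execute_bob_session, which is not shown above. The following is an illustrative sketch that mirrors the Alice session, assuming the same PingPongEffects surface and HelloWorldError type; it is not the canonical implementation:

```rust
#![allow(unused)]
fn main() {
pub async fn execute_bob_session<E: PingPongEffects>(
    effects: &E,
    expected_message: String,
) -> Result<(), HelloWorldError> {
    // Receive and deserialize Alice's ping.
    let (peer_id, ping_bytes) = effects.receive().await?;
    let ping: Ping = serde_json::from_slice(&ping_bytes)?;
    debug_assert_eq!(ping.message, expected_message);

    // Build the pong reply and send it back to the sender.
    let pong = Pong {
        response: format!("pong: {}", ping.message),
        timestamp: effects.current_timestamp().await,
    };
    let pong_bytes = serde_json::to_vec(&pong)?;
    effects.send_to_peer(peer_id, pong_bytes).await?;

    Ok(())
}
}
```

Bob echoes the ping message inside his response, which is why the test can assert that the pong contains the original message.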

Local Deployment

Initialize a local Aura account:

just quickstart init

This command creates a 2-of-3 threshold account configuration. The account uses three virtual devices with a threshold of two signatures for operations.

Check account status:

just quickstart status

The status command shows account health, device connectivity, and threshold configuration. All virtual devices should show as connected.

Run quickstart smoke checks:

just quickstart smoke

This command runs a local end-to-end smoke flow (init, status, and threshold-signature checks) across multiple virtual devices.

CLI Interaction

The Aura CLI provides commands for account management and protocol testing. These commands demonstrate core functionality.

View account information:

aura status --verbose

This shows detailed account state including journal facts, capability sets, and trust relationships. The journal contains all distributed state updates.

Run a threshold signature test:

aura threshold-test --message "hello world" --threshold 2

The threshold test coordinates signature generation across virtual devices. Two devices must participate to create a valid signature.

View recent protocol activity:

aura journal-query --limit 10

This command shows recent journal entries created by protocol execution. Each entry represents a state change with cryptographic verification.

Testing Your Protocol

Create a test script for the hello world protocol:

#![allow(unused)]
fn main() {
use aura_macros::aura_test;
use aura_testkit::*;
use aura_agent::runtime::AuraEffectSystem;
use aura_agent::AgentConfig;

#[aura_test]
async fn test_hello_world_protocol() -> aura_core::AuraResult<()> {
    // Create test fixture with automatic tracing
    let fixture = create_test_fixture().await?;

    // Create deterministic test effect systems
    let alice_effects = AuraEffectSystem::simulation_for_named_test_with_salt(
        &AgentConfig::default(),
        "test_hello_world_protocol",
        0,
    )?;
    let bob_effects = AuraEffectSystem::simulation_for_named_test_with_salt(
        &AgentConfig::default(),
        "test_hello_world_protocol",
        1,
    )?;

    // Get device IDs for routing
    let alice_device = fixture.create_device_id();
    let bob_device = fixture.create_device_id();

    let ping_message = "Hello Bob!".to_string();

    // Run protocol sessions concurrently
    let (alice_result, bob_result) = tokio::join!(
        execute_alice_session(&alice_effects, ping_message.clone(), bob_device),
        execute_bob_session(&bob_effects, ping_message.clone())
    );

    assert!(alice_result.is_ok(), "Alice session failed");
    assert!(bob_result.is_ok(), "Bob session failed");

    let pong = alice_result?;
    assert!(pong.response.contains(&ping_message));

    Ok(())
}
}

This test creates deterministic, seeded effect systems for Alice and Bob using simulation_for_named_test_with_salt(...). The identity + salt pair makes failures reproducible. For comprehensive testing approaches, see Testing Guide.

Run the test:

cargo test test_hello_world_protocol

The test validates protocol correctness without requiring network infrastructure. Mock handlers provide deterministic behavior for testing.

Understanding System Invariants

System invariants are defined in System Architecture and Theoretical Model. Key invariants include Charge-Before-Send, CRDT Convergence, Context Isolation, and Secure Channel Lifecycle.

See Project Structure for traceability details. When developing, ensure your protocols respect these invariants to maintain system integrity.

Next Steps

You now have a working Aura development environment. The hello world protocol demonstrates basic choreographic programming concepts.

Continue with Effects and Handlers Guide to learn about effect systems, platform implementation, and handler patterns. Learn choreographic programming in Choreography Guide. For session type theory, see MPST and Choreography.

Explore testing and simulation in Testing Guide and Simulation Guide.

Effects and Handlers Guide

This guide covers how to work with Aura's algebraic effect system. Use it when you need to extend the system at its boundaries: adding handlers, implementing platform support, or creating new effect traits.

For the full effect system specification, see Effect System.

1. Code Location

A critical distinction guides where code belongs in the architecture.

Single-party operations go in aura-effects. These are stateless, context-free handlers that take input and produce output without maintaining state or coordinating with other handlers.

Examples:

  • sign(key, msg) -> Signature - one device, one cryptographic operation
  • store_chunk(id, data) -> Ok(()) - one device, one write
  • RealCryptoHandler - self-contained cryptographic operations

Multi-party coordination goes in aura-protocol. These orchestrate multiple handlers together with stateful, context-specific operations.

Examples:

  • execute_anti_entropy(...) - orchestrates sync across multiple parties
  • CrdtCoordinator - manages state of multiple CRDT handlers
  • GuardChain - coordinates authorization checks across sequential operations

If removing one effect handler requires changing the logic of how other handlers are called (not just removing calls), it belongs in Layer 4 as orchestration.

Decision Matrix

| Pattern | Characteristics | Location |
|---|---|---|
| Single effect trait method | Stateless, single operation | aura-effects |
| Multiple effects/handlers | Stateful, multi-handler | aura-protocol |
| Multi-party coordination | Distributed state, orchestration | aura-protocol |
| Domain types and semantics | Pure logic, no handlers | Domain crate |
| Complete reusable protocol | End-to-end, no UI | Feature crate |
| Handler/protocol assembly | Runtime composition | aura-agent |
| User-facing application | Has main() entry point | aura-terminal |

Boundary Questions

Three questions resolve most placement decisions:

  • Is the code stateless or stateful? Stateless code goes in aura-effects; stateful code goes in aura-protocol.
  • Is it single-party or multi-party? Single-party code goes in aura-effects; multi-party code goes in aura-protocol.
  • Is it context-free or context-specific? Context-free code goes in aura-effects; context-specific code goes in aura-protocol.

2. Effect Handler Pattern

Effect handlers are stateless. Each handler implements one or more effect traits from aura-core. It receives input, performs a single operation, and returns output. No state is maintained between calls.

Production handlers (like RealCryptoHandler) use real libraries. Mock handlers (like MockCryptoHandler in aura-testkit) use deterministic implementations for testing.

See Cryptographic Architecture for cryptographic handler requirements.

Implementing a Handler

Step 1: Define the trait in aura-core.

#![allow(unused)]
fn main() {
#[async_trait]
pub trait MyEffects: Send + Sync {
    async fn my_operation(&self, input: Input) -> Result<Output, EffectError>;
}
}

Step 2: Implement the production handler in aura-effects.

#![allow(unused)]
fn main() {
pub struct RealMyHandler;

#[async_trait]
impl MyEffects for RealMyHandler {
    async fn my_operation(&self, input: Input) -> Result<Output, EffectError> {
        // Implementation using real libraries
    }
}
}

Step 3: Implement the mock handler in aura-testkit.

#![allow(unused)]
fn main() {
pub struct MockMyHandler {
    seed: u64,
}

#[async_trait]
impl MyEffects for MockMyHandler {
    async fn my_operation(&self, input: Input) -> Result<Output, EffectError> {
        // Deterministic implementation for testing
    }
}
}

Adding a Cryptographic Primitive

  1. Define the type in aura-core crypto module
  2. Implement aura-core traits for the type's semantics
  3. Add a single-operation handler in aura-effects that implements the primitive
  4. Use the handler in feature crates or protocols through the effect system
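The four steps can be sketched end to end with a hypothetical digest primitive. The type, trait, and handler names below are illustrative, not the real Aura API, and the stub "digest" is a placeholder XOR fold rather than a real hash, so the example stays self-contained and deterministic:

```rust
// Step 1 (aura-core, illustrative): define the primitive type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Digest(pub [u8; 32]);

// Step 2 (aura-core, illustrative): a trait capturing the primitive's semantics.
pub trait DigestEffects {
    fn digest(&self, data: &[u8]) -> Digest;
}

// Step 3 (aura-effects, illustrative): a single-operation, stateless handler.
// A real handler would delegate to a vetted cryptographic library.
pub struct StubDigestHandler;

impl DigestEffects for StubDigestHandler {
    fn digest(&self, data: &[u8]) -> Digest {
        let mut out = [0u8; 32];
        for (i, b) in data.iter().enumerate() {
            out[i % 32] ^= *b; // placeholder folding, NOT a secure hash
        }
        Digest(out)
    }
}

// Step 4 (feature crate, illustrative): consume the primitive via the trait.
fn main() {
    let handler = StubDigestHandler;
    let a = handler.digest(b"hello");
    let b = handler.digest(b"hello");
    assert_eq!(a, b); // same input, same digest: the handler holds no state
}
```

Because feature crates depend only on the trait, swapping the stub for a production or mock handler requires no changes to calling code.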

3. Platform Implementation

Use the AgentBuilder API to assemble the runtime with appropriate effect handlers for each platform.

Builder Strategies

| Strategy | Use Case | Compile-Time Safety |
|---|---|---|
| Platform preset | Standard platforms (CLI, iOS, Android, Web) | Configuration validation |
| Custom preset | Full control over all effects | Typestate enforcement |
| Effect overrides | Preset with specific customizations | Mixed |

Platform Presets

#![allow(unused)]
fn main() {
// CLI
let agent = AgentBuilder::cli()
    .data_dir("~/.aura")
    .build()
    .await?;

// iOS (requires --features ios)
let agent = AgentBuilder::ios()
    .app_group("group.com.example.aura")
    .keychain_access_group("com.example.aura")
    .build()
    .await?;

// Android (requires --features android)
let agent = AgentBuilder::android()
    .application_id("com.example.aura")
    .use_strongbox(true)
    .build()
    .await?;

// Web/WASM (requires --features web)
let agent = AgentBuilder::web()
    .storage_prefix("aura_")
    .build()
    .await?;
}

Custom Preset with Typestate

#![allow(unused)]
fn main() {
let agent = AgentBuilder::custom()
    .with_crypto(Arc::new(RealCryptoHandler::new()))
    .with_storage(Arc::new(FilesystemStorageHandler::new("~/.aura".into())))
    .with_time(Arc::new(PhysicalTimeHandler::new()))
    .with_random(Arc::new(RealRandomHandler::new()))
    .with_console(Arc::new(RealConsoleHandler::new()))
    .build()
    .await?;
}

All five required effects must be provided or the code won't compile.
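The typestate enforcement can be sketched with plain generics. Everything below is illustrative (the real AgentBuilder tracks five required effects, not two, and holds handler Arcs rather than strings); the point is that build() only exists once every required slot has been filled, so a missing effect is a compile error rather than a runtime panic:

```rust
use std::marker::PhantomData;

// Illustrative typestate markers: each required slot is Missing or Set.
pub struct Missing;
pub struct Set;

// A two-slot builder sketch; strings stand in for handler Arcs.
pub struct Builder<Crypto, Storage> {
    crypto: Option<String>,
    storage: Option<String>,
    _state: PhantomData<(Crypto, Storage)>,
}

impl Builder<Missing, Missing> {
    pub fn new() -> Self {
        Builder { crypto: None, storage: None, _state: PhantomData }
    }
}

impl<S> Builder<Missing, S> {
    pub fn with_crypto(self, c: &str) -> Builder<Set, S> {
        Builder { crypto: Some(c.into()), storage: self.storage, _state: PhantomData }
    }
}

impl<C> Builder<C, Missing> {
    pub fn with_storage(self, s: &str) -> Builder<C, Set> {
        Builder { crypto: self.crypto, storage: Some(s.into()), _state: PhantomData }
    }
}

// build() is only defined when both slots are Set.
impl Builder<Set, Set> {
    pub fn build(self) -> (String, String) {
        (self.crypto.unwrap(), self.storage.unwrap())
    }
}

fn main() {
    let (c, s) = Builder::new().with_crypto("real").with_storage("fs").build();
    assert_eq!((c.as_str(), s.as_str()), ("real", "fs"));
    // Builder::new().with_crypto("real").build(); // does not compile: storage Missing
}
```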

Required Effects

| Effect | Purpose | Trait |
|---|---|---|
| Crypto | Signing, verification, encryption | CryptoEffects |
| Storage | Persistent data storage | StorageEffects |
| Time | Wall-clock timestamps | PhysicalTimeEffects |
| Random | Cryptographically secure randomness | RandomEffects |
| Console | Logging and output | ConsoleEffects |

Optional Effects

| Effect | Default Behavior |
|---|---|
| TransportEffects | TCP transport |
| LogicalClockEffects | Derived from storage |
| OrderClockEffects | Derived from random |
| ReactiveEffects | Default reactive handler |
| JournalEffects | Derived from storage + crypto |
| BiometricEffects | Fallback no-op handler |

Platform Implementation Checklist

  • Identify platform-specific APIs for crypto, storage, time, random, console
  • Implement the five core effect traits
  • Create a preset builder (optional)
  • Add feature flags for platform-specific dependencies
  • Write integration tests using mock handlers
  • Document platform-specific security considerations
  • Consider transport requirements (WebSocket, BLE, etc.)

4. Testing Handlers

Test handlers using mock implementations from aura-testkit.

#![allow(unused)]
fn main() {
use aura_testkit::*;

#[aura_test]
async fn test_my_handler() -> aura_core::AuraResult<()> {
    let fixture = create_test_fixture().await?;

    // Use fixture.effects() to get mock effect system
    let result = my_operation(&fixture.effects()).await?;

    assert!(result.is_valid());
    Ok(())
}
}

Never use real system calls in tests such as SystemTime::now() or thread_rng(). Use deterministic seeds for reproducibility. Test both success and error paths.
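A minimal sketch of the deterministic-seed idea, using a hypothetical trait standing in for the real RandomEffects (the SplitMix64 generator here is illustrative, chosen only because it is tiny and fully determined by its seed):

```rust
// Hypothetical stand-in for the real RandomEffects trait.
pub trait RandomLike {
    fn next_u64(&mut self) -> u64;
}

// Deterministic mock: SplitMix64, fully determined by the seed.
pub struct MockRandom { state: u64 }

impl MockRandom {
    pub fn from_seed(seed: u64) -> Self { MockRandom { state: seed } }
}

impl RandomLike for MockRandom {
    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E3779B97F4A7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D049BB133111EB);
        z ^ (z >> 31)
    }
}

fn main() {
    // Same seed => identical stream => a failing test replays exactly.
    let mut a = MockRandom::from_seed(42);
    let mut b = MockRandom::from_seed(42);
    assert_eq!(a.next_u64(), b.next_u64());
    assert_eq!(a.next_u64(), b.next_u64());
}
```

A test that drives all randomness through such a handler reproduces the same byte stream on every run, which is what makes the identity + salt seeding in the fixtures useful.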

See Testing Guide for comprehensive testing patterns.

5. Effect System Architecture

For a deeper understanding of the effect system architecture, see Effect System.

Key Concepts

The effect system uses three layers:

  1. Foundation effects in aura-core cover crypto, storage, time, random, console, and transport.
  2. Infrastructure effects in aura-effects provide production handlers implementing foundation traits.
  3. Composite effects are built by composing foundation effects. For example, TreeEffects combines storage and crypto.

All impure operations (time, randomness, filesystem, network) must flow through effect traits. Direct calls break simulation determinism and WASM compatibility.
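The following sketch shows why routing impurity through a trait matters, using a hypothetical stand-in for PhysicalTimeEffects: the same domain code runs under wall-clock time in production and under simulator-controlled time in tests.

```rust
// Hypothetical stand-in for the real PhysicalTimeEffects trait.
pub trait TimeLike {
    fn now_ms(&self) -> u64;
}

// Production-style handler wrapping the system clock.
pub struct WallClock;
impl TimeLike for WallClock {
    fn now_ms(&self) -> u64 {
        use std::time::{SystemTime, UNIX_EPOCH};
        SystemTime::now().duration_since(UNIX_EPOCH).unwrap().as_millis() as u64
    }
}

// Simulation handler: time only moves when the simulator advances it.
pub struct SimClock { now: std::cell::Cell<u64> }
impl SimClock {
    pub fn new(start: u64) -> Self { SimClock { now: std::cell::Cell::new(start) } }
    pub fn advance(&self, ms: u64) { self.now.set(self.now.get() + ms); }
}
impl TimeLike for SimClock {
    fn now_ms(&self) -> u64 { self.now.get() }
}

// Domain code depends only on the trait, never on SystemTime directly,
// so it behaves identically under real or simulated time.
fn stamp(clock: &impl TimeLike) -> u64 { clock.now_ms() }

fn main() {
    let sim = SimClock::new(1_000);
    assert_eq!(stamp(&sim), 1_000);
    sim.advance(250);
    assert_eq!(stamp(&sim), 1_250);
}
```

A direct SystemTime::now() call inside stamp would make the second assertion nondeterministic, which is exactly the breakage the rule above prevents.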

Run just check-arch to validate effect trait placement and layer boundaries.

Choreography Development Guide

This guide covers how to build distributed protocols using Aura's choreographic programming system. Use it when you need to coordinate multiple parties with session types, CRDTs, and multi-phase workflows.

For theoretical foundations, see MPST and Choreography. For operation categorization, see Operation Categories.

1. When to Use Choreography

Use choreographic protocols when:

  • Multiple parties must coordinate (threshold signing, consensus, sync)
  • Session guarantees matter (no deadlock, no message mismatch)
  • You need formal verification of protocol correctness

Do not use choreography for:

  • Single-party operations (use effect handlers)
  • Simple request-response (use direct transport)

2. Protocol Development Pipeline

This pipeline applies to all Layer 4/5 choreographies and all Category C ceremonies.

Phase 1: Classification and Facts

Operation categories (A, B, C) are defined in Operation Categories. The category determines coordination requirements and affects protocol design choices (local vs. CRDT vs. ceremony).

Define fact types with schema versioning:

#![allow(unused)]
fn main() {
use aura_macros::ceremony_facts;

#[ceremony_facts]
pub enum InvitationFact {
    CeremonyInitiated {
        ceremony_id: CeremonyId,
        agreement_mode: Option<AgreementMode>,
        trace_id: Option<String>,
        timestamp_ms: u64,
    },
    CeremonyCommitted {
        ceremony_id: CeremonyId,
        relationship_id: String,
        agreement_mode: Option<AgreementMode>,
        trace_id: Option<String>,
        timestamp_ms: u64,
    },
    CeremonyAborted {
        ceremony_id: CeremonyId,
        reason: String,
        trace_id: Option<String>,
        timestamp_ms: u64,
    },
}
}

The macro provides canonical ceremony_id() and ceremony_timestamp_ms() accessors.
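Conceptually, the generated accessors reduce to a match over the lifecycle variants. The expansion below is a simplified illustration (the real macro output is more involved, and the variant fields are trimmed to keep the sketch self-contained):

```rust
// Minimal stand-ins so the sketch compiles on its own.
pub type CeremonyId = u64;

pub enum InvitationFact {
    CeremonyInitiated { ceremony_id: CeremonyId, timestamp_ms: u64 },
    CeremonyCommitted { ceremony_id: CeremonyId, timestamp_ms: u64 },
    CeremonyAborted { ceremony_id: CeremonyId, timestamp_ms: u64 },
}

// What #[ceremony_facts] conceptually generates (illustrative).
impl InvitationFact {
    pub fn ceremony_id(&self) -> CeremonyId {
        match self {
            InvitationFact::CeremonyInitiated { ceremony_id, .. }
            | InvitationFact::CeremonyCommitted { ceremony_id, .. }
            | InvitationFact::CeremonyAborted { ceremony_id, .. } => *ceremony_id,
        }
    }

    pub fn ceremony_timestamp_ms(&self) -> u64 {
        match self {
            InvitationFact::CeremonyInitiated { timestamp_ms, .. }
            | InvitationFact::CeremonyCommitted { timestamp_ms, .. }
            | InvitationFact::CeremonyAborted { timestamp_ms, .. } => *timestamp_ms,
        }
    }
}

fn main() {
    let f = InvitationFact::CeremonyCommitted { ceremony_id: 7, timestamp_ms: 1234 };
    assert_eq!(f.ceremony_id(), 7);
    assert_eq!(f.ceremony_timestamp_ms(), 1234);
}
```

Reducers and status views can therefore treat any ceremony fact uniformly without matching on every variant themselves.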

Phase 2: Choreography Specification

Write the choreography in a .tell file:

#![allow(unused)]
fn main() {
use aura_macros::tell;

tell! {
    #[namespace = "secure_request"]
    protocol SecureRequest {
        roles: Client, Server;

        Client[guard_capability = "chat:message:send", flow_cost = 50]
        -> Server: SendRequest(RequestData);

        Server[guard_capability = "chat:message:send", flow_cost = 30, journal_facts = "response_sent"]
        -> Client: SendResponse(ResponseData);
    }
}
}

Annotation syntax: Role[guard_capability = "namespace:capability", flow_cost = N, journal_facts = "..."] -> Target: Message

guard_capability is the one sanctioned raw-string boundary for first-party capability names in choreography source. The macro parses these values into validated CapabilityNames and fails closed on invalid or legacy names. In Rust code outside .tell or tell! inputs, prefer typed capability families from the owning crate.

Canonical Telltale 10 Shape

Aura choreographies should converge on one source shape before theorem-pack rollout expands.

The canonical shape is:

  • one module ... exposing (...) header
  • one protocol ... = declaration
  • one compact roles ... declaration using scalar roles or role families like Devices[N]
  • message edges whose semantics live in the protocol structure itself: sequencing, fan-out, and choice at ...
  • sanctioned Aura edge metadata only where it still represents real policy or accounting at the source boundary: guard_capability, flow_cost, journal_facts, leak, parallel
  • no choreography-local ownership or migration scaffolding that should instead live in Telltale runtime admission, runtime transition artifacts, or Aura's Rust-side protocol/runtime glue

Use crates/aura-sync/src/protocols/ota_activation.tell and crates/aura-sync/src/protocols/device_epoch_rotation.tell as the current style references.

Legacy-Only Source Surfaces To Remove

The following source patterns are migration debt and should disappear from first-party .tell files:

  • link = "bundle=...|exports=...|imports=..." when it is only preserving historical migration lore rather than declaring a live runtime fragment bundle consumed by reconfiguration or ownership code
  • choreography comments whose only purpose is to preserve pre-Telltale runtime bundle/link ownership lore
  • journal_merge = true as a choreography-local escape hatch
  • leakage_budget = "..." tuple syntax
  • repeated phase comments that restate obvious linear protocol structure rather than documenting a real protocol invariant

These semantics should not remain embedded in source once the runtime and admission layers own them explicitly.

Aura Annotations That Remain Valid On The Clean Surface

These annotations remain valid on the canonical surface because they still map to real Aura admission, accounting, or observation behavior:

  • guard_capability
  • flow_cost
  • journal_facts
  • leak
  • parallel

Later theorem-pack phases may add Telltale-native theorem-pack declarations and requires clauses, but those are additive and should not bring back legacy bundle or migration hints.

Semantics That Must Move Out Of Source Over Time

The following concerns should move out of comments or source-local hints and into proper Telltale 10 DSL/runtime constructs or Aura runtime glue:

  • reconfiguration and runtime-upgrade bundle ownership
  • fragment/link transfer semantics
  • journal-merge behavior that is really runtime transition behavior
  • recovery or membership handoff lore expressed only in comments
  • admission/evidence requirements that should become theorem packs and runtime admission checks

2.1 Current Choreography Convergence Inventory

This inventory defines the target for each current .tell file while Aura converges on the canonical Telltale 10 shape.

First-Party Protocols

Protocol-Compat Fixtures

2.2 Theorem-Pack Taxonomy and Admission Policy

Aura should add theorem packs only where they drive a real runtime admission decision. They are not importance labels.

The first production protocols now using this boundary are aura.sync.ota_activation and aura.sync.device_epoch_rotation. Both declare the "AuraTransitionSafety" proof bundle, and Aura rejects launch before protocol start when the runtime lacks the theorem-pack capability surface needed for transition receipts, bridge admission, and reconfiguration safety.

Current Protocol Split

First-wave theorem-pack candidates:

  • aura.sync.ota_activation
  • aura.sync.device_epoch_rotation

Second-wave theorem-pack candidates:

  • aura.recovery.grant
  • aura.recovery.guardian_setup
  • aura.recovery.guardian_ceremony
  • aura.recovery.guardian_membership_change
  • aura.invitation.device_enrollment once the recovery/finalization path is explicitly theorem-pack-gated end to end

Later explicit evaluation target:

  • aura.consensus

Current theorem-pack-free protocols:

  • aura.session.coordination
  • aura.amp.transport
  • aura.dkg.ceremony
  • aura.authentication.guardian_auth_relational
  • aura.invitation.exchange
  • aura.invitation.guardian
  • aura.rendezvous.exchange
  • aura.rendezvous.relay
  • aura.sync.epoch_rotation

Current Aura Runtime Admission Surfaces

Aura already has a small runtime capability bridge and several concrete runtime consumers:

  • aura-core::effects::RuntimeCapabilityEffects defines the stable admission interface.
  • aura-effects::RuntimeCapabilityHandler maps startup runtime contracts into a capability inventory and public protocol-critical surfaces.
  • aura-protocol::admission maps protocol ids to current required runtime capability keys.
  • aura-agent::runtime::choreo_engine and aura-agent::runtime::choreography_adapter enforce required runtime capability admission before protocol launch.
  • aura-agent::runtime::contracts and aura-agent::runtime::vm_hardening already validate runtime capability requirements for determinism and profile policy.
  • aura-agent::runtime::services::threshold_signing and aura-agent::runtime::services::reconfiguration_manager already consume protocol-critical runtime capability state on concrete execution paths.

The currently published public protocol-critical surfaces are:

  • runtime_admission
  • theorem_pack_capabilities
  • ownership_capability
  • readiness_witness
  • authoritative_read
  • materialization_proof
  • canonical_handle
  • ownership_receipt
  • semantic_handoff
  • reconfiguration_transition

These are the only current Aura-owned runtime consumers theorem-pack mapping is allowed to target.

Initial Aura Theorem-Pack Taxonomy

Aura should keep the first taxonomy small:

  • "AuraTransitionSafety" used for OTA/runtime-upgrade/device-epoch flows that depend on transition safety, reconfiguration continuity, and bridge/admission correctness
  • "AuraAuthorityEvidence" used for recovery/finalization flows that depend on authoritative read, canonical materialization, and receipt/evidence-backed transitions
  • "AuraConsensusDeployment" reserved for a later explicit consensus decision; do not use until Aura has a concrete runtime consumer beyond the existing envelope capability checks

Pack-to-Capability Mapping

"AuraTransitionSafety":

  • reuse Telltale inventory keys: protocol_machine_envelope_adherence, protocol_machine_envelope_admission, protocol_envelope_bridge, reconfiguration_safety
  • current Aura runtime consumers: theorem_pack_capabilities, ownership_receipt, semantic_handoff, reconfiguration_transition
  • concrete admission checks: aura-agent::runtime::choreo_engine, aura-agent::runtime::choreography_adapter, aura-agent::runtime::services::reconfiguration_manager

"AuraAuthorityEvidence":

  • reuse Telltale inventory keys: protocol_machine_envelope_adherence, protocol_machine_envelope_admission, protocol_envelope_bridge
  • Aura-owned runtime capability names already backed by real consumers: authoritative_read, materialization_proof, canonical_handle
  • concrete admission checks: aura-agent::runtime::choreo_engine, aura-agent::runtime::contracts, recovery/guardian launch paths in aura-agent
  • current adopted production protocols: aura.recovery.grant, aura.recovery.guardian_setup, aura.recovery.guardian_ceremony
  • current explicit non-goals in the recovery/invitation area: aura.recovery.guardian_membership_change, aura.invitation.guardian, aura.invitation.device_enrollment

"AuraConsensusDeployment":

  • candidate Telltale inventory keys: consensus_envelope, atomic_broadcast_ordering, partial_synchrony_liveness
  • current status: deferred until Aura adds a runtime admission consumer beyond byzantine_envelope / determinism profile checks
  • revisit only when: the consensus choreography itself owns authoritative admission or evidence branching for those guarantees, and Aura can fail closed on those exact choreography-level requirements instead of the current runtime-local consensus-profile checks

Aura-specific capability names are only allowed when they already have a real runtime consumer. Today that means authoritative_read, materialization_proof, and canonical_handle are allowed. New Aura-specific theorem-pack keys should not be added until the corresponding runtime admission check exists.

Explicit Non-Goals

The following protocol groups remain theorem-pack-free until they acquire a concrete control-plane admission or evidence need:

  • rendezvous exchange and relayed rendezvous
  • plain invitation exchange
  • protocol-compat fixture choreographies

Symmetry is not a valid reason to add theorem packs to these flows.

Adaptive-Privacy Control Plane Only

Adaptive-privacy theorem-pack usage is reserved for control-plane protocols with real admission, evidence, transition, or simulation semantics. It must not be used for onion data-plane forwarding, per-hop Move envelopes, cover traffic, or ordinary transport forwarding.

The current adaptive-privacy Telltale control-plane inventory is:

  • "AnonymousPathEstablishProtocol"
  • "MoveReceiptReplyBlockProtocol"
  • "HoldDepositReplyBlockProtocol"
  • "HoldRetrievalReplyBlockProtocol"
  • "HoldAuditReplyBlockProtocol"

Bootstrap and stale-node re-entry remain runtime-local for now. They are still hint lookup and cache-refresh logic, not canonical multi-party admission/evidence protocols.

These adaptive-privacy control-plane protocols are intentionally theorem-pack-free today. Aura does not add a new theorem pack for them until a dedicated runtime-admission surface exists beyond ordinary protocol-machine admission and the current local control services are no longer the canonical executors.

Admission Boundary

Aura consumes theorem-pack metadata only through the generated CompositionManifest boundary:

  • generated manifests carry theorem-pack declarations
  • generated manifests carry required theorem-pack names
  • generated manifests carry the flattened required theorem-pack capability set

aura-protocol::admission is the single Aura-owned translation layer that maps required theorem packs onto concrete runtime-admission requirements. That layer must stay small and fail closed:

  • a required theorem pack with no matching generated declaration is rejected
  • a required theorem pack with no Aura admission policy is rejected
  • a theorem-pack declaration whose declared capability set drifts from Aura's supported taxonomy is rejected
  • runtime launch rejects missing required theorem-pack capability coverage before protocol execution begins

aura-agent::runtime::vm_host_bridge, aura-agent::runtime::choreo_engine, and aura-agent::runtime::choreography_adapter consume that resolved boundary; they do not invent parallel theorem-pack policy tables.
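The fail-closed shape of that translation layer can be sketched as a pure set check. The function and types below are illustrative, not the real aura-protocol::admission API; the sample data reuses capability names from the taxonomy above purely as example strings:

```rust
use std::collections::{HashMap, HashSet};

// Illustrative fail-closed admission: every required pack must have a
// generated declaration, a matching Aura policy, and full runtime coverage.
fn admit(
    required_packs: &[&str],
    declared: &HashMap<&str, HashSet<&str>>, // from the generated manifest
    policy: &HashMap<&str, HashSet<&str>>,   // Aura's supported taxonomy
    runtime_caps: &HashSet<&str>,            // runtime capability inventory
) -> Result<(), String> {
    for pack in required_packs {
        // Required pack with no matching generated declaration: reject.
        let caps = declared.get(pack)
            .ok_or_else(|| format!("no declaration for {pack}"))?;
        // No Aura admission policy, or drift from the taxonomy: reject.
        let expected = policy.get(pack)
            .ok_or_else(|| format!("no admission policy for {pack}"))?;
        if caps != expected {
            return Err(format!("capability drift for {pack}"));
        }
        // Missing runtime capability coverage: reject before launch.
        if !caps.is_subset(runtime_caps) {
            return Err(format!("missing runtime capabilities for {pack}"));
        }
    }
    Ok(())
}

fn main() {
    let mut declared = HashMap::new();
    declared.insert("AuraTransitionSafety",
        HashSet::from(["reconfiguration_safety", "protocol_envelope_bridge"]));
    let policy = declared.clone();

    let full = HashSet::from(["reconfiguration_safety", "protocol_envelope_bridge"]);
    assert!(admit(&["AuraTransitionSafety"], &declared, &policy, &full).is_ok());

    // A runtime missing one capability fails closed before protocol start.
    let partial = HashSet::from(["reconfiguration_safety"]);
    assert!(admit(&["AuraTransitionSafety"], &declared, &policy, &partial).is_err());
}
```

Each rejection branch corresponds to one of the four bullet rules above; there is deliberately no permissive fallback path.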

Acceptance Rule

A choreography may add theorem-pack requirements only when all of the following are true:

  • the protocol depends on protocol-critical authority, evidence, or transition semantics
  • the required pack maps to existing Telltale inventory keys or to an Aura-specific runtime capability with a real admission consumer
  • Aura runtime admission fails closed before launch when that support is absent
  • a negative test proves the missing-pack admission failure
  • a positive test proves the protocol still runs when the support is present

If any of those conditions is missing, keep the choreography theorem-pack-free.

Migration Rule

When converting a choreography, prefer:

  • deleting legacy source hints rather than renaming them
  • moving ownership or transition semantics into Rust/runtime admission when they are not true choreography structure
  • keeping theorem packs out until the protocol has a real runtime admission consumer
  • matching the source compactness of the OTA and device-epoch references

Select the narrowest TimeStamp domain for each time field. See Effect System for time domains.

Phase 3: Runtime Wiring

Create the protocol implementation:

#![allow(unused)]
fn main() {
use aura_agent::runtime::open_manifest_vm_session_admitted;

let (mut engine, handler, vm_sid) = open_manifest_vm_session_admitted(
    &my_protocol::COMPOSITION_MANIFEST,
    "Initiator",
    &my_protocol::global_type(),
    &my_protocol::local_types(),
    scheduler_signals,
).await?;

let status = engine.run_to_completion(vm_sid)?;
}

This wiring opens an admitted VM session from generated choreography metadata. The runtime source of truth is the composition manifest, not an ad hoc adapter. Register the service with the runtime and integrate it with the guard chain. Category C operations must follow the ceremony contract.

Production services should treat the admitted unit as a protocol fragment. If the manifest declares link bundles, each linked bundle becomes its own ownership unit. Runtime transfer must use ReconfigurationManager. Do not bypass fragment ownership through service-local state.

The runtime also derives execution mode from admitted policy. Cooperative protocols stay on the canonical VM path. Replay-deterministic and envelope-bounded protocols select the threaded path only through the admission and hardening surface. Service code should not construct ad hoc threaded runtimes.

Phase 4: Status and Testing

Implement CeremonyStatus for Category C or protocol-specific status views:

#![allow(unused)]
fn main() {
pub fn ceremony_status(facts: &[InvitationFact]) -> CeremonyStatus {
    // Reduce facts to current status
}
}
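One way to implement that reduction, assuming a hypothetical three-state status (the real CeremonyStatus is richer, and the fact variants below are trimmed stand-ins): since facts are append-only, a left fold over the journal suffices, with terminal facts overriding the in-progress state.

```rust
pub type CeremonyId = u64;

// Self-contained stand-ins for the sketch.
pub enum InvitationFact {
    CeremonyInitiated { ceremony_id: CeremonyId, timestamp_ms: u64 },
    CeremonyCommitted { ceremony_id: CeremonyId, timestamp_ms: u64 },
    CeremonyAborted { ceremony_id: CeremonyId, timestamp_ms: u64 },
}

// Hypothetical status view; illustrative only.
#[derive(Debug, PartialEq)]
pub enum CeremonyStatus { NotStarted, InProgress, Committed, Aborted }

pub fn ceremony_status(facts: &[InvitationFact]) -> CeremonyStatus {
    facts.iter().fold(CeremonyStatus::NotStarted, |st, fact| match fact {
        // Initiation only moves NotStarted forward; it never undoes a
        // terminal state that a later (or reordered) fact established.
        InvitationFact::CeremonyInitiated { .. } => match st {
            CeremonyStatus::NotStarted => CeremonyStatus::InProgress,
            other => other,
        },
        InvitationFact::CeremonyCommitted { .. } => CeremonyStatus::Committed,
        InvitationFact::CeremonyAborted { .. } => CeremonyStatus::Aborted,
    })
}

fn main() {
    let facts = vec![
        InvitationFact::CeremonyInitiated { ceremony_id: 1, timestamp_ms: 10 },
        InvitationFact::CeremonyCommitted { ceremony_id: 1, timestamp_ms: 20 },
    ];
    assert_eq!(ceremony_status(&facts), CeremonyStatus::Committed);
    assert_eq!(ceremony_status(&[]), CeremonyStatus::NotStarted);
}
```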

Definition of Done:

  • Operation category declared (A/B/C)
  • Facts defined with reducer and schema version
  • Choreography specified with roles/messages documented
  • Runtime wiring added (role runners + registration)
  • Fragment ownership uses manifest admission and runtime ownership APIs
  • Delegate and link flows use ReconfigurationManager
  • Threaded or envelope-bounded execution uses admitted policy only
  • Category C uses ceremony runner and emits standard facts
  • Status output implemented
  • Shared-bus integration test added
  • Simulation test added
  • Choreography parity/replay tests added (Category C)

See crates/aura-consensus/src/protocol/ for canonical examples.

3. CRDT Integration

CRDTs handle state consistency in choreographic protocols. See Journal for CRDT theory.

CRDT Coordinator

Use CrdtCoordinator to manage CRDT state in protocols:

#![allow(unused)]
fn main() {
use aura_protocol::effects::crdt::CrdtCoordinator;

// State-based CRDT
let coordinator = CrdtCoordinator::with_cv_state(authority_id, initial_journal);

// Delta CRDT with compaction threshold
let coordinator = CrdtCoordinator::with_delta_threshold(authority_id, 100);

// Meet-semilattice for constraints
let coordinator = CrdtCoordinator::with_mv_state(authority_id, capability_set);
}

Protocol Integration

Protocols consume and return coordinators with updated state:

#![allow(unused)]
fn main() {
use aura_sync::choreography::anti_entropy::execute_anti_entropy;

let (result, updated_coordinator) = execute_anti_entropy(
    authority_id,
    config,
    is_requester,
    &effect_system,
    coordinator,
).await?;

let synchronized_state = updated_coordinator.cv_handler().get_state();
}

4. Protocol Composition

Complex applications require composing multiple protocols.

Sequential Composition

Chain protocols for multi-phase workflows:

#![allow(unused)]
fn main() {
pub async fn execute_authentication_flow(
    &self,
    target_device: aura_core::DeviceId,
) -> Result<AuthenticationResult, ProtocolError> {
    // Phase 1: Identity exchange
    let identity_result = self.execute_identity_exchange(target_device).await?;

    // Phase 2: Capability negotiation
    let capability_result = self.execute_capability_negotiation(
        target_device,
        &identity_result
    ).await?;

    // Phase 3: Session establishment
    let session_result = self.execute_session_establishment(
        target_device,
        &capability_result
    ).await?;

    Ok(AuthenticationResult {
        identity: identity_result,
        capabilities: capability_result,
        session: session_result,
    })
}
}

Each phase uses results from previous phases. Failed phases abort the entire workflow.

Parallel Composition

Independent protocols can execute concurrently using try_join_all.

#![allow(unused)]
fn main() {
pub async fn execute_distributed_computation(
    &self,
    worker_devices: Vec<aura_core::DeviceId>,
) -> Result<ComputationResult, ProtocolError> {
    // Launch parallel worker protocols
    let worker_futures = worker_devices.iter().map(|device| {
        self.execute_worker_protocol(*device)
    });

    // Wait for all workers with timeout
    let worker_results = tokio::time::timeout(
        self.config.worker_timeout,
        futures::future::try_join_all(worker_futures)
    ).await??;

    // Aggregate results
    self.aggregate_worker_results(worker_results).await
}
}

Worker futures launch in parallel and are joined with a timeout. Results are then aggregated into a single computation result.

Effect Program Composition

Protocols can also be composed through effect programs using a builder pattern.

#![allow(unused)]
fn main() {
let composed_protocol = Program::new()
    .ext(ValidateCapability {
        capability: "coordinate".into(),
        role: Coordinator
    })
    .then(anti_entropy_program)
    .then(threshold_ceremony_program)
    .ext(LogEvent {
        event: "protocols_complete".into()
    })
    .end();
}

The builder chains validation, protocol execution, and logging into a single composed program.

5. Error Handling and Resilience

Timeout and Retry

Implement timeout handling with exponential backoff.

#![allow(unused)]
fn main() {
pub async fn execute_with_resilience<T>(
    &self,
    protocol_fn: impl Fn() -> BoxFuture<'_, Result<T, ProtocolError>>,
    operation_name: &str,
) -> Result<T, ProtocolError> {
    let mut attempt = 0;

    while attempt < self.config.max_attempts {
        match tokio::time::timeout(
            self.config.operation_timeout,
            protocol_fn()
        ).await {
            Ok(Ok(result)) => return Ok(result),
            Ok(Err(e)) if !e.is_retryable() => return Err(e),
            _ => {
                // Exponential backoff with jitter
                let delay = self.config.base_delay * 2_u32.pow(attempt);
                tokio::time::sleep(self.add_jitter(delay)).await;
                attempt += 1;
            }
        }
    }

    Err(ProtocolError::MaxRetriesExceeded)
}
}

The function retries on transient errors with exponential backoff and jitter. Non-retryable errors fail immediately.

Compensation and Rollback

For multi-phase protocols, implement compensation for partial failures.

#![allow(unused)]
fn main() {
pub async fn execute_compensating_transaction(
    &self,
    operations: Vec<Operation>,
) -> Result<TransactionResult, TransactionError> {
    let mut completed = Vec::new();

    for operation in &operations {
        match self.execute_operation(operation).await {
            Ok(result) => {
                completed.push((operation.clone(), result));
            }
            Err(e) => {
                // Compensate in reverse order
                self.execute_compensation(&completed).await?;
                return Err(TransactionError::OperationFailed {
                    operation: operation.clone(),
                    cause: e,
                });
            }
        }
    }

    Ok(TransactionResult { completed })
}
}

On failure, completed operations are compensated in reverse order. This ensures partial state is cleaned up before the error is returned.

Circuit Breakers

Circuit breakers prevent cascading failures by tracking error rates.

#![allow(unused)]
fn main() {
pub enum CircuitState {
    Closed { failure_count: usize },
    Open { opened_at: Instant },
    HalfOpen { test_requests: usize },
}

pub async fn execute_with_circuit_breaker<T>(
    &self,
    protocol_fn: impl Fn() -> BoxFuture<'_, Result<T, ProtocolError>>,
) -> Result<T, ProtocolError> {
    let should_execute = match &*self.circuit_state.lock() {
        CircuitState::Closed { failure_count } =>
            *failure_count < self.config.failure_threshold,
        CircuitState::Open { opened_at } =>
            opened_at.elapsed() >= self.config.recovery_timeout,
        CircuitState::HalfOpen { test_requests } =>
            *test_requests < self.config.test_threshold,
    };

    if !should_execute {
        return Err(ProtocolError::CircuitBreakerOpen);
    }

    match protocol_fn().await {
        Ok(result) => {
            self.record_success();
            Ok(result)
        }
        Err(e) => {
            self.record_failure();
            Err(e)
        }
    }
}
}

The breaker transitions through closed, open, and half-open states based on failure thresholds and recovery timeouts.

6. Guard Chain Integration

The guard chain specification is defined in Authorization. See System Internals Guide for the three-phase implementation pattern.

When integrating guards into choreographies, use the annotation syntax on choreography messages. The annotations compile to guard chain commands that execute before transport sends:

  • guard_capability: Creates capability check before send
  • flow_cost: Charges flow budget
  • journal_facts: Records facts after successful send
  • leak: Records leakage budget charge

Snapshot builders must not treat declared choreography guard names as already granted. They evaluate typed candidate sets against the current Biscuit/policy frontier and publish only the admitted frontier into the GuardSnapshot.
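A minimal sketch of that rule with illustrative types (not the real GuardSnapshot API): the snapshot carries only the intersection of the declared candidate capabilities with the frontier the policy currently grants.

```rust
use std::collections::HashSet;

// Illustrative: capability names declared by choreography annotations
// are candidates, not grants; only policy-granted names are published.
fn admitted_frontier<'a>(
    declared_candidates: &HashSet<&'a str>,
    policy_frontier: &HashSet<&'a str>,
) -> HashSet<&'a str> {
    declared_candidates.intersection(policy_frontier).copied().collect()
}

fn main() {
    let declared = HashSet::from(["chat:message:send", "chat:channel:create"]);
    let granted = HashSet::from(["chat:message:send"]);
    let snapshot = admitted_frontier(&declared, &granted);
    // Declared but ungranted capabilities never reach the snapshot.
    assert!(snapshot.contains("chat:message:send"));
    assert!(!snapshot.contains("chat:channel:create"));
}
```

Guard evaluation then checks sends against the snapshot, so a choreography cannot smuggle a capability into effect merely by declaring it.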

7. Domain Service Pattern

Domain crates define stateless handlers. The agent layer wraps them with services.

Domain Handler

// In domain crate (e.g., aura-chat/src/service.rs)
pub struct ChatHandler;

impl ChatHandler {
    pub async fn send_message<E>(
        &self,
        effects: &E,
        channel_id: ChannelId,
        content: String,
    ) -> Result<MessageId>
    where
        E: StorageEffects + RandomEffects + PhysicalTimeEffects,
    {
        let message_id = effects.random_uuid().await;
        // ... domain logic
        Ok(message_id)
    }
}

Agent Service Wrapper

// In aura-agent/src/handlers/chat_service.rs
pub struct ChatService {
    handler: ChatHandler,
    effects: Arc<RwLock<AuraEffectSystem>>,
}

impl ChatService {
    pub async fn send_message(
        &self,
        channel_id: ChannelId,
        content: String,
    ) -> AgentResult<MessageId> {
        let effects = self.effects.read().await;
        self.handler.send_message(&*effects, channel_id, content)
            .await
            .map_err(Into::into)
    }
}

This keeps the domain crate pure, free of Tokio-specific locking and runtime coupling, testable with mock effects, and consistent across crates.
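
Because the handler depends only on effect traits, a unit test can inject a deterministic mock. The trait, mock, and simplified handler below are illustrative stand-ins for Aura's effect system, not its real definitions:

```rust
// Hypothetical minimal effect trait for the sketch.
trait RandomEffects {
    fn random_uuid(&self) -> String;
}

// Test double that returns a fixed value instead of real randomness.
struct MockEffects {
    fixed_uuid: String,
}

impl RandomEffects for MockEffects {
    fn random_uuid(&self) -> String {
        self.fixed_uuid.clone()
    }
}

struct ChatHandler;

impl ChatHandler {
    // Mirrors the shape of send_message above, minus async and storage.
    fn send_message<E: RandomEffects>(&self, effects: &E, _content: String) -> String {
        effects.random_uuid()
    }
}

fn main() {
    let effects = MockEffects { fixed_uuid: "msg-0001".into() };
    let id = ChatHandler.send_message(&effects, "hello".into());
    // Deterministic because the mock controls the "random" source.
    assert_eq!(id, "msg-0001");
    println!("message id: {id}");
}
```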

8. Testing Choreographies

Unit Testing Guard Logic

#[test]
fn test_cap_guard_denies_unauthorized() {
    let snapshot = GuardSnapshot {
        capabilities: vec![],
        flow_budget: FlowBudget { limit: 100, spent: 0, epoch: 0 },
        ..Default::default()
    };
    let result = CapGuard::evaluate(&snapshot, &SendRequest::default());
    assert!(result.is_err());
}

Integration Testing Protocols

#[aura_test]
async fn test_sync_protocol() -> aura_core::AuraResult<()> {
    let fixture = create_test_fixture().await?;

    let coordinator = CrdtCoordinator::with_cv_state(
        fixture.authority_id(),
        fixture.initial_journal(),
    );

    let (result, _) = execute_anti_entropy(
        fixture.authority_id(),
        SyncConfig::default(),
        true, // is_requester
        &fixture.effects(),
        coordinator,
    ).await?;

    assert!(result.is_success());
    Ok(())
}

Simulation Testing

See Simulation Guide for fault injection and adversarial testing.

Testing Guide

This guide covers how to write tests for Aura protocols using the testing infrastructure. It includes unit testing, integration testing, property-based testing, conformance testing, and runtime harness validation.

For infrastructure details, see Test Infrastructure Reference. For the deterministic shared-flow design rules, see User Flow Harness.

1. Core Philosophy

Aura tests follow four principles:

  • Tests use effect traits and never call impure functions directly.
  • Tests run actual protocol logic through real handlers.
  • Tests produce reproducible results through deterministic configuration.
  • Tests validate both happy paths and error conditions.

Parity-critical ownership work also requires compile-fail coverage for ownership and capability boundaries enforced in types. This includes forbidden capability construction paths such as CapabilityId::from("...") or invalid capability_name!(...) literals. It also includes invariant tests for owner drop, stale-handle rejection, and terminality. Timeout and backoff tests should prove typed timeout failure, remaining-budget propagation, and bounded retries.
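
A minimal sketch of such a type-enforced boundary, with hypothetical names: the only constructor is a validating parse function, so an unsanctioned CapabilityId::from("...") path simply does not exist to compile, which is exactly what a trybuild compile-fail test would assert.

```rust
// Illustrative sketch only; the real CapabilityId and capability_name!
// machinery differ.
mod capability {
    pub struct CapabilityId(String);

    impl CapabilityId {
        // The sole sanctioned constructor validates the literal. There is
        // no public From<&str> impl, so CapabilityId::from("...") is a
        // compile error outside this module.
        pub fn parse(name: &str) -> Result<Self, &'static str> {
            let well_formed = !name.is_empty()
                && name.chars().all(|c| c.is_ascii_lowercase() || c == '.');
            if well_formed {
                Ok(CapabilityId(name.to_string()))
            } else {
                Err("invalid capability name")
            }
        }

        pub fn as_str(&self) -> &str {
            &self.0
        }
    }
}

use capability::CapabilityId;

fn main() {
    // Sanctioned path succeeds; malformed literals fail at runtime, and
    // unsanctioned construction fails at compile time.
    assert!(CapabilityId::parse("chat.send").is_ok());
    assert!(CapabilityId::parse("Chat Send!").is_err());
    println!("boundary ok");
}
```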

Run the relevant ownership just ci-* policy checks alongside crate tests. Run just lint-arch-syntax when changing capability parsing boundaries, typed capability-family usage, or choreography capability admission rules.

Harness Policy

The runtime harness is the primary end-to-end validation lane. Default harness runs exercise the real Aura runtime with real TUI and web frontends. The goal is to catch integration failures in the actual product, not just prove a model.

The harness has two distinct responsibilities. The shared semantic lane executes parity-critical shared flows through the shared semantic command plane. It waits on typed handles, readiness facts, runtime events, quiescence, and authoritative projections. This is the primary lane for debugging production code paths.

The frontend-conformance lane validates renderer-specific control wiring, DOM structure, PTY key mappings, and shell-level integration. It may use renderer-specific mechanics intentionally. It must not be the primary execution substrate for shared scenarios.

Quint and other verification tools generate models, traces, and invariants. They are not a replacement for real frontends.

aura-app owns the shared semantic scenario, command-plane, and UI contracts. aura-harness consumes those contracts and submits shared semantic commands to real frontends. aura-simulator is a separate alternate runtime substrate.

User-facing docs and harness guidance must not point readers at scratch-note or ephemeral local-output paths. Describe outputs in terms of the stable harness artifact bundle, scenario reports, and configured run outputs rather than repo-local scratch directories.

Use this lane matrix when selecting harness mode.

  • Local deterministic (backend: mock): just harness-run -- --config configs/harness/local-loopback.toml --scenario scenarios/harness/real-runtime-mixed-startup-smoke.toml
  • Patchbay relay realism (backend: patchbay): just harness-run -- --config configs/harness/local-loopback.toml --scenario scenarios/harness/real-runtime-mixed-startup-smoke.toml --network-backend patchbay
  • Patchbay-vm relay realism (backend: patchbay-vm): just harness-run -- --config configs/harness/local-loopback.toml --scenario scenarios/harness/real-runtime-mixed-startup-smoke.toml --network-backend patchbay-vm
  • Browser (backend: Playwright): just harness-run-browser scenarios/harness/semantic-observation-browser-smoke.toml

All shared flows should use typed scenario primitives, typed semantic command submission, and structured snapshot and readiness waits.

Native TUI harness IPC is part of that shared compatibility surface now. In explicit harness mode the command socket and semantic snapshot mirrors are scoped under AURA_HARNESS_INSTANCE_TRANSIENT_ROOT, and command submission must authenticate with the per-run AURA_HARNESS_RUN_TOKEN. Setting AURA_TUI_COMMAND_SOCKET, AURA_TUI_UI_STATE_SOCKET, or AURA_TUI_UI_STATE_FILE outside explicit harness mode must stay inert or fail closed; those env vars are not a production backdoor.

Shared-semantic preflight is intentionally stricter than generic backend startup. A run config that includes SSH instances does not automatically qualify for the shared semantic lane. Until a backend implements the shared semantic contract, SSH remains diagnostic-only for harness purposes. Shared-semantic scenarios must fail closed before execution.

For SSH-backed diagnostic runs, remote artifact capture now has two explicit modes. When ssh_dry_run = true, the harness records a simulated sync summary only. When ssh_dry_run = false, the harness copies logs/ from the instance's remote_workdir back into the local artifact bundle under remote/<instance-id>/logs/ using scp, then records the copied file manifest and checksums in the per-instance sync summary plus remote_artifact_sync.json. Use require_remote_artifact_sync = true when the run should fail closed if that SSH artifact copy does not complete.

aura-app::ui_contract is the canonical module for shared flow support. It defines SharedFlowId, SHARED_FLOW_SUPPORT, SHARED_FLOW_SCENARIO_COVERAGE, UiSnapshot, compare_ui_snapshots_for_parity, OperationInstanceId, and RuntimeEventSnapshot. The root file is a facade; parity metadata, harness/browser bridge metadata, and shared-flow support tables may live in dedicated ui_contract/* modules, but the canonical public contract stays aura-app::ui_contract. Use semantic readiness and state assertions before using fallback text matching.

Direct usage of SystemTime::now(), thread_rng(), File::open(), or Uuid::new_v4() is forbidden. These operations must flow through effect traits.
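
The rule can be illustrated with a hypothetical effect trait: domain logic takes the trait as a parameter, production code wraps the impure source behind it, and tests inject a fixed value. The trait and types below are assumptions for the sketch, not Aura's actual PhysicalTimeEffects definition:

```rust
// Hypothetical stand-in for a physical-time effect trait.
trait PhysicalTimeEffects {
    fn now_millis(&self) -> u64;
}

// A test clock; the production impl would wrap the real time source.
struct FixedClock(u64);

impl PhysicalTimeEffects for FixedClock {
    fn now_millis(&self) -> u64 {
        self.0
    }
}

// Domain logic never calls SystemTime::now() directly; it asks the trait.
fn is_expired<T: PhysicalTimeEffects>(time: &T, deadline_millis: u64) -> bool {
    time.now_millis() > deadline_millis
}

fn main() {
    let clock = FixedClock(1_000);
    // Deterministic: the same inputs always give the same answer.
    assert!(!is_expired(&clock, 1_000));
    assert!(is_expired(&clock, 999));
    println!("deterministic time ok");
}
```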

Shared UX Contract and Determinism

The shared UX contract is defined in User Interface. The aura-app::ui_contract module is the canonical authority for parity-critical UI identity, readiness semantics, and typed observation payloads. The shared semantic scenario contract remains aura-app::scenario_contract. Its root may delegate contract families such as submission, actions, expectations, and values into scenario_contract/* modules without changing the public harness contract.

Shared scenarios must submit typed semantic commands through the frontend bridge. They must not use raw PTY keys, raw selector clicks, raw label matching, or incidental focus stepping as primary mechanics. Frontend-specific UI I/O belongs in frontend-conformance coverage rather than the main shared semantic lane. Unsupported semantic commands must fail closed and diagnostically.

Command submission must enter the frontend through its real update and event path. It must not use render-coupled polling or ad hoc harness shims.

Contacts friend management is part of that shared UX contract. The canonical relationship states are contact, pending_outbound, pending_inbound, and friend, and they must be projected from runtime-owned relational facts rather than shell-local heuristics. The shared contract owns the parity-critical contacts controls for send friend request, accept friend request, decline friend request, and remove friend, and both TUI and web tests should assert those actions through the same semantic surface. Harness and frontend-conformance coverage may verify renderer wiring around those controls, but the lifecycle itself stays anchored to Scenario 13 and the shared semantic command/observation path.

Shared Semantic Ownership Model

Parity-critical shared semantic flows must use one explicit ownership category. Do not mix categories casually inside the same flow. The four ownership categories (Pure, MoveOwned, ActorOwned, Observed) are defined in Ownership Model.

aura-app owns authoritative semantic operation coordination and typed lifecycle and error publication. aura-agent owns long-lived runtime and service actors and other actor-owned async state. aura-terminal and aura-web submit commands and observe lifecycle but do not own terminal semantic truth. aura-harness consumes typed handles, readiness, and projections but does not mutate semantic lifecycle directly.

Terminal convenience modals stay in that observed-only category. Opening or editing a local TUI modal such as the contact invitation sheet may prefill or reshape local display state, but it must not become an alternate semantic ingress path or carry authoritative receiver ownership. Tests should prove the real semantic boundary remains the typed dispatch command and upstream aura-app workflow submission path rather than modal-local state.

Clipboard copy affordances in TUI modals follow the same rule. Copy buttons and local clipboard helpers are convenience-only observed behavior, not shared semantic evidence. Headless CI and rustdoc builds must not rely on the host system clipboard being available. When a test or harness run needs to assert copied content, use AURA_CLIPBOARD_MODE=file_only together with AURA_CLIPBOARD_FILE and treat that capture file as diagnostic output rather than as proof of semantic success.

If a migrated parity-critical flow needs both actor and move semantics, the split must stay explicit. The actor owns mutable lifecycle state. Move-owned handles and tokens define which caller may advance or transfer it. If that split is not explicit, the flow is not considered correct by construction.

Parity-Critical Observation

UiSnapshot and render-convergence data are authoritative. Observation surfaces must be side-effect free. Recovery and retries must be explicit and separate from observation.

Browser ui_state remains observation-only and must not perform implicit navigation or state recovery. Explicit recovery goes through recover_ui_state and readStructuredUiStateWithNavigationRecovery(...). DOM and text fallback paths are diagnostics only and must not become success-path observation behavior.

Browser semantic observation must fail closed when the published snapshot is unavailable. It must not silently repair by reading a live controller or model snapshot behind the harness bridge. Channel-binding responses must either carry authoritative context materialization or fail explicitly. Selected ids or labels alone are not semantic bindings.

Channel list item ids and selected-channel snapshot ids must stay keyed by canonical channel ids when the runtime projection already provides them. Harness and browser code should not round-trip through display labels on those paths. Diagnostic tool and query surfaces should say diagnostic_* at the API boundary when they are derived from screen or DOM capture rather than authoritative semantic state. Onboarding must publish through the same semantic snapshot path as the rest of the UI.

Placeholder IDs, override-backed exports, and heuristic success or event synthesis are not acceptable correctness paths.

Parity-Critical Waits and Assertions

Waits must bind to declared readiness, event, or quiescence conditions. They may also bind to typed operation handles or strictly newer authoritative projections when the shared contract defines them.

When a runtime bridge surface exposes typed lifecycle such as DiscoveryTriggerOutcome, CeremonyProcessingOutcome, or an explicit mutation outcome, tests should assert those variants directly. Do not treat a unit success result as sufficient proof of progress. Executor-side follow-on waits should carry typed submission evidence from the issued receipt into the declared contract barriers. Do not keep a second harness-local convergence graph.

Projection-based semantic waits may resume across bounded browser or runtime restarts only by clearing stale freshness baselines and re-entering typed snapshot observation. Runtime-event, toast, and exact operation-state waits still fail closed across restarts. Semantic issue success must come from typed command receipts and authoritative runtime facts, not from visible homes, modal closure, message appearance, selected-list state, or a frontend-local submitting phase.

Shared semantic harness core should decode typed ToolPayload and bridge structs directly. Keep raw serde_json::Value plumbing at outer CLI and browser adapters only. Raw sleeps, redraw polling, DOM scraping, and fallback text matching are diagnostics only.

Scenario-language text assertions must keep the same split. message_contains means authoritative UiSnapshot.messages. diagnostic_screen_contains is frontend-conformance-only rendered text. Harness mode may change instrumentation and render stability, but it must not change business-flow semantics.
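
As a hedged illustration of that split, a scenario fragment might look like the following. The key names here are invented for the example and are not the actual scenario_contract schema:

```toml
# Hypothetical scenario fragment; field names are illustrative only.
[[steps]]
action = "send_message"
channel = "general"
content = "hello"

[[steps]]
# Authoritative assertion: checks UiSnapshot.messages, not rendered text.
expect = "message_contains"
value = "hello"

[[steps]]
# Frontend-conformance only: rendered-screen diagnostic, never a shared
# success oracle.
expect = "diagnostic_screen_contains"
value = "hello"
```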

Ownership Test Expectations

When a change introduces or modifies a parity-critical ownership boundary, the test plan should include compile-fail tests for private constructors, capability misuse, or stale move-owned handles where the boundary is type-enforced. It should include invariant tests proving owner drop reaches explicit failure or cancellation. It should include invariant tests proving terminal lifecycle does not regress on the same logical instance.

It should also include invariant tests proving observed layers do not author semantic lifecycle. Include tests proving frontend-local submission yields immediately to the app-owned workflow owner after handoff. Include timeout and backoff tests proving local wall-clock policy only changes budget and diagnostics, not semantic success or failure rules. Run the relevant ownership and time just ci-* policy checks in addition to crate tests.

For shared semantic workflow changes, aura-app::workflows is the authoritative publication owner. aura-terminal, aura-web, and aura-harness must not retain a parallel terminal publication path after handoff. Review and test plans should name the terminal owner explicitly and treat frontend layers as submit and observe boundaries.

Use physical time for local deadline and backoff policy. Do not use wall-clock timeouts as the primary proof of distributed completion or ordering.

Failure Analysis

Prefer canonical action, event, and state traces along with structured timeout diagnostics. Treat final text or screenshot inspection as supporting evidence, not the primary oracle. Replay bundles should compare typed tool-response payload meaning, not just a binary success-versus-error shape.

Ownership Cleanup Discipline

Every shared UX or harness contract hardening change should remove obsolete compatibility code, stale allowlist entries, and transitional comments in the same milestone or the next explicit cleanup pass. Prefer extending typed governance in cargo run -p aura-harness --bin aura-harness --quiet -- governance ... over adding standalone shell policy logic.

Each parity-critical ownership change must include explicit cleanup work for the abstraction it replaces. Do not treat the ownership model as additive.

For every migrated flow, delete actor wrappers around purely local or value transitions that should stay Pure. Delete shared mutable ownership state where a MoveOwned handoff or owner-token surface is the correct model. Delete detached callback or task ownership for state that should instead live under one ActorOwned coordinator.

If a change leaves one of those old abstractions in place, record it as explicit ownership cleanup debt with the owning module and removal milestone. Do not hide it behind temporary ambient lifecycle helpers, duplicate readiness emitters, or shell-local terminal state.

The authoritative written update map for these surfaces now lives in Aura's toolkit/xtask user-flow guidance sync check and is enforced by just ci-user-flow-policy. Ownership-model policy for the shared semantic lane is enforced through the final CI entrypoints just ci-ownership-policy, just ci-harness-ownership-policy, and just ci-user-flow-policy.

Testing and Enforcement Split

Prefer trybuild compile-fail coverage when the misuse is fundamentally an API-shape or visibility violation. Prefer Rust-native lint binaries in aura-macros when the misuse is a syntax-level boundary or naming and flow-shape rule. Keep shell scripts for repo-wide governance, integration topology, or end-to-end harness policy that cannot realistically be proved at compile time. When a stronger contract lands, remove the superseded legacy helper, compatibility branch, migration shim, or stale regression fixture rather than leaving both paths active.

The authoritative frontend matrix for converted shared scenarios comes from scenarios/harness_inventory.toml and is enforced by just ci-harness-matrix-inventory. Allowlisted harness-mode hooks must carry explicit owner, justification, and design-note references enforced by Aura's toolkit/xtask user-flow policy guardrails. The diff-aware user-flow policy lane must tolerate empty local diff sets so just ci-user-flow-policy fails on real policy drift rather than environment-specific diff resolution.

Changes to the browser harness bridge request, response, or observation surface must update both crates/aura-web/ARCHITECTURE.md and this guide so compatibility expectations stay explicit.

For service-family work, keep the test/evidence split concrete:

  • type/API and proc-macro boundaries first
  • trybuild compile-fail tests for misuse that should not compile
  • aura-macros lint binaries for syntax-owned rules
  • thin shell scripts only for integration-wide or artifact-governance checks

The default contributor verification path for this class of change is:

  1. just lint-arch-syntax
  2. just ci-ownership-policy
  3. just check-arch
  4. just ci-adaptive-privacy-tuning if the change touches adaptive privacy policy constants, simulator evidence, or telltale-backed control-plane parity

When a new Establish, Move, or Hold surface lands, the same change should also add:

  • the service-surface declaration macros
  • compile-fail or invariant coverage for the strongest rejectable misuse
  • any required aura-macros lint coverage or thin script glue
  • the updated crate ARCHITECTURE.md and authoritative docs
  • removal of the superseded compatibility helper or explicit inventorying of the deferred cleanup

Browser Compatibility Surface

The browser compatibility surface includes the explicit stage_runtime_identity bootstrap handoff entrypoint plus the page-owned semantic submission queue (window.__AURA_DRIVER_SEMANTIC_ENQUEUE__). The browser also publishes page-owned semantic submit readiness metadata. This includes whether the enqueue surface is installed (enqueue_ready), the active vs ready generation boundary, controller presence, current shell phase, and any in-flight bootstrap transition detail. Driver startup and recovery waits bind to product-owned bootstrap and rebinding state instead of stale driver-local probes.

The bootstrap staging and handoff promise is completion-based. Callers may treat it as confirmation that the owned bootstrap or rebootstrap transition finished, not merely that the request was queued. For generation-changing bootstrap flows, that completion means the new page-owned shell generation has published its semantic snapshot through the canonical publication path. The browser diagnostics window.__AURA_UI_ACTIVE_GENERATION__ and window.__AURA_UI_READY_GENERATION__ reflect the active vs ready generation boundary. Render heartbeat remains the separate browser render-convergence signal.

Browser bootstrap storage is explicit. Preserved-profile recovery correctness depends on the typed selected runtime identity, pending bootstrap metadata, and browser-local AccountConfig metadata remaining distinct. The next generation must recover the canonical runtime bootstrap path without falling back to browser-local semantic repair. Channel-returning bridge responses now distinguish weak selected-channel ids from authoritative channel bindings. A payload that lacks context is not a binding.

Browser bootstrap broker credentials are intentionally not query-string parameters. Tests may stage the broker URL through controlled bootstrap setup, but bearer and invitation-retrieval tokens must use session-scoped browser storage or header-bearing runtime configuration. Harnesses should assert this through the browser storage/bridge contract rather than by inspecting URL parameters.

Browser harness failures surface explicit publication-state diagnostics through window.__AURA_UI_PUBLICATION_STATE__, window.__AURA_RENDER_HEARTBEAT_PUBLICATION_STATE__, and the page-owned semantic submit publication surface. Those globals are diagnostic-only and do not replace the authoritative UiSnapshot and RenderHeartbeat payloads. They are the canonical observed source for browser bootstrap and rebinding state. Driver-owned restart_page_session is infrastructure recovery only. Semantic command submission and runtime-identity staging must wait on or fail from the page-owned publication contract rather than replaying work through a restarted browser session.

submitSemanticCommand follows that rule directly. After bounded same-page recovery it must fail closed instead of replaying the semantic request through a fresh browser session. The browser publication owner classifies diagnostics by typed publication status, binding mode, and reliability before serializing them to page globals. Compatibility-sensitive waits keep one canonical publication path instead of ad hoc string assembly.

Browser semantic navigation follows the same separation. Page-owned navigation helpers such as navigate_screen and settings-section opening may publish the target UiSnapshot before the browser finishes painting the new screen. Harness navigation success must therefore wait for both the target semantic screen and the matching post-render RenderHeartbeat or equivalent render-convergence proof before treating the control activation as complete. DOM selectors remain diagnostic corroboration only; they must not replace the semantic-plus-render contract.

Browser-owned semantic snapshot publication should flow through one helper aligned with UiController::publish_ui_snapshot. Browser-owned maintenance polling should share one bounded helper for sleep, cancellation, and pause reporting so those paths stay uniform and clearly non-semantic. Parity exceptions must remain typed metadata in aura-app::ui_contract with a reason code, scope, affected surface, and authoritative doc reference.

Browser-owned async account/bootstrap flows must also fail closed on shell-state publication. If a Dioxus signal write collides with an unmounting or busy component, the browser shell may retry on the next browser tick for the active generation, but it must not silently drop the state transition.

Browser harness mode now has an authenticated runtime bootstrap rule as well. When the wasm agent runtime is launched under the browser harness, the authenticated query parameters __aura_harness_instance and __aura_harness_token are part of the canonical harness-mode contract. The browser shell and runtime must agree on that authenticated handoff before taking any harness-only invitation or device-enrollment relaxation path, and browser build or cache reuse must preserve the web,harness feature set that installs that bridge.

Shared-Flow Coverage Anchors

The canonical shared-flow coverage anchors for the current parity-critical user flows are listed below.

  • real-runtime-mixed-startup-smoke.toml for startup, onboarding, and shared neighborhood navigation
  • scenario13-mixed-contact-channel-message-e2e.toml for the shared chat, contacts, invitation, home creation, channel join, and message-send flow
  • scenario12-mixed-device-enrollment-removal-e2e.toml for device add and remove
  • shared-notifications-and-authority.toml and shared-settings-parity.toml for the remaining shared settings, authority, and navigation flows
  • amp-transition-normal-shared.toml, amp-transition-delayed-witness-shared.toml, amp-transition-conflict-subtractive-shared.toml, amp-transition-emergency-shared.toml, and amp-transition-negative-shared.toml for shared AMP transition observation coverage

The current aura-app split keeps those anchors unchanged while moving the authoritative flow owners into more specific modules. Shared-flow source-area metadata should point at the owner modules that now carry those flows: workflows/context/neighborhood.rs for neighborhood/home creation, workflows/invitation/{create,accept,readiness}.rs for contacts and invitation acceptance, and workflows/messaging/{channel_refs,channels,send}.rs for chat navigation, join, and message-send paths. The aura-app::ui_contract facade remains the canonical export surface for that coverage metadata.

Scenario 12 has an additional browser parity rule now. The shared semantic snapshot does not fabricate a selected row for ListId::Devices; current-device markers and removable-device targeting remain separate concepts. Browser harness submission for remove_selected_device must therefore fall back to the authoritative removable device from settings state when the snapshot has no explicit list selection, and the canonical mixed-runtime anchor remains scenario12-mixed-device-enrollment-removal-e2e.toml.

Scenario 13 has an additional mixed-runtime receive contract now. On the current TUI/browser path, authoritative inbound shared-channel messages may surface as sealed placeholders rather than plaintext payloads. Harness assertions for scenario13-mixed-contact-channel-message-e2e.toml should treat the [sealed: prefix as the canonical browser/TUI parity expectation for those receives instead of requiring renderer-local plaintext recovery.

Harness-mode timing exceptions remain narrowly allowlisted. The current shared allowlist includes the browser maintenance cadence plus the runtime and workflow instrumentation hooks that feed observed-shell timing helpers; those branches may tune observation cadence only and must not change business-flow semantics.

Shared pending-invitation acceptance has an additional invariant now: SemanticOperationKind::AcceptPendingChannelInvitation entry points must not strand the authoritative semantic lifecycle at SemanticOperationPhase::WorkflowDispatched when the shared browser/TUI flow fails before the owned accept path settles. If an early error escapes the owned path, the wrapper or *_with_instance entry point must synthesize the terminal failure publication for the same operation instance before returning to the shell or harness. Scenario 13 remains the canonical mixed-runtime anchor for that browser shared-channel receive parity.

Note-to-self is a real AMP channel provisioned at account bootstrap, not a display-only entry. It appears as a first-class channel backed by the runtime from first use, with its own context, deterministic channel ID, and standard message delivery. Channel creation parity coverage must not treat "has at least one contact" as a prerequisite for opening chat creation. TUI and web shells should expose the same semantic create-channel path when the only available participant is self, and scenario coverage should keep that path distinct from pairwise or group-member invitation flows.

The notifications shared-flow anchor remains navigation-only. Parity coverage for notifications navigation requires the TUI and web shells to expose the same semantic screen transition and detail-view contract, but notification empty-state copy is informational only and must not introduce parity-critical invitation or recovery actions outside the canonical shared workflows.

AMP channel transition frontend coverage uses the same semantic observation lane. Transition state, live successor and finalization state, conflict evidence, emergency quarantine, cryptoshred status, and suspect exclusion must be asserted through RuntimeFact::AmpChannelTransitionUpdated entries in UiSnapshot.runtime_events plus shared notification list ids.

Web and TUI frontends may render local affordances for emergency alarm, quarantine approval, cryptoshred approval, conflict evidence, and finalization status, but the controls and operation ids must come from aura-app::ui_contract. Tests must not infer AMP send or receive authority from local message-ratchet state or frontend-specific text. Destructive cryptoshred affordances must surface an explicit confirmation label and the loss of pre-emergency readability.

Native invitation and device-enrollment exports have an additional transport contract now. In non-wasm runs, sender_hint is a transport hint and must use the canonical tcp://host:port form rather than websocket-style ws:// or wss:// URLs. LAN integration and harness assertions should treat that field as a native direct-transport hint, not as a browser transport endpoint. When the runtime has both a stored rendezvous descriptor and a LAN-discovered descriptor for the same peer, invitation seeding should prefer the discovered descriptor if it adds a TcpDirect transport hint that the stored descriptor lacks so native shared-flow tests continue to exercise the direct LAN path.

Shared Semantic Ownership Inventory

Use this as the authoritative ownership map for the shared semantic stack. If code does not match this table, treat it as ownership cleanup debt rather than as an acceptable alternate pattern.

  • Semantic command / handle contract (aura-app::ui_contract, aura-app::scenario_contract): Pure + MoveOwned. Authoritative owner: aura-app contract surfaces. May mutate: aura-app contract and workflow modules. May observe: aura-terminal, aura-web, aura-harness.
  • Semantic operation lifecycle (aura-app::workflows::*): MoveOwned. Authoritative owner: authoritative workflow coordinator. May mutate: workflow and coordinator modules in aura-app. May observe: frontend render crates, harness.
  • Channel / invitation / delivery readiness (aura-app::workflows::*): ActorOwned. Authoritative owner: single-owner readiness coordinator. May mutate: coordinator modules and sanctioned hooks. May observe: shell, subscription, render, harness.
  • Runtime-facing async service state (aura-agent::runtime::*, aura-agent::handlers::*): ActorOwned. Authoritative owner: runtime service actor. May mutate: actor and sanctioned commands. May observe: aura-app, frontends, harness.
  • TUI command ingress (aura-terminal::tui::harness_state, update loop): ActorOwned ingress + Observed rendering. Authoritative owner: TUI update and event loop. May mutate: ingress and update-loop code only. May observe: shell render, harness.
  • TUI shell / callbacks / subscriptions (aura-terminal::tui::screens, callbacks): Observed. Authoritative owner: downstream of authoritative state. May mutate: local UI state only. May observe: harness, rendering.
  • Browser harness bridge (aura-web::harness_bridge): ActorOwned bridge + Observed publication. Authoritative owner: browser bridge module. May mutate: bridge module only. May observe: Playwright, harness, render.
  • Harness executor / wait model (aura-harness::executor, backend::*): Observed + orchestration ActorOwned. Authoritative owner: harness coordinator. May mutate: harness orchestration state only. May observe: scenario authors, CI.
  • Ownership transfer / stale-owner invalidation (operation handles, owner tokens): MoveOwned. Authoritative owner: current token holder. May mutate: sanctioned transfer APIs only. May observe: projections, render, diagnostics.

The required split is that actor-owned subsystems own long-lived mutable async state and lifecycle. Move-owned surfaces own exclusive right-to-act and ownership transfer. Observed surfaces render, wait, and diagnose without authoring semantic truth.

Do not use this table to justify ambient shared ownership. If a subsystem needs both actor and move semantics, the actor owns mutable lifecycle state while the move-owned handle or token defines who may advance or transfer it.

Reactive Subscription Policy

Subscribing before registration must fail with ReactiveError::SignalNotFound. Tests must not treat an empty stream as equivalent to "signal not registered." Lagging subscribers are allowed to miss intermediate updates. Assertions should target eventual newer snapshots, not lossless delivery.

TUI-local semantic submission is limited to the sanctioned local-terminal and workflow-handoff owner wrappers. Browser bridge concurrency is limited to WebTaskOwner and does not own parity-critical lifecycle. Playwright stages browser runtime identity through the explicit bridge entrypoint before rebootstrap and submits semantic commands through the page-owned semantic queue.

Authoritative readiness refresh remains private to aura-app::workflows and is compile-fail tested in both default and signals configurations.

Required Ownership Invariants

Ownership-model migrations are not complete until the following test classes exist for the affected parity-critical surface. Include compile-fail guards for private constructors, wrong-capability issuance, and stale-owner misuse where the boundary is enforced in types. Include dynamic invariant tests proving owner drop reaches explicit terminal failure or cancellation. Include dynamic invariant tests proving terminal states do not regress on the same logical operation instance.

Include handle and instance tests proving stale handles do not match or advance the wrong operation instance after transfer or replacement. Include concurrency tests for actor-owned coordinators where lost updates or multiple live owners are plausible. Include timeout and backoff invariant tests proving typed timeout failure, remaining-budget propagation, bounded attempts, and local-choice scaling.

If a flow changes ownership model or timeout policy and these test classes do not move with it, treat the migration as incomplete.

Release and Update Matrix Expectations

OTA and module release and update validation must follow the same semantic-lane contract as other parity-critical shared flows. The OTA contract requirements are defined in Distributed Maintenance Architecture. These include typed command and control surfaces, scoped activation lifecycle, and rollback semantics. Each release row in UX Flow Coverage Report must map to those typed lifecycle surfaces.

Frontend-conformance coverage may validate release-screen wiring, but it does not satisfy OTA or module lifecycle validation on its own.

2. The #[aura_test] Macro

The macro provides async test setup with tracing and timeout.

#![allow(unused)]
fn main() {
use aura_macros::aura_test;
use aura_testkit::*;

#[aura_test]
async fn test_basic_operation() -> aura_core::AuraResult<()> {
    let fixture = create_test_fixture().await?;
    let result = some_operation(&fixture).await?;
    assert!(result.is_valid());
    Ok(())
}
}

The macro wraps the test body with tracing initialization and a 30-second timeout. Create fixtures explicitly rather than relying on ambient test state.

3. Test Fixtures

Fixtures provide consistent test environments with deterministic configuration.

Creating Fixtures

#![allow(unused)]
fn main() {
use aura_testkit::infrastructure::harness::TestFixture;

let fixture = TestFixture::new().await?;
let device_id = fixture.device_id();
let context = fixture.context();
}

The TestFixture provides a pre-configured environment with deterministic identifiers and effect handlers.

Custom Configuration

#![allow(unused)]
fn main() {
use aura_testkit::infrastructure::harness::{TestFixture, TestConfig};
use std::time::Duration;

let config = TestConfig {
    name: "threshold_test".to_string(),
    deterministic_time: true,
    capture_effects: true,
    timeout: Some(Duration::from_secs(60)),
};
let fixture = TestFixture::with_config(config).await?;
}

Custom configuration controls deterministic time, effect capture, and per-test timeouts. Name each configuration to aid failure identification.

Deterministic Identifiers

#![allow(unused)]
fn main() {
use aura_core::types::identifiers::AuthorityId;

let auth1 = AuthorityId::from_entropy([1u8; 32]);
let auth2 = AuthorityId::from_entropy([2u8; 32]);
}

Incrementing byte patterns create distinct but reproducible identifiers. This ensures tests produce the same identifiers across runs.

4. Unit Tests

Unit tests validate individual functions or components.

#![allow(unused)]
fn main() {
#[aura_test]
async fn test_single_function() -> aura_core::AuraResult<()> {
    let fixture = create_test_fixture().await?;
    let input = TestInput::new(42);
    let output = process_input(&fixture, input).await?;
    assert_eq!(output.value, 84);
    Ok(())
}
}

Each unit test should be fast and focused, testing one behavior per function. Name tests descriptively to communicate the expected behavior.

5. Integration Tests

Integration tests validate complete workflows across multiple components.

#![allow(unused)]
fn main() {
use aura_agent::runtime::AuraEffectSystem;
use aura_agent::AgentConfig;
use aura_core::DeviceId;

#[aura_test]
async fn test_threshold_workflow() -> aura_core::AuraResult<()> {
    let fixture = create_test_fixture().await?;
    let device_ids: Vec<_> = (0..5)
        .map(|i| DeviceId::new_from_entropy([i as u8 + 1; 32]))
        .collect();

    let effect_systems: Result<Vec<_>, _> = (0..5)
        .map(|i| {
            AuraEffectSystem::simulation_for_named_test_with_salt(
                &AgentConfig::default(),
                "test_threshold_workflow",
                i as u64,
            )
        })
        .collect();

    let result = execute_protocol(&effect_systems?, &device_ids).await?;
    assert!(result.is_complete());
    Ok(())
}
}

Use simulation_for_test* helpers for all tests. For multi-instance tests from one callsite, use simulation_for_named_test_with_salt(...) and keep the identity and salt stable. This allows failures to be replayed deterministically.

6. Property-Based Testing

Property tests validate invariants across diverse inputs using proptest.

Synchronous Properties

#![allow(unused)]
fn main() {
use proptest::prelude::*;

fn arbitrary_message() -> impl Strategy<Value = Vec<u8>> {
    prop::collection::vec(any::<u8>(), 1..=1024)
}

proptest! {
    #[test]
    fn message_roundtrip(message in arbitrary_message()) {
        let encoded = encode(&message);
        let decoded = decode(&encoded).unwrap();
        assert_eq!(message, decoded);
    }
}
}

Synchronous property tests verify that invariants hold across randomly generated inputs. The arbitrary_message strategy produces byte vectors of varying length.

Async Properties

#![allow(unused)]
fn main() {
proptest! {
    #[test]
    fn async_property(data in arbitrary_message()) {
        tokio::runtime::Runtime::new().unwrap().block_on(async {
            let fixture = create_test_fixture().await.unwrap();
            let result = async_operation(&fixture, data).await;
            assert!(result.is_ok());
        });
    }
}
}

Async properties require creating a Tokio runtime inside the test body because proptest does not natively support async closures.

7. GuardSnapshot Pattern

The guard chain separates pure evaluation from async execution. This enables testing guard logic without an async runtime.

Testing Pure Guard Logic

#![allow(unused)]
fn main() {
#[test]
fn test_cap_guard_denies_unauthorized() {
    let snapshot = GuardSnapshot {
        capabilities: vec![],
        flow_budget: FlowBudget { limit: 100, spent: 0, epoch: 0 },
        ..Default::default()
    };
    let result = CapGuard::evaluate(&snapshot, &SendRequest::default());
    assert!(result.is_err());
}
}

This test verifies that the capability guard rejects requests when no capabilities are present. The GuardSnapshot captures the state needed for pure evaluation.

Testing Flow Budget

#![allow(unused)]
fn main() {
#[test]
fn test_flow_guard_blocks_over_budget() {
    let snapshot = GuardSnapshot {
        flow_budget: FlowBudget { limit: 100, spent: 95, epoch: 0 },
        ..Default::default()
    };
    let request = SendRequest { cost: 10, ..Default::default() };
    let result = FlowGuard::evaluate(&snapshot, &request);
    assert!(matches!(result.unwrap_err(), GuardError::BudgetExceeded));
}
}

This test verifies that the flow guard blocks sends when the requested cost would exceed the remaining budget.

8. TUI and CLI Testing

TUI State Machine Tests

#![allow(unused)]
fn main() {
mod support;
use support::TestTui;
use aura_terminal::tui::screens::Screen;

#[test]
fn test_screen_navigation() {
    let mut tui = TestTui::new();
    tui.assert_screen(Screen::Block);
    tui.send_char('2');
    tui.assert_screen(Screen::Neighborhood);
}
}

TUI state machine tests validate screen transitions and keyboard input handling without requiring a real terminal.

CLI Handler Testing

#![allow(unused)]
fn main() {
use aura_terminal::handlers::{CliOutput, HandlerContext};

#[tokio::test]
async fn test_status_handler() {
    let ctx = create_test_handler_context().await;
    let output = status::handle_status(&ctx).await.unwrap();
    let lines = output.stdout_lines();
    assert!(lines.iter().any(|l| l.contains("Authority")));
}
}

CLI handler tests exercise command handlers in isolation and assert against structured output lines.

Quint Trace Usage

Quint traces are model artifacts. Export them through the shared semantic scenario contract and execute real TUI and web flows through aura-harness rather than replaying Quint traces directly against the TUI implementation.

9. Conformance Testing

Conformance tests validate that implementations produce identical results across environments.

Conformance Lanes

CI runs two lanes. The strict lane compares native vs WASM cooperative execution. The differential lane compares native threaded vs cooperative execution.

# Strict lane
just ci-conformance-strict

# Differential lane
just ci-conformance-diff

# Both lanes
just ci-conformance

These commands run the conformance test suite and report any divergence between execution environments.

Mismatch Taxonomy

Type | Description | Fix
--- | --- | ---
strict | Byte-level difference | Remove hidden state or ordering-sensitive side effects
envelope_bounded | Outside declared envelopes | Add or correct envelope classification
surface_missing | Required surface not present | Emit observable, scheduler_step, and effect

Reproducing Failures

AURA_CONFORMANCE_SCENARIO=scenario_name \
AURA_CONFORMANCE_SEED=42 \
cargo test -p aura-agent \
  --features choreo-backend-telltale-machine \
  --test telltale_machine_parity test_name \
  -- --nocapture

Set the scenario name and seed to reproduce a specific conformance failure deterministically.

10. Runtime Harness

The runtime harness executes real Aura instances in PTYs for end-to-end validation.

Harness Overview

The harness is the single executor for real frontend scenarios. Scripted mode uses the shared semantic scenario contract. Agent mode uses LLM-driven execution toward goals.

Shared flows should be authored semantically once, then executed through the harness using either the TUI or browser driver. Do not create a second frontend execution path for MBT or simulator replay. Core shared scenarios should use semantic actions and state-based assertions. Avoid raw selector steps, raw press_key steps, and label-based browser clicks except in dedicated low-level driver tests.

Run Config

schema_version = 1

[run]
name = "local-loopback-smoke"
pty_rows = 40
pty_cols = 120
seed = 4242

[[instances]]
id = "alice"
mode = "local"
data_dir = "artifacts/harness/state/local-loopback/alice"
device_id = "alice-dev-01"
bind_address = "127.0.0.1:41001"

The run config declares the execution environment. Each instance gets a unique data directory, device ID, and bind address. The seed ensures deterministic behavior across runs.

Scenario File

id = "discovery-smoke"
goal = "Validate semantic harness observation against a real TUI"

[[steps]]
id = "launch"
action = "launch_actors"
timeout_ms = 5000

[[steps]]
id = "nav-chat"
actor = "alice"
action = "navigate"
screen_id = "chat"
timeout_ms = 2000

[[steps]]
id = "chat-ready"
actor = "alice"
action = "readiness_is"
readiness = "ready"
timeout_ms = 2000

Scenarios define ordered steps with timeouts. Each step targets a specific actor and asserts a condition or triggers an action.

Running the Harness

# Lint before running
just harness-lint -- --config configs/harness/local-loopback.toml \
  --scenario scenarios/harness/semantic-observation-tui-smoke.toml

# Execute
just harness-run -- --config configs/harness/local-loopback.toml \
  --scenario scenarios/harness/semantic-observation-tui-smoke.toml

# Replay for deterministic reproduction
just harness-replay -- --bundle artifacts/harness/local-loopback-smoke/replay_bundle.json

Always lint before running to catch configuration errors early. Use replay bundles to reproduce failures deterministically.

Interactive Mode

Use tool_repl for manual validation.

cargo run -p aura-harness --bin tool_repl -- \
  --config configs/harness/local-loopback.toml

The REPL accepts JSON requests for screen inspection, key input, and wait conditions.

{"id":1,"method":"screen","params":{"instance_id":"alice"}}
{"id":2,"method":"send_keys","params":{"instance_id":"alice","keys":"3n"}}
{"id":3,"method":"wait_for","params":{"instance_id":"alice","pattern":"Create","timeout_ms":4000}}

These requests query screen state, send key sequences, and wait for patterns to appear in the rendered output.

Harness CI

just ci-harness-build
just ci-harness-contract
just ci-harness-replay
just ci-harness-matrix
just ci-shared-flow-policy

These commands build the harness, validate the contract, replay recorded coverage, run the full shared frontend matrix, and enforce shared-flow policy.

just ci-shared-flow-policy validates the shared-flow contract end to end. It checks that aura-app shared-flow support declarations are internally consistent. It verifies that every fully shared flow has explicit parity-scenario coverage and that required shell and modal ids still exist. It confirms browser control and field mappings still line up with the shared contract and that core shared scenarios have not drifted back to raw mechanics. The shared-flow aggregate now calls Aura policy code for the adaptive-privacy runtime-locality and legacy-sweep gates through toolkit/xtask, while the remaining shared-flow checks stay as thin shell orchestration around harness governance and targeted contract tests.

just ci-user-flow-policy is the diff-aware guidance gate for this surface. When shared UX contributor policy or parity-sensitive TUI and browser semantics change, update this guide in the same change so the user-flow guidance sync stays green. Local .claude skills may remain gitignored, but the authoritative contributor-facing testing guidance must still land in tracked docs.

The shared-flow policy scripts target the published Cargo package names for renamed Layer 6 crates and macros. When invoking raw Cargo commands behind these lanes, use hxrts-aura-app and hxrts-aura-macros package ids instead of the legacy aura-app and aura-macros selectors. File-system crate paths remain crates/aura-app and crates/aura-macros.

When shared flows export data through runtime events, the event payload is part of the contract. Invitation and device-enrollment code capture should come from RuntimeFact payloads in UiSnapshot.runtime_events, not clipboard scraping or frontend-local heuristics. Shared chat waits should bind to semantic selection state so the harness targets the single shared channel instead of falling back to incidental render order.

AMP transition waits follow that runtime-event rule. Shared scenarios for normal transition, delayed or offline witnesses, conflicting A2 certificates, subtractive membership, emergency quarantine, cryptoshred, rejected emergency attempts, cooldowns, duplicate-signing evidence, recovery replay, and authority-governance non-removal should wait on RuntimeEventKind::AmpChannelTransitionUpdated, parity snapshots, operation lifecycle, quiescence, and final reduced channel state.

These scenarios must stay actor-based and semantic-only. Raw DOM selectors, PTY keys, compatibility steps, and label-only assertions are diagnostic or frontend-conformance tools. They are not shared AMP evidence.

Use just ci-ui-parity-contract for the narrower parity gate. That lane validates shared screen and module mappings, shared-flow scenario coverage, and parity-manifest consistency without running a full scenario matrix.

11. Test Organization

Organize tests by category within each crate.

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    mod unit {
        #[aura_test]
        async fn test_single_function() -> aura_core::AuraResult<()> { Ok(()) }
    }

    mod integration {
        #[aura_test]
        async fn test_full_workflow() -> aura_core::AuraResult<()> { Ok(()) }
    }

    mod properties {
        proptest! {
            #[test]
            fn invariant_holds(input in any::<u64>()) {
                assert!(input == input);
            }
        }
    }
}
}

Grouping tests by category makes it easy to run subsets and understand coverage at a glance.

Running Tests

# All tests
just test

# Specific crate
just test-crate aura-agent

# With output
cargo test --workspace -- --nocapture

# TUI state machine tests
cargo test --package aura-terminal --test unit_state_machine

Use just test for the full suite. Use just test-crate for focused iteration on a single crate.

12. Best Practices

Test one behavior per function and name tests descriptively. Use fixtures for common setup. Prefer real handlers over mocks. Test error conditions explicitly. Avoid testing implementation details and focus on observable behavior.

Keep tests fast and parallelize independent tests.

13. Holepunch Backends and Artifact Triage

Use the harness --network-backend option to select execution mode.

# Deterministic local backend
cargo run -p aura-harness --bin aura-harness -- \
  run --config configs/harness/local-loopback.toml \
  --network-backend mock

# Native Linux Patchbay (requires Linux + userns/capabilities)
cargo run -p aura-harness --bin aura-harness -- \
  run --config configs/harness/local-loopback.toml \
  --network-backend patchbay

# Cross-platform VM runner (macOS/Linux)
cargo run -p aura-harness --bin aura-harness -- \
  run --config configs/harness/local-loopback.toml \
  --network-backend patchbay-vm

Patchbay is the authoritative NAT-realism backend for holepunch validation. Use native patchbay on Linux CI and for Linux developers when capabilities are available. Use patchbay-vm on macOS and as a Linux fallback to run the same scenarios in a Linux VM. Keep deterministic non-network logic in mock backend tests to preserve fast feedback.

patchbay-vm relies on the explicit harness work and artifact directories and on QEMU_VM_WORK_DIR. The removed .qemu-vm redirect path is no longer part of the supported workflow.

Backend Resolution

The harness writes backend resolution details to artifacts/harness/<run>/network_backend_preflight.json. Implementation follows three tiers. Tier 1 covers deterministic and property tests in aura-testkit for retry and path-selection invariants. Tier 2 covers Patchbay integration scenarios in aura-harness for PR gating. Tier 3 covers Patchbay stress and flake detection suites on scheduled CI.

Triaging Failures

When a scenario fails, triage artifacts in this order.

  1. Check network_backend_preflight.json to confirm selected backend and fallback reason.
  2. Check startup_summary.json and scenario_report.json for run context and failing step.
  3. Check events.json and backend timeline artifacts for event ordering.
  4. Check namespace and network dumps and pcap files for packet and routing diagnosis.
  5. Check agent logs for authority-local failures and retry state transitions.

For harness-specific state debugging, treat timeout_diagnostics.json as the first failure bundle. It includes semantic state snapshots, render readiness, and runtime event history.

14. Browser Harness Workflow

Use this flow to run harness scenarios in browser mode with WASM and Playwright.

# 1) Check wasm/frontend compilation
just web-check

# 2) Install/update Playwright driver deps
cd crates/aura-harness/playwright-driver
npm ci
npm run install-browsers
npm test
cd ../..

# 3) Serve the web app
just web-serve

These steps verify WASM compilation, install Playwright dependencies, and start the web server.

In a second shell, run the browser scenarios.

# Lint browser run/scenario config
just harness-lint-browser scenarios/harness/semantic-observation-browser-smoke.toml

# Run browser scenarios
just harness-run-browser scenarios/harness/semantic-observation-browser-smoke.toml

# Replay the latest browser run bundle
just harness-replay-browser

Browser harness artifacts are written under artifacts/harness/browser/.

Debugging Browser Failures

Check web-serve.log for bundle and runtime startup issues. Check preflight_report.json for browser prerequisites including Node, Playwright, and app URL. Check timeout_diagnostics.json for authoritative and normalized snapshots and per-instance log tails. Playwright screenshots and traces are stored under each instance data_dir in playwright-artifacts/.

timeout_diagnostics.json is the primary authoritative failure bundle. It contains UiSnapshot, runtime event history through runtime_events, operation lifecycle and instance ids, and render and readiness diagnostics along with backend log tails.

For mixed-runtime debugging, inspect runtime_events before logs when a code exchange or chat handoff fails. The expected evidence is a typed event payload, the selected semantic target in the snapshot, and only then supporting browser or TUI render diagnostics. For browser runs, the harness observes the semantic state contract first and uses DOM and text fallbacks only for diagnostics. If semantic state and rendered UI diverge, treat that as a product or frontend contract bug rather than papering over it with text-based assertions.

Frontend Shell Roadmap

aura-ui is the shared Dioxus UI core. It supports web-first delivery today and future multi-target shells.

  1. aura-web (current): browser shell and harness bridge
  2. Desktop shell (future): desktop-specific shell reusing aura-ui
  3. Mobile shell (future): mobile-specific shell reusing aura-ui

Simulation Guide

This guide covers how to use Aura's simulation infrastructure for testing distributed protocols under controlled conditions.

When to Use Simulation

Simulation suits scenarios that unit tests cannot address. Use simulation for fault injection testing. Use it for multi-participant protocol testing. Use it for time-dependent behavior validation.

Do not use simulation for simple unit tests. Direct effect handler testing is faster and simpler for single-component validation. Do not treat simulation as the default end-to-end correctness oracle for user-facing flows. Aura's primary feedback loop remains the real-runtime harness running against the real software stack.

See Simulation Infrastructure Reference for the complete architecture documentation.

Simulation vs Harness

The simulation architecture is specified in Simulation Infrastructure Reference. The harness architecture is specified in User Flow Harness.

Use the real-runtime harness by default when validating product behavior through the TUI or webapp. Use simulation when you need deterministic virtual time, controlled network faults, scheduler control, or MBT and trace replay under constrained distributed conditions. Promote high-value simulation findings back into real-runtime harness coverage when the flow is user-visible or integration-sensitive.

Two Simulation Systems

The two simulation systems (TOML scenarios and Quint actions) are specified in Simulation Infrastructure Reference.

Use Case | System
--- | ---
End-to-end integration | TOML scenarios
Named fault injection | TOML scenarios
Conformance testing | Quint actions
State space exploration | Quint actions

When you need user-facing coverage, promote the scenario into the real-runtime harness lane after it is stable in simulation. Treat simulation as a substrate for controlled runtime conditions, not as the final UI executor.

TOML Scenario Authoring

Creating Scenarios

Scenario files live in the scenarios/ directory.

[metadata]
name = "recovery_basic"
description = "Basic guardian recovery flow"

[[phases]]
name = "setup"
actions = [
    { type = "create_participant", id = "owner" },
    { type = "create_participant", id = "guardian1" },
    { type = "create_participant", id = "guardian2" },
]

[[phases]]
name = "recovery"
actions = [
    { type = "run_choreography", choreography = "guardian_recovery", participants = ["owner", "guardian1", "guardian2"] },
]

[[properties]]
name = "owner_recovered"
property_type = "safety"

Each scenario has metadata, ordered phases, and property definitions.

Defining Phases

Phases execute in order. Each phase contains a list of actions.

[[phases]]
name = "fault_injection"
actions = [
    { type = "apply_network_condition", condition = "partition", duration = "5s" },
    { type = "advance_time", duration = "10s" },
]

Actions within a phase execute sequentially. Use multiple phases to organize complex scenarios.

Adding Fault Injection

Fault injection actions simulate adverse conditions.

[[phases]]
name = "chaos"
actions = [
    { type = "simulate_data_loss", participant = "guardian1", percentage = 50 },
    { type = "apply_network_condition", condition = "high_latency", duration = "30s" },
]

Available conditions include partition, high_latency, packet_loss, and byzantine.

Running Scenarios

cargo run --package aura-terminal -- scenario run scenarios/recovery_basic.toml

The scenario handler parses and executes the TOML file. Results report property pass/fail status.

Working with Handlers

Basic Handler Composition

#![allow(unused)]
fn main() {
use aura_simulator::handlers::SimulationEffectComposer;
use aura_core::DeviceId;

let device_id = DeviceId::new_from_entropy([1u8; 32]);
let composer = SimulationEffectComposer::for_testing(device_id).await?;
let env = composer
    .with_time_control()
    .with_fault_injection()
    .build()?;
}

The composer builds a complete effect environment from handler components.

Time Control

#![allow(unused)]
fn main() {
use aura_simulator::handlers::SimulationTimeHandler;
use std::time::Duration;

let mut time = SimulationTimeHandler::new();
let start = time.physical_time().await?;
time.jump_to_time(Duration::from_secs(60));
let later = time.physical_time().await?;
}

Simulated time advances only through explicit calls (jump_to_time) or sleep operations. This enables testing timeout behavior without delays.

Fault Injection

#![allow(unused)]
fn main() {
use aura_simulator::handlers::SimulationFaultHandler;
use aura_core::{AuraFault, AuraFaultKind, FaultEdge};
use std::time::Duration;

let faults = SimulationFaultHandler::new(42);

faults.inject_fault(
    AuraFault::new(AuraFaultKind::MessageDelay {
        edge: FaultEdge::new("alice", "bob"),
        min: Duration::from_millis(100),
        max: Duration::from_millis(500),
    }),
    None,
)?;

faults.inject_fault(
    AuraFault::new(AuraFaultKind::MessageDrop {
        edge: FaultEdge::new("alice", "bob"),
        probability: 0.1,
    }),
    None,
)?;
}

AuraFault is the canonical simulator fault model. Legacy scenario fault forms should be converted to AuraFault before injection or replay.

Triggered Scenarios

#![allow(unused)]
fn main() {
use aura_simulator::handlers::{
    SimulationScenarioHandler,
    ScenarioDefinition,
    TriggerCondition,
    InjectionAction,
};
use aura_core::{AuraFault, AuraFaultKind};
use std::time::Duration;

let handler = SimulationScenarioHandler::new(42);
handler.register_scenario(ScenarioDefinition {
    id: "late_partition".to_string(),
    name: "Late Partition".to_string(),
    trigger: TriggerCondition::AfterTime(Duration::from_secs(30)),
    actions: vec![InjectionAction::TriggerFault {
        fault: AuraFault::new(AuraFaultKind::NetworkPartition {
            partition: vec![vec!["device1".into(), "device2".into()], vec!["device3".into()]],
            duration: Some(Duration::from_secs(15)),
        }),
    }],
    duration: Some(Duration::from_secs(45)),
    priority: 10,
});
}

Triggered scenarios inject faults at specific times or protocol states.

Integrating Feature Crates

Layer 5 feature crates (sync, recovery, chat, etc.) integrate with simulation through the effect system. This section covers patterns for wiring feature crates into simulation environments.

Required Effects

Feature crates are generic over effect traits. Common requirements include:

Effect Trait | Purpose
--- | ---
NetworkEffects | Transport and peer communication
JournalEffects | Fact retrieval and commits
CryptoEffects | Hashing and signature verification
PhysicalTimeEffects | Timeouts and scheduling
RandomEffects | Nonce generation

Pass the effect system from aura-simulator or aura-testkit for deterministic testing. In production, use aura-agent's runtime effects.

Configuration for Simulation

Feature crates typically provide testing configurations that minimize timeouts and remove jitter:

#![allow(unused)]
fn main() {
use aura_sync::SyncConfig;

// Production: conservative timeouts, adaptive scheduling
let prod_config = SyncConfig::for_production();

// Testing: fast timeouts, no jitter, predictable behavior
let test_config = SyncConfig::for_testing();

// Validate before use
test_config.validate()?;
}

Environment variables (prefixed per-crate, e.g., AURA_SYNC_*) allow per-process tuning without code changes.

Guard Chain Integration

The guard chain sequence is specified in Authorization.

For simulation, capability checks rely on Biscuit tokens evaluated by AuthorizationEffects. Guard evaluators must be provided by the runtime before sync operations. Validation occurs before sending or applying any protocol data.

Observability in Simulation

Connect feature crates to MetricsCollector for simulation diagnostics:

#![allow(unused)]
fn main() {
use aura_core::metrics::MetricsCollector;

let metrics = MetricsCollector::new();
// Protocol timings, retries, and failure reasons flow to metrics
// Log transport and authorization failures for debugging
}

Safety Requirements

Feature crates must follow effect system rules. All I/O and timing must flow through effects with no direct runtime calls. Validate Biscuit tokens before accepting peer data. Enforce flow budgets and leakage constraints at transport boundaries.

Verify compliance before simulation:

just ci-effects

Debugging Simulations

Deterministic Configuration

Always use deterministic settings for reproducible debugging.

#![allow(unused)]
fn main() {
let config = SimulatorConfig {
    device_id: DeviceId::new_from_entropy([1u8; 32]),
    network: NetworkConfig::default(),
    enable_fault_injection: false,
    deterministic_time: true,
};
}

Deterministic identifiers and time enable exact failure reproduction.

Effect System Compliance

Verify protocol code follows effect guidelines before simulation.

just check-arch

The architecture checker flags direct time, randomness, or I/O usage. Non-compliant code breaks simulation determinism.

Monitoring State

#![allow(unused)]
fn main() {
let metrics = middleware.get_metrics();
println!("Messages: {}", metrics.messages_sent);
println!("Faults: {}", metrics.faults_injected);
println!("Duration: {:?}", metrics.simulation_duration);
}

Middleware metrics help identify unexpected behavior.

Common Issues

Flaky simulation results indicate non-determinism. Check for direct system calls, uncontrolled concurrency, and ordering assumptions.

Slow simulations usually indicate an inefficient fault configuration. Reduce fault rates for initial debugging and increase them for stress testing.
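As a generic Rust illustration of one ordering hazard: iterating a HashMap produces an unspecified order that can differ between runs, while a BTreeMap (or sorting before iteration) restores deterministic ordering:

```rust
use std::collections::BTreeMap;

fn main() {
    let mut peers = BTreeMap::new();
    peers.insert("carol", 3);
    peers.insert("alice", 1);
    peers.insert("bob", 2);

    // BTreeMap iterates in key order on every run, unlike HashMap.
    let order: Vec<&str> = peers.keys().copied().collect();
    assert_eq!(order, vec!["alice", "bob", "carol"]);
}
```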

Online Property Monitoring

The Aura simulator supports per-tick property monitoring through aura_simulator::AuraProperty, aura_simulator::AuraPropertyMonitor, and aura_simulator::default_property_suite(...).

The monitor checks properties on each simulation tick using PropertyStateSnapshot input. This includes events, buffer sizes, local-type depths, flow budgets, and optional session, coroutine, and journal snapshots.

#![allow(unused)]
fn main() {
use aura_simulator::{
    AuraPropertyMonitor, ProtocolPropertyClass, ProtocolPropertySuiteIds,
    PropertyMonitoringConfig, SimulationScenarioConfig,
};

let monitoring = PropertyMonitoringConfig::new(
    ProtocolPropertyClass::Consensus,
    ProtocolPropertySuiteIds { session, context },
)
.with_check_interval(1)
.with_snapshot_provider(|tick| build_snapshot_for_tick(tick));

let config = SimulationScenarioConfig {
    property_monitoring: Some(monitoring),
    ..SimulationScenarioConfig::default()
};

let results = env.run_scenario("consensus".into(), "with property checks".into(), config).await?;
assert!(results.property_violations.is_empty());
}

Default suites are available for consensus, sync, chat, and recovery protocol classes. Scenario results include properties_checked and property_violations for CI reporting.

Adaptive Privacy Simulation

Adaptive privacy validation uses simulation for cross-layer evidence. The simulator exercises anonymous path establishment, established-path reuse and expiry, move batching, cover traffic, selector-based retrieval, hold retention, provider saturation, partitions, churn, and long-offline bootstrap re-entry.

The adaptive privacy matrix should include small, medium, and large reachable sets. It should also include clustered social topologies, partition and heal cycles, sparse sync opportunities, low organic traffic, provider saturation, and stale-node return with dead remembered descriptors.

The evidence path tunes policy constants for production. Production ships one fixed adaptive policy. Development and simulation may vary constants such as cover floor, path-diversity floor, delay gain, hold retention, and retrieval rotation, but those tuning surfaces must not become user-facing production configuration.

The current fixed-policy evidence is anchored by crates/aura-simulator/tests/adaptive_privacy_phase_six.rs. That lane checks the tuned policy, archived control-plane reports, bootstrap-observer reports, Telltale-backed anonymous path establishment, and reply-block accountability.

Quint Integration

Quint actions enable model-based testing. See Verification and MBT Guide for complete workflows.

When to Use Quint

Use Quint actions for conformance testing against formal specifications. Use them for generative state exploration. Do not use them for simple integration tests.

Basic Trace Replay

#![allow(unused)]
fn main() {
use aura_simulator::quint::itf_loader::ITFLoader;
use aura_simulator::quint::generative_simulator::GenerativeSimulator;

let trace = ITFLoader::load("trace.itf.json")?;
let simulator = GenerativeSimulator::new(config)?;
let result = simulator.replay_trace(&trace).await?;
assert!(result.all_properties_passed());
}

Trace replay validates implementation against Quint model behavior.

AMP epoch-transition simulation has a focused regression lane in crates/aura-simulator/tests/amp_transition_scenarios.rs. It covers delayed witnesses, partitions that create conflicting A2 certificates, subtractive old-epoch receive cutover, emergency quarantine, emergency cryptoshred, replay, abort, supersession, and alarm-spam cases. Run it with cargo test -p aura-simulator --test amp_transition_scenarios; broader local development can continue to use just test-crate aura-simulator without requiring an exhaustive Quint/Apalache matrix.

Conformance Workflow

Simulation feeds native/WASM conformance testing. See Testing Guide for conformance lanes and corpus policy.

Generating Corpus

quint run --out-itf=trace.itf.json verification/quint/consensus/core.qnt

ITF traces from Quint become conformance test inputs.

Running Conformance

just ci-conformance

Conformance lanes compare execution across platforms using simulation-controlled environments.

Checkpoints, Contracts, and Shared Replay

Phase 0 hardening uses three simulator workflows as CI gates.

Checkpoint Snapshot Workflow

SimulationScenarioHandler supports portable checkpoint snapshots. The export_checkpoint_snapshot(label) function exports a serializable ScenarioCheckpointSnapshot. The import_checkpoint_snapshot(snapshot) function restores a checkpoint into a fresh simulator instance.

This enables baseline checkpoint persistence for representative choreography suites. It also supports restore-and-continue regression tests. Upgrade smoke tests can resume from pre-upgrade snapshots.

Use checkpoints when validating runtime upgrades or migration safety, not only end-to-end success.

Scenario Contract Workflow

Conformance CI includes scenario contracts for consensus, sync, recovery, and reconfiguration. Each bundle is validated over a seed corpus. Validation checks terminal status (AllDone), required labels observed in trace, and minimum observable event count.

Contract results are written as JSON artifacts. CI fails on any violation with structured output.

Shared Replay Workflow

Replay-heavy parity lanes should use shared replay APIs. Use run_replay_shared(...) and run_concurrent_replay_shared(...) for this purpose.

These APIs reduce duplicate replay state across lanes and keep replay artifacts compatible with canonical trace fragments. Conformance lanes also emit deterministic replay metrics artifacts so regressions in replay footprint are visible during CI review.

For fault-aware replays, persist entries + faults bundles and re-inject faults before replay. Use aura_testkit::ReplayTrace::load_file(...) to load traces. Use ReplayTrace::replay_faults(...) and aura_simulator::AsyncSimulatorHostBridge::replay_expected_with_faults(...) to replay with faults.

Differential Replay Workflow

Use aura_simulator::DifferentialTester to compare baseline and candidate conformance artifacts. Two profiles are available. The strict profile requires byte-identical surfaces. The envelope_bounded profile uses Aura law-aware comparison with commutative and algebraic envelopes. In supported Telltale 11-backed report lanes, this is the low-level surface comparator, not the theorem-facing semantic authority: use the parity report's upstream semantic summary and run context.

For Telltale 11-backed parity lanes, prefer aura_simulator::run_telltale_parity_with_runner(...) or aura_simulator::run_telltale_control_plane_with_runner(...) over manually assembling candidate upstream sidecar paths. These helpers invoke the configured Telltale runner command, attach the generated candidate run-output sidecar, and then emit the normal Aura parity report. Supported file lanes still require a baseline upstream run sidecar, and they attach optional decision/sweep sidecars when available. Override the command with AURA_TELLTALE_SIMULATOR_RUNNER when needed.

For environment-oriented simulation state, Aura now exposes a bridge layer that uses Telltale 11-aligned names for migrated slices. Prefer the scenario-run artifact lane over ad hoc handler inspection: after run_scenario(...), inspect SimulationResults::environment_artifacts and the persisted environment_snapshot.json / environment_trace.json files under the configured simulator artifact root. When richer Aura-only environment detail is needed, read the optional environment_overlay.json supplement instead of expanding the core bridge schema.

For comparative experiment work, use aura_simulator::run_adaptive_privacy_policy_sweep(...) plus aura_simulator::compare_policy_sweeps(...) instead of ad hoc local sweep scripts. Those APIs keep Aura-specific bindings on top of the shared Telltale execution machinery while emitting Aura-owned AuraSweepArchiveV1 and AuraPolicyDiffReportV1 artifacts.

For reusable regression bundles, use aura_simulator::run_suite_catalog(...) and aura_simulator::compare_suite_catalogs(...). The suite catalogs are Aura-owned, but execution still goes through the shared Telltale harness and the archived comparison surface is AuraSuiteTournamentReportV1.

For theorem-aware failure evidence, convert parity reports with aura_simulator::counterexample_from_parity_report(...) or aura_simulator::counterexample_from_control_plane_report(...). These helpers preserve the shared Telltale counterexample witness and record whether the observed mismatch is schedule noise only.

For parity debugging, run:

just ci-choreo-parity
aura replay --trace-file artifacts/choreo-parity/native_replay/<scenario>__seed_<seed>.json

The replay command validates required conformance surfaces and verifies stored step/run digests against recomputed values.

Best Practices

Start with simple scenarios and add faults incrementally. Use deterministic seeds. Capture metrics for analysis.

Prefer TOML scenarios for human-readable tests. Prefer Quint actions for specification conformance. Combine both for comprehensive coverage.

Verification and MBT Guide

This guide covers how to use formal verification and model-based testing to validate Aura protocols. It focuses on practical workflows with Quint, Lean, and generative testing.

Quint, simulator, and harness have distinct responsibilities. Quint defines models, traces, and invariants. The aura-simulator crate is a selectable deterministic runtime substrate. The aura-harness crate is the single executor for real TUI and web frontend flows. Shared semantic UI and scenario contracts live in aura-app.

aura-app is also the home of the shared-flow support and parity contract used by the real-runtime harness. This includes SharedFlowId, SHARED_FLOW_SUPPORT, SHARED_FLOW_SCENARIO_COVERAGE, UiSnapshot, semantic parity comparison helpers, and typed runtime event diagnostics.

When to Verify

Verification suits protocols with complex state machines or security-critical properties. Use Quint model checking for exhaustive state exploration. Use Lean proofs for mathematical guarantees. Use generative testing to validate implementations against models.

Unit tests suffice for simple, well-understood behavior. Do not over-invest in verification for straightforward code.

See Formal Verification Reference for the complete architecture documentation.

Writing Quint Specifications

Getting Started

Create a new specification in verification/quint/.

module protocol_example {
    type State = { phase: str, value: int }
    var state: State

    action init = {
        state' = { phase: "setup", value: 0 }
    }

    action increment(amount: int): bool = all {
        amount > 0,
        state' = { ...state, value: state.value + amount }
    }

    val nonNegative = state.value >= 0
}

Run quint typecheck to validate syntax. Run quint run to simulate execution.

Authority Model

Specifications should use AuthorityId for identity, not DeviceId. Model relational semantics without device-level details.

type AuthorityId = str
type Participant = { authority: AuthorityId, role: Role }

This aligns specifications with Aura's authority-centric design.

State Machine Design

Define clear phases with explicit transitions.

type Phase = Setup | Active | Completed | Failed

action transition(target: Phase): bool = all {
    state.phase != Completed,
    state.phase != Failed,
    validTransition(state.phase, target),
    state' = { ...state, phase: target }
}

Disallow transitions from terminal states. Validate transition legality explicitly.

Invariant Design

Define invariants before actions. Clear invariants guide action design.

val safetyInvariant = or {
    state.phase != Failed,
    hasRecoveryPath(state)
}

val progressInvariant = state.step < MAX_STEPS

Invariants should be checkable at every state. Avoid invariants that require execution history.

Harness Modules

Create harness modules for simulation and trace generation.

module harness_example {
    import protocol_example.*

    action register(): bool = init
    action step(amount: int): bool = increment(amount)
    action done(): bool = state.phase == Completed
}

Harnesses provide standardized entry points for tooling. They should emit semantic traces and invariants, not frontend-specific scripts or key sequences.

Model Checking Workflow

Type Checking

quint typecheck verification/quint/protocol_example.qnt

Type checking validates syntax and catches type errors. Run it before any other operation.

Simulation

quint run --main=harness_example verification/quint/protocol_example.qnt

Simulation executes random traces. It finds bugs quickly but does not provide exhaustive coverage.

Invariant Checking

quint run --invariant=safetyInvariant verification/quint/protocol_example.qnt

Invariant checking verifies properties hold across simulated traces.

Model Checking with Apalache

quint verify --max-steps=10 --invariant=safetyInvariant verification/quint/protocol_example.qnt

Apalache performs exhaustive model checking. It proves invariants hold for all reachable states up to the step bound.

Interpreting Violations

Violations produce counterexample traces. The trace shows the state sequence leading to the violated invariant.

[State 0] phase: Setup, value: 0
[State 1] phase: Active, value: 5
[State 2] phase: Active, value: -3  <- VIOLATION: nonNegative

Use counterexamples to identify specification bugs or missing preconditions.
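A counterexample like the one above can also be replayed mechanically. The following self-contained sketch (the State type here is illustrative, not an Aura API) locates the first violating state in a trace:

```rust
#[derive(Debug, Clone, PartialEq)]
struct State {
    phase: &'static str,
    value: i64,
}

fn non_negative(s: &State) -> bool {
    s.value >= 0
}

/// Returns the index of the first state that violates the invariant.
fn first_violation(trace: &[State], inv: impl Fn(&State) -> bool) -> Option<usize> {
    trace.iter().position(|s| !inv(s))
}

fn main() {
    // The counterexample from the text: value goes negative at state 2.
    let trace = vec![
        State { phase: "Setup", value: 0 },
        State { phase: "Active", value: 5 },
        State { phase: "Active", value: -3 },
    ];
    assert_eq!(first_violation(&trace, non_negative), Some(2));
    // The fix is typically a missing precondition on the action that
    // produced state 2, e.g. requiring `state.value + amount >= 0`.
}
```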

Generative Testing Workflow

Generative testing validates Rust implementations against Quint models.

For AMP channel epoch transitions, the Quint model in verification/quint/amp/channel.qnt includes transition identity, reducer states, A2Live, A2Conflict, A3Finalized, A3Conflict, abort, supersession, and emergency non-retroactivity semantics. The focused Rust conformance lane is cargo test -p aura-simulator --test amp_transition_scenarios; it compares proposal/certificate/finalization/abort/conflict sequences against a small deterministic oracle and exercises old-epoch acceptance boundaries.

The Trust Chain

Quint Specification
       │
       ▼ generates
   ITF Traces
       │
       ▼ replayed through
   Rust Effect Handlers
       │
       ▼ produces
   Property Verdicts

Each link adds verification value. Specifications validate design. Traces validate reachability. Replay validates implementation.

Generating Semantic Traces

just quint-semantic-trace spec=verification/quint/harness/flows.qnt \
  out=verification/quint/traces/harness_flows.itf.json

ITF traces capture semantic state sequences and non-deterministic choices. These traces are model artifacts. Real TUI and web execution belongs to the harness, which consumes the shared semantic scenario contract. Shared web/TUI parity assertions also run against the same UiSnapshot contract rather than renderer text or DOM structure.

Do not add direct Quint-to-TUI or Quint-to-browser execution paths. Quint should hand off semantic traces to the shared contract layer, then let the harness or simulator consume them through their own adapters.

For shared end-to-end flows, the harness contract is semantic and state-based. Do not introduce frontend-specific Quint replay formats that encode raw keypress sequences, browser selectors, or label-based button targeting. Those belong in driver adapters and diagnostics, not in the semantic trace contract.

Direct Conformance Testing

The recommended approach compares Rust behavior to Quint expected states.

#![allow(unused)]
fn main() {
use aura_simulator::quint::itf_loader::ITFLoader;

#[test]
fn test_matches_quint() {
    let trace = ITFLoader::load("trace.itf.json").unwrap();

    for states in trace.states.windows(2) {
        let pre = State::from_quint(&states[0]).unwrap();
        let action = states[1].meta.action.as_deref().unwrap();

        let actual = apply_action(&pre, action).unwrap();
        let expected = State::from_quint(&states[1]).unwrap();

        assert_eq!(actual, expected);
    }
}
}

This tests production code directly. Quint serves as the single source of truth.

Generative Exploration

For state space exploration with Rust-driven non-determinism, use the action registry.

#![allow(unused)]
fn main() {
use aura_simulator::quint::action_registry::ActionRegistry;

let mut registry = ActionRegistry::new();
registry.register("increment", Box::new(IncrementHandler));

let result = registry.execute("increment", &params, &effects).await?;
}

This approach requires handlers that re-implement Quint logic. Prefer direct conformance testing for new protocols.

ITF Trace Handling

Loading Traces

#![allow(unused)]
fn main() {
use aura_simulator::quint::itf_loader::ITFLoader;

let trace = ITFLoader::load("trace.itf.json")?;
for state in &trace.states {
    let index = state.meta.index;
    let action = state.meta.action.as_deref();
    let picks = &state.meta.nondet_picks;
}
}

The loader parses ITF JSON into typed Rust structures.

Non-Deterministic Choices

ITF traces capture non-deterministic choices for reproducible replay.

{
  "#meta": { "index": 3, "nondet_picks": { "leader": "alice" } }
}

The simulator injects these choices into RandomEffects to ensure deterministic replay.
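The injection idea can be sketched with a scripted pick source. These types are illustrative, not the real RandomEffects handler: a replayed run consumes recorded picks in order, and any extra request signals divergence from the trace:

```rust
use std::collections::VecDeque;

struct ScriptedPicks {
    picks: VecDeque<String>,
}

impl ScriptedPicks {
    fn from_trace(picks: &[&str]) -> Self {
        Self { picks: picks.iter().map(|p| p.to_string()).collect() }
    }

    /// Returns the next recorded pick, or an error if the trace is exhausted.
    fn next_pick(&mut self) -> Result<String, &'static str> {
        self.picks
            .pop_front()
            .ok_or("trace exhausted: implementation made an extra choice")
    }
}

fn main() {
    let mut picks = ScriptedPicks::from_trace(&["alice", "bob"]);
    assert_eq!(picks.next_pick().unwrap(), "alice");
    assert_eq!(picks.next_pick().unwrap(), "bob");
    // A third request would mean the implementation diverged from the trace.
    assert!(picks.next_pick().is_err());
}
```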

State Mapping

Types implementing QuintMappable convert between Quint and Rust representations.

#![allow(unused)]
fn main() {
use aura_core::effects::quint::QuintMappable;

let rust_state = State::from_quint(&quint_value)?;
let quint_value = rust_state.to_quint();
}

Bidirectional mapping enables state comparison during replay.
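The essential property of such a mapping is that it round-trips. A minimal sketch, using a plain tuple as a stand-in for a Quint value rather than the real QuintMappable trait:

```rust
#[derive(Debug, Clone, PartialEq)]
struct State {
    phase: String,
    value: i64,
}

// Stand-in for a Quint value: a simple (phase, value) pair.
fn to_quint(s: &State) -> (String, i64) {
    (s.phase.clone(), s.value)
}

fn from_quint(q: &(String, i64)) -> State {
    State { phase: q.0.clone(), value: q.1 }
}

fn main() {
    // from_quint(to_quint(x)) == x makes replay-time comparison meaningful.
    let s = State { phase: "Active".into(), value: 5 };
    assert_eq!(from_quint(&to_quint(&s)), s);
}
```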

Feeding Conformance Corpus

MBT traces should feed conformance testing.

Deriving Seeds

AURA_CONFORMANCE_ITF_TRACE=trace.itf.json cargo test conformance

ITF traces become inputs for native/WASM conformance lanes.

Coupling Model to Corpus

When Quint models change, regenerate traces and update the conformance corpus. This couples model evolution to test coverage.

quint run --out-itf=artifacts/traces/new_trace.itf.json verification/quint/updated_spec.qnt
just ci-conformance

See Testing Guide for corpus policy details.

The main repository policy gate for shared-flow drift is:

just ci-shared-flow-policy

This complements Quint and simulator checks by enforcing that shared real-runtime scenarios still use the semantic contract and that the shared-flow support map in aura-app remains consistent with the harness/frontend surfaces.

Telltale Verification Workflow in Aura

Use this workflow for choreography and simulator-level verification that depends on Telltale-derived checks.

1) Choreography compatibility gate (CI/tooling)

Run:

nix develop --command scripts/check/protocol-compat.sh --self-test
nix develop --command just ci-protocol-compat

This validates that known-compatible fixture pairs pass async subtyping checks. It confirms known-breaking fixture pairs fail as expected. It ensures changed .tell files stay backward-compatible unless intentionally breaking.

Fixtures live in crates/aura-testkit/fixtures/protocol_compat/.

2) Macro-time coherence gate

Run:

nix develop --command cargo test -p aura-macros

This enforces compile-time coherence validation for choreographies, including negative compile-fail coverage.

3) Simulator invariant monitoring under injected faults

Run:

nix develop --command cargo test -p aura-simulator --test fault_invariant_monitor

This verifies that injected faults produce monitor-visible invariant violations (for example, NoFaults violations), and that a gate configured to require zero violations fails accordingly.

4) telltale-bridge integration status

As of April 1, 2026, Aura includes telltale-bridge as a workspace dependency and integrates it through aura-quint on the Telltale 10.0.0 public bridge/runtime model.

This adds direct access to upstream Lean runner and equivalence utilities from the Telltale project. It provides explicit schema and version linkage with upstream bridge contracts for cross-tool consistency. It also keeps Aura verification aligned with the same public capability/finalization/runtime-upgrade model used by production runtime code rather than a private compatibility layer. Read upstream docs/38_capability_model.md before debugging capability-driven admission, authoritative-read/finalization behavior, or runtime-upgrade proofs.

Call aura_quint::upstream_telltale_bridge_schema_version() to get the upstream bridge schema version. CI lanes are just ci-telltale-bridge and just ci-simulator-telltale-parity.

5) aura-testkit Lean verification API migration (March 5, 2026)

The canonical Lean API types are specified in Formal Verification Reference.

Import Lean verification payload types from aura_testkit::verification, which re-exports from lean_types. Construct structured journals with LeanJournal and LeanNamespace. Update tests to compare LeanTimestampOrdering values directly.

Telltale Bridge Cross-Validation

The bridge connects Quint model checking with Telltale and Lean proof artifacts. It enables exporting Quint session models to a stable interchange format, importing Telltale and Lean properties back into Quint harnesses, and running cross-validation to detect divergence early in CI.

Operator Workflow

Run the bridge lane:

just ci-telltale-bridge

Inspect outputs at artifacts/telltale-bridge/bridge.log, artifacts/telltale-bridge/bridge_discrepancy_report.json, and artifacts/telltale-bridge/report.json.

Run the simulator telltale parity lane:

just ci-simulator-telltale-parity

Inspect output at artifacts/telltale-parity/report.json.

Data Contract

The bridge data contract is specified in Formal Verification Reference.

Export Workflow

Export moves session models from Quint to Telltale format.

  1. Parse Quint JSON IR with parse_quint_modules(...)
  2. Build the bridge bundle with export_quint_to_telltale_bundle(...)
  3. Validate structural correctness with validate_export_bundle(...)

Import Workflow

Import brings Telltale and Lean properties back into Quint harnesses.

  1. Select importable properties with parse_telltale_properties(...)
  2. Generate Quint invariant module text with generate_quint_invariant_module(...)
  3. Map certificates into Quint assertion comments with map_certificates_to_quint_assertions(...)

Cross-Validation Workflow

Cross-validation detects proof and model divergence. Use run_cross_validation(...) from aura-quint to execute Quint checks through a QuintModelCheckExecutor, compare outcomes to bridge proof certificates, and emit a CrossValidationReport with explicit discrepancy entries.

Run cross-validation in CI:

just ci-telltale-bridge

This command produces artifacts under artifacts/telltale-bridge/ including bridge.log and report.json.

Handling Discrepancies

When cross-validation reports discrepancies, follow these steps.

  1. Confirm the property identity mapping (property_id) between model and proof pipelines.
  2. Re-run the failing property in Quint and capture the trace or counterexample.
  3. Re-check proof certificate assumptions against the current protocol model.
  4. Do not merge until the mismatch is resolved or explicitly justified.

For telltale parity mismatches, read comparison_classification, first_mismatch_surface, and first_mismatch_step_index first. Re-run the failing lane with the same scenario and seed. Confirm that required surfaces (observable, scheduler_step, effect) were captured before examining envelope differences.

Lean Proof Development

Adding Theorems

Create or extend modules in verification/lean/Aura/Proofs/.

theorem new_property : ∀ s : State, isValid s → preservesInvariant s := by
  intro s h
  simp [isValid, preservesInvariant] at *
  exact h

Use Lean 4 tactic mode for proofs.

Using Claims Bundles

Access related theorems through claims bundles.

import Aura.Proofs.Consensus

#check Aura.Consensus.Validity.validityClaims.commit_has_threshold

Bundles organize proofs by domain.

Working with Axioms

Cryptographic assumptions appear in Assumptions.lean.

axiom frost_threshold_unforgeability : ...

Proofs depending on axioms are sound under standard hardness assumptions. Document axiom dependencies clearly.

Building Proofs

cd verification/lean
lake build

The build succeeds only if all proofs complete without sorry.

Checking Status

just lean-status

This reports per-module proof status including incomplete proofs.

Running Verification

Quint Commands

quint typecheck spec.qnt           # Type check
quint run --main=harness spec.qnt  # Simulate
quint run --invariant=inv spec.qnt # Check invariant
quint verify --max-steps=10 spec.qnt # Model check

Lean Commands

just verify-lean       # Build proofs
just lean-status       # Check status
just test-differential # Rust vs Lean tests

Full Verification

just verify-all

This runs Quint model checking, Lean proof building, and conformance tests.

Best Practices

Start with invariants. Define properties before implementing actions. Clear invariants guide design.

Use unique variant names. Quint requires globally unique sum type variants. Prefix with domain names.

Test harnesses separately. Verify harness modules parse before integrating with the simulator.

Start with short traces. Debug action mappings with 3-5 step traces before exhaustive exploration.

Isolate properties. Test one property at a time during development. Combine for coverage testing.

Adding or Updating Invariants

When adding or modifying invariants, follow this workflow to maintain traceability across docs, tests, and proofs.

  1. Add or update the invariant under ## Invariants in the crate's ARCHITECTURE.md.
  2. Add a detailed specification section in the same file with invariant name, enforcement locus, failure mode, and verification hooks.
  3. Use canonical InvariantXxx naming for traceability across docs, tests, and proofs.
  4. Add or update tests and simulator scenarios that detect violations.
  5. Update the traceability matrix in Project Structure if the invariant is cross-crate or contract-level.

Formal and model checks should reference the same canonical names listed in the traceability matrix.

Quint-Lean Correspondence

The Quint-Lean correspondence mapping is maintained in Formal Verification Reference.

See Formal Verification Reference for architecture details. See Simulation Guide for trace replay. See Testing Guide for conformance testing. See Project Structure for the invariant index and traceability matrix.

System Internals Guide

This guide covers deep system patterns for contributors working on Aura core. Use it when you need to understand guard chain internals, service layer patterns, core types, and reactive scheduling.

1. Guard Chain Internals

The guard chain coordinates authorization, flow budgets, and journal effects in strict sequence. See Authorization for the full specification.

Three-Phase Pattern

Guards are pure: evaluation runs synchronously over a prepared GuardSnapshot and yields EffectCommand items that an async interpreter executes.

#![allow(unused)]
fn main() {
// Phase 1: Authorization via Biscuit + policy (async, cached)
let token = effects.verify_biscuit(&request.token).await?;
let capabilities = evaluate_candidate_frontier(
    &token,
    evaluation_candidates_for_chat_guard(),
    &policy,
)?;

// Phase 2: Prepare snapshot and evaluate guards (sync)
let snapshot = GuardSnapshot {
    capabilities,
    flow_budget: current_budget,
    leakage_budget: current_leakage,
    ..Default::default()
};

let commands = guard_chain.evaluate(&snapshot, &request)?;

// Phase 3: Execute commands (async)
for command in commands {
    match command {
        EffectCommand::ChargeBudget { cost } => {
            budget_handler.charge(cost).await?;
        }
        EffectCommand::RecordLeakage { budget } => {
            leakage_handler.record(budget).await?;
        }
        EffectCommand::CommitJournal { facts } => {
            journal_handler.commit(facts).await?;
        }
        EffectCommand::SendTransport { message } => {
            transport_handler.send(message).await?;
        }
    }
}
}

Nothing observable on the transport occurs until the interpreter has executed the preceding commands in order.

Guard Chain Sequence

The guards execute in this order:

  1. CapabilityGuard validates the evaluated Biscuit and policy frontier.
  2. FlowBudgetGuard checks and charges flow budget.
  3. LeakageTracker records privacy leakage.
  4. JournalCoupler commits facts to journal.
  5. TransportEffects sends messages.
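The ordering guarantee can be sketched as a tiny interpreter. The types here are illustrative stand-ins, not the real aura-guards implementation: the point is that a transport send is rejected unless the journal commit has already executed:

```rust
#[derive(Debug, PartialEq)]
enum EffectCommand {
    ChargeBudget(u64),
    CommitJournal,
    SendTransport,
}

fn execute(commands: &[EffectCommand]) -> Result<Vec<&'static str>, &'static str> {
    let mut journal_committed = false;
    let mut log = Vec::new();
    for cmd in commands {
        match cmd {
            EffectCommand::ChargeBudget(_) => log.push("charged"),
            EffectCommand::CommitJournal => {
                journal_committed = true;
                log.push("committed");
            }
            EffectCommand::SendTransport => {
                // Enforce the sequence: no transport before the journal commit.
                if !journal_committed {
                    return Err("transport before journal commit");
                }
                log.push("sent");
            }
        }
    }
    Ok(log)
}

fn main() {
    let ok = [
        EffectCommand::ChargeBudget(1),
        EffectCommand::CommitJournal,
        EffectCommand::SendTransport,
    ];
    assert_eq!(execute(&ok).unwrap(), vec!["charged", "committed", "sent"]);

    let bad = [EffectCommand::SendTransport];
    assert!(execute(&bad).is_err());
}
```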

Security Patterns

Privacy budget enforcement supports three modes.

#![allow(unused)]
fn main() {
// Secure default: denies undefined budgets
let tracker = LeakageTracker::new();

// Backward compatibility: allows undefined budgets
let tracker = LeakageTracker::legacy_permissive();

// Configurable default
let tracker = LeakageTracker::with_undefined_policy(DefaultBudget(1000));
}
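The three modes amount to a policy for resolving undefined budgets. A hedged sketch with illustrative types, not the real LeakageTracker internals:

```rust
#[derive(Clone, Copy)]
enum UndefinedPolicy {
    Deny,             // secure default
    AllowUnlimited,   // legacy permissive mode
    DefaultBudget(u64), // configurable fallback
}

fn effective_budget(
    declared: Option<u64>,
    policy: UndefinedPolicy,
) -> Result<u64, &'static str> {
    match (declared, policy) {
        (Some(b), _) => Ok(b),
        (None, UndefinedPolicy::Deny) => Err("undefined budget denied"),
        (None, UndefinedPolicy::AllowUnlimited) => Ok(u64::MAX),
        (None, UndefinedPolicy::DefaultBudget(d)) => Ok(d),
    }
}

fn main() {
    assert!(effective_budget(None, UndefinedPolicy::Deny).is_err());
    assert_eq!(effective_budget(None, UndefinedPolicy::DefaultBudget(1000)), Ok(1000));
    assert_eq!(effective_budget(Some(42), UndefinedPolicy::Deny), Ok(42));
}
```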

2. Service Layer Patterns

Domain crates define stateless handlers that take effect references per-call. The agent layer wraps these with services that manage RwLock access.

Domain Handler (Layer 2-5)

#![allow(unused)]
fn main() {
// In domain crate (e.g., aura-chat/src/service.rs)
pub struct ChatFactService;

impl ChatFactService {
    pub fn new() -> Self { Self }

    pub async fn send_message<E>(
        &self,
        effects: &E,
        channel_id: ChannelId,
        content: String,
    ) -> Result<MessageId>
    where
        E: StorageEffects + RandomEffects + PhysicalTimeEffects
    {
        let message_id = effects.random_uuid().await;
        let timestamp = effects.physical_time().await?;
        // ... domain logic using effects
        Ok(message_id)
    }
}
}

Agent Service Wrapper (Layer 6)

#![allow(unused)]
fn main() {
// In aura-agent/src/handlers/chat_service.rs
pub struct ChatService {
    handler: ChatFactService,
    effects: Arc<RwLock<AuraEffectSystem>>,
}

impl ChatService {
    pub fn new(effects: Arc<RwLock<AuraEffectSystem>>) -> Self {
        Self { handler: ChatFactService::new(), effects }
    }

    pub async fn send_message(
        &self,
        channel_id: ChannelId,
        content: String,
    ) -> AgentResult<MessageId> {
        let effects = self.effects.read().await;
        self.handler.send_message(&*effects, channel_id, content)
            .await
            .map_err(Into::into)
    }
}
}

Agent API Exposure

#![allow(unused)]
fn main() {
// In aura-agent/src/core/api.rs
impl AuraAgent {
    pub fn chat_service(&self) -> ChatService {
        ChatService::new(self.runtime.effects())
    }
}
}

This pattern keeps the domain crate pure without Tokio-specific locking or runtime coupling. Domain logic is testable with mock effects. The pattern is consistent across crates.
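A minimal sketch of why this is testable (the TimeEffects trait and the types here are illustrative stand-ins for the real effect traits): the stateless handler is generic over effects, so a test can inject a fixed mock:

```rust
trait TimeEffects {
    fn now_ms(&self) -> u64;
}

struct MockTime {
    fixed: u64,
}

impl TimeEffects for MockTime {
    fn now_ms(&self) -> u64 {
        self.fixed
    }
}

struct FactService;

impl FactService {
    // The handler takes effects per-call, so it holds no runtime state.
    fn stamp_message<E: TimeEffects>(&self, effects: &E, body: &str) -> (u64, String) {
        (effects.now_ms(), body.to_string())
    }
}

fn main() {
    let service = FactService;
    let effects = MockTime { fixed: 1_700_000_000_000 };
    let (ts, body) = service.stamp_message(&effects, "hello");
    assert_eq!(ts, 1_700_000_000_000);
    assert_eq!(body, "hello");
}
```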

Core + Orchestrator Rule

The Core + Orchestrator Rule is defined in System Architecture. Layer 4 crates split logic into pure core modules and effectful orchestrator modules.

3. Type Reference

ProtocolType

Canonical definition in aura-core. All crates re-export this definition.

#![allow(unused)]
fn main() {
pub enum ProtocolType {
    Dkd,        // Deterministic Key Derivation
    Counter,    // Counter reservation protocol
    Resharing,  // Key resharing for threshold updates
    Locking,    // Resource locking protocol
    Recovery,   // Account recovery protocol
    Compaction, // Ledger compaction protocol
}
}

SessionStatus

Lifecycle order:

  1. Initializing - Session is being set up
  2. Active - Session is executing
  3. Waiting - Waiting for participant responses
  4. Completed - Completed successfully
  5. Failed - Failed with an error
  6. Expired - Expired due to timeout
  7. TimedOut - Timed out during execution
  8. Cancelled - Explicitly cancelled
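
As a sketch of how consumers might use the lifecycle split, the stand-in enum below adds a hypothetical is_terminal() helper separating states 4-8 from the live states; the real SessionStatus may expose different helpers:

```rust
// Illustrative stand-in for the project's SessionStatus enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SessionStatus {
    Initializing,
    Active,
    Waiting,
    Completed,
    Failed,
    Expired,
    TimedOut,
    Cancelled,
}

impl SessionStatus {
    /// States 4-8 are terminal: no further transitions are expected.
    fn is_terminal(self) -> bool {
        matches!(
            self,
            SessionStatus::Completed
                | SessionStatus::Failed
                | SessionStatus::Expired
                | SessionStatus::TimedOut
                | SessionStatus::Cancelled
        )
    }
}

fn main() {
    assert!(!SessionStatus::Waiting.is_terminal());
    assert!(SessionStatus::Cancelled.is_terminal());
}
```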

TimeStamp Domains

The time domain system is specified in Effect System. See that document for domain definitions, effect trait mappings, and usage constraints.

Capability System Layering

The capability system uses multiple layers:

  • Canonical types in aura-core provide validated CapabilityName.
  • Owning families in feature and domain crates provide typed first-party capability declarations.
  • The authorization layer in aura-authorization handles explicit issuance profiles along with Biscuit and policy evaluation.
  • Guard snapshots in aura-guards plus runtime handlers carry evaluated frontiers only.
  • The storage layer in aura-store provides capability-based access control.

Clear conversion paths enable inter-layer communication.
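
A hedged sketch of the canonical-types layer: the CapabilityName newtype below validates a lower-case, colon-separated segment grammar. This is a simplified stand-in and may differ from the actual validation rules in aura-core:

```rust
// Illustrative stand-in for aura-core's validated CapabilityName.
#[derive(Debug, Clone, PartialEq, Eq)]
struct CapabilityName(String);

impl CapabilityName {
    /// Accept only non-empty, lower-case, colon-separated segments.
    fn parse(raw: &str) -> Result<Self, String> {
        let valid = !raw.is_empty()
            && raw.split(':').all(|seg| {
                !seg.is_empty()
                    && seg
                        .chars()
                        .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_')
            });
        if valid {
            Ok(CapabilityName(raw.to_string()))
        } else {
            Err(format!("invalid capability name: {raw}"))
        }
    }

    fn as_str(&self) -> &str {
        &self.0
    }
}

fn main() {
    assert!(CapabilityName::parse("chat:message:send").is_ok());
    assert!(CapabilityName::parse("Chat:Send").is_err()); // upper case rejected
    assert!(CapabilityName::parse("chat::send").is_err()); // empty segment rejected
}
```

Validating once at the boundary lets the owning families, authorization layer, and guard snapshots all exchange the same proven-valid type instead of raw strings.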

4. Reactive Scheduling

The ReactiveScheduler in aura-agent/src/reactive/ processes journal facts and emits application signals.

Signal System Overview

#![allow(unused)]
fn main() {
// Application signals
pub const CHAT_SIGNAL: &str = "chat";
pub const CONTACTS_SIGNAL: &str = "contacts";
pub const CHANNELS_SIGNAL: &str = "channels";
pub const RECOVERY_SIGNAL: &str = "recovery";
}

The scheduler:

  1. Subscribes to journal fact streams
  2. Reduces facts to view state
  3. Emits signals when state changes
  4. TUI/CLI components subscribe to signals
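
The reduce-then-emit loop above can be sketched synchronously. Fact, ViewState, and reduce below are illustrative stand-ins, not the real reactive types:

```rust
// Simplified stand-ins for journal facts and derived view state.
#[derive(Debug)]
enum Fact {
    MessageAdded(String),
}

#[derive(Debug, Default, PartialEq)]
struct ViewState {
    message_count: usize,
}

/// Reduce a batch of facts into view state; report whether it changed.
fn reduce(state: &mut ViewState, facts: &[Fact]) -> bool {
    let before = state.message_count;
    for fact in facts {
        match fact {
            Fact::MessageAdded(_) => state.message_count += 1,
        }
    }
    state.message_count != before
}

fn main() {
    let mut state = ViewState::default();
    let facts = vec![Fact::MessageAdded("hi".into())];
    // Emit the signal only when reduction actually changed the view.
    if reduce(&mut state, &facts) {
        println!("emit signal: chat");
    }
    assert_eq!(state.message_count, 1);
}
```

The key property is that signals fire only on actual state change, so subscribers are not woken by facts that reduce to a no-op.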

TUI Reactive State

The TUI uses futures-signals for fine-grained reactive state management.

The reactive architecture pattern below represents the target design for TUI state management. Implementation status varies by view.

Signal Types

#![allow(unused)]
fn main() {
use futures_signals::signal::Mutable;
use futures_signals::signal_vec::MutableVec;

// Single reactive value
let count = Mutable::new(0);
count.set(5);
let value = count.get_cloned();

// Reactive collection
let items = MutableVec::new();
items.lock_mut().push_cloned("item1");
}

View Pattern

#![allow(unused)]
fn main() {
pub struct ChatView {
    channels: MutableVec<Channel>,
    messages: MutableVec<Message>,
    selected_channel: Mutable<Option<String>>,
}

impl ChatView {
    // Synchronous delta application
    pub fn apply_delta(&self, delta: ChatDelta) {
        match delta {
            ChatDelta::ChannelAdded { channel } => {
                self.channels.lock_mut().push_cloned(channel);
                // Signals automatically notify subscribers
            }
            ChatDelta::MessageReceived { channel_id, message } => {
                if self.selected_channel.get_cloned() == Some(channel_id) {
                    self.messages.lock_mut().push_cloned(message);
                }
            }
        }
    }
}
}

Best Practices

  • Delta application should be synchronous (not async)
  • Use .get_cloned() for reading, .set() for mutations
  • Never hold lock guards across await points
  • Use derived signals for computed values

5. Policy Compliance

Application code must follow policies defined in Project Structure.

Impure Function Usage

All time, randomness, filesystem, and network operations must flow through effect traits.

Direct system calls are forbidden because they break simulation and WASM compatibility.

#![allow(unused)]
fn main() {
// Forbidden: direct system calls
let now = SystemTime::now();
let random = thread_rng().gen();
let file = File::open("path")?;
}

Use effect traits instead.

#![allow(unused)]
fn main() {
// Use effect traits
let now = effects.physical_time().await?;
let random = effects.random_bytes(32).await?;
let data = effects.read_storage("key").await?;
}

Serialization

  • Wire protocols and facts: DAG-CBOR via aura_core::util::serialization
  • User-facing configs: JSON allowed
  • Debug output: JSON allowed

Architectural Validation

Run just check-arch before submitting changes. The checker validates:

  • Layer boundaries
  • Effect trait placement
  • Impure function routing
  • Guard chain integrity

6. Architecture Compliance Checklist

  • Layer dependencies flow downward only
  • Effect traits defined in aura-core only
  • Infrastructure effects implemented in aura-effects
  • Application effects in domain crates
  • No direct impure function usage outside effect implementations
  • All async functions propagate EffectContext
  • Production handlers are stateless, test handlers in aura-testkit
  • Guard chain sequence respected

Workflow Error Types

Workflow operations in aura-app use WorkflowError (aura-app::workflows::error) for typed error propagation. The enum provides structured variants for common failure modes:

  • RuntimeUnavailable: runtime bridge not initialized
  • RuntimeCall { operation, source }: a named runtime bridge call failed
  • ConnectivityRequired: peer connectivity prerequisite not met
  • Journal { operation, source }: journal load/merge/persist failure
  • FactEncoding { source }: fact serialization failure
  • Ceremony { operation, source }: ceremony lifecycle failure
  • DeliveryFailed { peer, attempts, detail }: transport delivery exhausted retries
  • Precondition: static invariant violation

From<WorkflowError> for AuraError enables workflows to keep Result<T, AuraError> signatures while constructing typed errors internally.
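
A minimal sketch of this conversion pattern, with trimmed variants and a stand-in AuraError rather than the real project types:

```rust
// Trimmed stand-in for aura-app's WorkflowError.
#[derive(Debug)]
enum WorkflowError {
    RuntimeUnavailable,
    ConnectivityRequired,
    Precondition(String),
}

// Stand-in for the project-wide AuraError.
#[derive(Debug)]
struct AuraError(String);

impl From<WorkflowError> for AuraError {
    fn from(err: WorkflowError) -> Self {
        AuraError(format!("workflow error: {err:?}"))
    }
}

/// Workflows construct typed errors internally but expose AuraError.
fn start_ceremony(runtime_ready: bool) -> Result<(), AuraError> {
    if !runtime_ready {
        // `.into()` (or `?` on a WorkflowError result) uses the From impl.
        return Err(WorkflowError::RuntimeUnavailable.into());
    }
    Ok(())
}

fn main() {
    assert!(start_ceremony(true).is_ok());
    let err = start_ceremony(false).unwrap_err();
    assert!(err.0.contains("RuntimeUnavailable"));
}
```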

7. Instrumentation Contract

The instrumentation contract is specified in Runtime. All long-lived services must emit the required event families defined there.

Distributed Maintenance Guide

This guide covers practical workflows for the Maintenance and OTA (Over-the-Air) update system in Aura. Use it for snapshots, cache invalidation, and distributed upgrades.

For the maintenance architecture specification, see Distributed Maintenance Architecture.

Maintenance Philosophy

Maintenance architectural principles are defined in Distributed Maintenance Architecture.

The system supports snapshots for garbage collection, cache management, and both soft and hard fork upgrades.

Maintenance Events

The maintenance service publishes events to the journal as facts. These events are replicated across all replicas and interpreted deterministically.

Event Types

Maintenance event types are defined in Distributed Maintenance Architecture.

Snapshot Protocol

The snapshot protocol coordinates garbage collection with threshold approval. It implements writer fencing to ensure consistent snapshot capture across all devices.

Snapshot Workflow

The snapshot process follows five steps:

  1. Propose snapshot at target epoch with state digest
  2. Activate writer fence to block concurrent writes
  3. Capture state and verify digest
  4. Collect M-of-N threshold approvals
  5. Commit snapshot and clean obsolete facts

Basic Snapshot Operation

#![allow(unused)]
fn main() {
use aura_sync::services::{MaintenanceService, MaintenanceServiceConfig};
use aura_core::{Epoch, Hash32};

async fn propose_snapshot(
    service: &MaintenanceService,
    authority_id: aura_core::AuthorityId,
    target_epoch: Epoch,
    state_digest: Hash32,
) -> Result<(), Box<dyn std::error::Error>> {
    // Propose snapshot at target epoch
    service
        .propose_snapshot(authority_id, target_epoch, state_digest)
        .await?;
    
    // Writer fence is now active - all concurrent writes blocked
    // Collect approvals from M-of-N authorities
    
    // Once threshold reached, commit
    service.commit_snapshot().await?;
    
    Ok(())
}
}

Snapshots provide deterministic checkpoints of authority state at specific epochs. This enables garbage collection of obsolete facts while maintaining verifiable state recovery.

Snapshot Proposal

#![allow(unused)]
fn main() {
use aura_sync::services::MaintenanceService;
use aura_core::{AuthorityId, Epoch, Hash32};

async fn snapshot_workflow(
    service: &MaintenanceService,
    authority_id: AuthorityId,
) -> Result<(), Box<dyn std::error::Error>> {
    // Determine target epoch and compute state digest
    let target_epoch = 100;
    let current_state = service.get_current_state().await?;
    let state_digest = current_state.compute_digest();
    
    // Propose snapshot
    service
        .propose_snapshot(authority_id, Epoch::new(target_epoch), state_digest)
        .await?;
    
    // Wait for other authorities to activate fence
    // and collect approvals
    
    Ok(())
}
}

Proposals include the proposer authority identifier, unique proposal ID, target epoch, and canonical state digest. All participants must agree on the digest before committing.

Writer Fence

The writer fence blocks all writes during snapshot capture. This prevents concurrent modifications that could invalidate the snapshot digest.

#![allow(unused)]
fn main() {
use aura_sync::services::MaintenanceService;

async fn capture_with_fence(
    service: &MaintenanceService,
) -> Result<(), Box<dyn std::error::Error>> {
    // Writer fence is automatically activated by snapshot proposal
    // All write operations will be blocked or queued
    
    // Capture state atomically
    let snapshot = service.capture_snapshot().await?;
    
    // Once snapshot is committed, fence is released
    service.commit_snapshot().await?;
    
    Ok(())
}
}

Fence enforcement is implicit in snapshot proposal. The protocol guarantees no conflicting writes occur during snapshot capture.

Approval Collection

#![allow(unused)]
fn main() {
use aura_sync::services::MaintenanceService;

async fn collect_approvals(
    service: &MaintenanceService,
    threshold: usize,
) -> Result<(), Box<dyn std::error::Error>> {
    // Get pending snapshot proposals
    let proposals = service.pending_snapshots().await?;
    
    for proposal in proposals {
        // Verify state digest
        let current_state = service.get_current_state().await?;
        let digest = current_state.compute_digest();
        
        if digest == proposal.state_digest {
            // Approve the snapshot
            service.approve_snapshot(&proposal.proposal_id).await?;
        }
    }
    
    Ok(())
}
}

Each device verifies the state digest independently and approves if correct. The system collects approvals until threshold is reached.
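
The approval step can be sketched as a pure function. Approval, the shortened 4-byte digest, and threshold_met below are simplified stand-ins for the real protocol types:

```rust
// Illustrative stand-in for a threshold approval record.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Approval {
    authority: String,
    digest: [u8; 4], // real code would carry a 32-byte state digest
}

/// Count approvals matching the proposed digest, deduplicating
/// authorities, and check whether the M-of-N threshold is met.
fn threshold_met(approvals: &[Approval], expected: [u8; 4], m: usize) -> bool {
    let mut seen: Vec<&str> = Vec::new();
    for a in approvals {
        if a.digest == expected && !seen.contains(&a.authority.as_str()) {
            seen.push(&a.authority);
        }
    }
    seen.len() >= m
}

fn main() {
    let digest = [1, 2, 3, 4];
    let approvals = vec![
        Approval { authority: "a".into(), digest },
        Approval { authority: "b".into(), digest },
        Approval { authority: "a".into(), digest },             // duplicate, ignored
        Approval { authority: "c".into(), digest: [9, 9, 9, 9] }, // wrong digest, ignored
    ];
    assert!(threshold_met(&approvals, digest, 2));
    assert!(!threshold_met(&approvals, digest, 3));
}
```

Note that both duplicate authorities and mismatched digests are excluded from the count, so a single noisy device cannot inflate progress toward the threshold.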

Snapshot Commitment

#![allow(unused)]
fn main() {
use aura_sync::services::MaintenanceService;

async fn finalize_snapshot(
    service: &MaintenanceService,
) -> Result<(), Box<dyn std::error::Error>> {
    // Commit snapshot once threshold reached
    service.commit_snapshot().await?;
    
    // This records SnapshotCompleted fact to journal
    // All devices deterministically reduce this fact
    // Obsolete facts before this epoch can be garbage collected
    
    Ok(())
}
}

Commitment publishes SnapshotCompleted fact to the journal. All devices deterministically reduce this fact to the same relational state. Facts older than the snapshot epoch can then be safely discarded.
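
A sketch of the garbage-collection step that commitment enables, using illustrative stand-in types rather than the real journal structures:

```rust
// Illustrative stand-in for a journal fact tagged with its epoch.
#[derive(Debug, Clone, PartialEq)]
struct JournalFact {
    epoch: u64,
    payload: String,
}

/// After SnapshotCompleted is reduced, retain only facts at or after
/// the committed snapshot epoch; return how many were discarded.
fn prune_before(facts: &mut Vec<JournalFact>, snapshot_epoch: u64) -> usize {
    let before = facts.len();
    facts.retain(|f| f.epoch >= snapshot_epoch);
    before - facts.len()
}

fn main() {
    let mut journal = vec![
        JournalFact { epoch: 98, payload: "old".into() },
        JournalFact { epoch: 100, payload: "snapshot boundary".into() },
        JournalFact { epoch: 101, payload: "new".into() },
    ];
    let dropped = prune_before(&mut journal, 100);
    assert_eq!(dropped, 1);
    assert_eq!(journal.len(), 2);
}
```

Because every device reduces the same SnapshotCompleted fact, each one computes the same cutoff and prunes deterministically without further coordination.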

OTA (Over-the-Air) Upgrade Protocol

Aura OTA is split into two layers:

  • release distribution: manifests, artifacts, and build certificates spread through Aura storage and anti-entropy
  • scoped activation: each device, authority, context, or managed quorum decides when a staged release may activate

There is no authoritative cutover phase for the Aura network as a whole. Hard cutover is valid only inside a scope that actually has agreement or a legitimate fence.

Upgrade Types

The upgrade classification (soft fork vs hard fork) is defined in Distributed Maintenance Architecture.

Basic Upgrade Operation

#![allow(unused)]
fn main() {
use aura_maintenance::AuraReleaseActivationPolicy;
use aura_sync::services::{ActivationCandidate, OtaPolicyEvaluator};

async fn evaluate_activation(
    policy: &AuraReleaseActivationPolicy,
    candidate: &ActivationCandidate<'_>,
) {
    // Activation is local or scope-bound.
    // Discovery and sharing do not imply activation.
    // The evaluator checks trust, compatibility, staged artifacts,
    // health gates, threshold approval, and scope-owned fences.
    let _decision = OtaPolicyEvaluator::new().evaluate_activation(policy, candidate);
}
}

If policy enables it, suggested_activation_time_unix_ms acts only as a local "not before" hint against the local clock. It is advisory metadata, not a global synchronization fence.
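
A sketch of that advisory check; activation_deferred is a hypothetical helper, not a real API:

```rust
/// Returns true when local policy honors the hint and the local clock
/// has not yet reached it, meaning activation should wait. The hint
/// never blocks anything beyond this scope's own activation decision.
fn activation_deferred(
    honor_hint: bool,
    suggested_activation_time_unix_ms: Option<u64>,
    local_now_unix_ms: u64,
) -> bool {
    match suggested_activation_time_unix_ms {
        Some(not_before) if honor_hint => local_now_unix_ms < not_before,
        // No hint, or policy ignores it: never defer on this basis.
        _ => false,
    }
}

fn main() {
    assert!(activation_deferred(true, Some(2_000), 1_000)); // too early locally
    assert!(!activation_deferred(true, Some(2_000), 3_000)); // hint has passed
    assert!(!activation_deferred(false, Some(2_000), 1_000)); // policy ignores hint
    assert!(!activation_deferred(true, None, 1_000)); // no hint present
}
```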

Soft Fork Workflow

#![allow(unused)]
fn main() {
use aura_maintenance::AuraCompatibilityClass;
use aura_sync::services::{cutover_session_plan, InFlightIncompatibilityAction};

fn soft_fork_workflow() {
    let plan = cutover_session_plan(
        AuraCompatibilityClass::BackwardCompatible,
        InFlightIncompatibilityAction::Drain,
        false,
    );
    assert!(!plan.partition_required);
}
}

Soft forks do not require a globally shared instant. Each scope moves from legacy-only residency to coexistence and then to target-only residency based on its own evidence and policy.

Hard Fork Workflow

#![allow(unused)]
fn main() {
use aura_maintenance::AuraCompatibilityClass;
use aura_sync::services::{cutover_session_plan, InFlightIncompatibilityAction};

fn execute_managed_quorum_cutover() {
    let plan = cutover_session_plan(
        AuraCompatibilityClass::ScopedHardFork,
        InFlightIncompatibilityAction::Delegate,
        true,
    );
    assert!(plan.partition_required || plan.in_flight == InFlightIncompatibilityAction::Delegate);
}
}

For hard forks, the operator must define:

  • the activation scope
  • the compatibility class
  • how incompatible in-flight sessions are handled: drain, abort, or delegate
  • whether threshold approval or an epoch fence is actually available in that scope

If post-cutover checks fail, rollback is explicit and deterministic.

Managed Quorum Approval Runbook

Use managed quorum cutover only when the scope has an explicit participant set. Record approval from every participant in the quorum before starting cutover. Reject approval from authorities that are not members of that scope.

If one participant has not approved, keep the scope waiting for cutover evidence. Do not begin launcher activation for that scope. Resolve membership or policy disagreement before retrying.

Failed Rollout Runbook

Check the failure classification before acting. AuraUpgradeFailureClass::HealthGateFailed means the new release started and failed local verification. AuraUpgradeFailureClass::LauncherActivationFailed means the launcher handoff failed before healthy activation.

If policy uses AuraRollbackPreference::Automatic, allow the queued rollback to execute and confirm the scope returns to legacy-only residency with an idle transition state. If policy uses AuraRollbackPreference::ManualApproval, keep the scope failed and require operator approval before rollback.

Revoked Release Runbook

Treat a revoked staged release differently from a revoked active release. If the target release is only staged, cancel the staged scope and remove it from activation consideration. Do not proceed to cutover for that scope.

If the revoked release is already active, follow the configured rollback preference. Automatic rollback should queue a rollback to the prior staged release. Manual rollback should leave the scope failed until an operator approves the revert path.

Partition Response Runbook

If SessionCompatibilityPlan.partition_required is true, assume incompatible peers may separate cleanly rather than interoperate. Stop admitting incompatible new sessions in that scope. Drain, abort, or delegate in-flight sessions according to the recorded incompatibility action.

Record partition observations with the associated failure classification and scope. This keeps rollback and peer-partition handling auditable.

Cache Management

The maintenance system coordinates cache invalidation across all devices. Cache invalidation facts specify which keys must be refreshed and the epoch where the cache entry is no longer valid.

Cache Invalidation

#![allow(unused)]
fn main() {
use aura_sync::services::MaintenanceService;
use aura_core::Epoch;

async fn invalidate_cache(
    service: &MaintenanceService,
    keys: Vec<String>,
    epoch_floor: Epoch,
) -> Result<(), Box<dyn std::error::Error>> {
    // Publish cache invalidation fact
    service.invalidate_cache_keys(keys, epoch_floor).await?;
    
    // All devices receive fact through journal
    // Each device deterministically invalidates matching keys
    
    Ok(())
}
}

Cache invalidation facts are replicated through the journal. All devices apply the invalidation deterministically based on epoch and key matching.

Cache Query Patterns

#![allow(unused)]
fn main() {
use aura_sync::services::MaintenanceService;
use aura_core::Epoch;

async fn query_cache(
    service: &MaintenanceService,
    key: &str,
    epoch: Epoch,
) -> Result<Option<Vec<u8>>, Box<dyn std::error::Error>> {
    // Check if cache entry is valid at epoch
    if service.is_cache_valid(key, epoch).await? {
        service.get_cached(key).await
    } else {
        // Cache invalidated - fetch fresh
        let value = service.fetch_fresh(key).await?;
        service.cache_set(key, value.clone(), epoch).await?;
        Ok(Some(value))
    }
}
}

Cache validity is epoch-based. Entries are valid up to their invalidation epoch. After that, fresh data must be fetched and cached at the new epoch.
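
The epoch-based validity rule can be sketched with a small stand-in cache; EpochCache below is illustrative, not the aura-sync cache manager:

```rust
use std::collections::HashMap;

// Each entry records the epoch at which it was cached; a replicated
// invalidation fact sets a floor that marks earlier entries stale.
struct EpochCache {
    entries: HashMap<String, (Vec<u8>, u64)>, // key -> (value, cached_at_epoch)
    invalidation_floor: HashMap<String, u64>, // key -> first valid epoch
}

impl EpochCache {
    fn new() -> Self {
        Self { entries: HashMap::new(), invalidation_floor: HashMap::new() }
    }

    fn set(&mut self, key: &str, value: Vec<u8>, epoch: u64) {
        self.entries.insert(key.to_string(), (value, epoch));
    }

    /// Apply a replicated invalidation fact: entries cached before
    /// `epoch_floor` become stale.
    fn invalidate(&mut self, key: &str, epoch_floor: u64) {
        self.invalidation_floor.insert(key.to_string(), epoch_floor);
    }

    fn get(&self, key: &str) -> Option<&Vec<u8>> {
        let (value, cached_at) = self.entries.get(key)?;
        match self.invalidation_floor.get(key) {
            Some(floor) if cached_at < floor => None, // stale: fetch fresh
            _ => Some(value),
        }
    }
}

fn main() {
    let mut cache = EpochCache::new();
    cache.set("profile", b"v1".to_vec(), 10);
    assert!(cache.get("profile").is_some());
    cache.invalidate("profile", 12); // fact replicated through the journal
    assert!(cache.get("profile").is_none());
    cache.set("profile", b"v2".to_vec(), 12); // re-cached at the new epoch
    assert!(cache.get("profile").is_some());
}
```

Since every device applies the same invalidation fact against the same epochs, the stale/fresh decision is deterministic across replicas.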

Configuration and Best Practices

Service Configuration

#![allow(unused)]
fn main() {
use aura_sync::services::{MaintenanceService, MaintenanceServiceConfig};
use aura_sync::protocols::{SnapshotConfig, OTAConfig};
use std::time::Duration;

fn create_maintenance_service() -> Result<MaintenanceService, Box<dyn std::error::Error>> {
    let config = MaintenanceServiceConfig {
        snapshot: SnapshotConfig {
            proposal_timeout: Duration::from_secs(300),
            approval_timeout: Duration::from_secs(600),
            max_proposals: 10,
        },
        ota: OTAConfig {
            readiness_timeout: Duration::from_secs(3600),
            max_pending: 5,
            soft_fork_auto_activate: true,
        },
        cache: Default::default(),
    };
    
    Ok(MaintenanceService::new(config)?)
}
}

Configuration controls timeouts, limits, and behavior. Snapshot timeouts should be shorter than OTA staging and activation timeouts since snapshots are more frequent. OTA policies should separately configure discovery, sharing, and activation rather than bundling them into one setting.

Snapshot Best Practices

Keep snapshots frequent but not excessive. Snapshot every 100-500 epochs depending on journal size. Too frequent snapshots create overhead. Too infrequent snapshots reduce garbage collection effectiveness.

Always verify state digest before approving. Use canonical serialization for digest computation. Record all snapshots in the journal for audit trail.

Upgrade Best Practices

Plan upgrades carefully. Soft forks can be deployed flexibly inside one scope. Hard forks require a clear activation scope, an explicit compatibility class, and an operator decision about whether in-flight incompatible sessions drain, abort, or delegate.

Always test upgrades in simulation before deployment. Use threshold approval or epoch fences only in scopes that actually own those mechanisms. Treat suggested_activation_time_unix_ms as advisory rollout metadata, not as a coordination primitive.

Include rollback procedures for hard forks. Document migration paths for state format changes and keep launcher activation/rollback steps separate from the running runtime.

Cache Best Practices

Invalidate cache conservatively. Over-invalidation reduces performance. Under-invalidation risks stale data.

Use epoch floors to scope invalidation. Invalidate only keys that actually changed at that epoch.

Monitor cache hit rates. Low hit rates indicate invalidation is too aggressive.

Monitoring and Debugging

Snapshot Status

#![allow(unused)]
fn main() {
use aura_sync::services::MaintenanceService;

async fn monitor_snapshots(
    service: &MaintenanceService,
) -> Result<(), Box<dyn std::error::Error>> {
    let status = service.snapshot_status().await?;
    
    println!("Last snapshot: epoch {}", status.last_snapshot_epoch);
    println!("Pending proposals: {}", status.pending_proposals);
    println!("Writer fence active: {}", status.fence_active);
    
    Ok(())
}
}

Check snapshot status regularly to ensure the snapshot cycle is healthy. Long intervals between snapshots may indicate approval delays.

Upgrade Status

#![allow(unused)]
fn main() {
use aura_sync::services::MaintenanceService;

async fn monitor_upgrades(
    service: &MaintenanceService,
) -> Result<(), Box<dyn std::error::Error>> {
    let status = service.upgrade_status().await?;
    
    for upgrade in status.active_upgrades {
        println!("Upgrade: version {}", upgrade.version);
        println!("  Ready devices: {}/{}", upgrade.ready_count, upgrade.total_devices);
        println!("  Threshold: {}", upgrade.threshold);
    }
    
    Ok(())
}
}

Monitor upgrade progress to catch devices that are not ready. Missing devices may need manual intervention.

Integration with Choreography

The maintenance system integrates with the choreography runtime for threshold approval ceremonies. Snapshot and upgrade proposals are published to the journal where choreography protocols can coordinate approval.

The maintenance service publishes events as facts. Choreography protocols subscribe to these facts and coordinate the necessary approvals through their own message flows.

Summary

The Maintenance and OTA system provides coordinated maintenance operations with threshold approval and epoch fencing where those mechanisms actually exist. Snapshots enable garbage collection with writer fencing. OTA release distribution is eventual, while activation is local or scope-bound. Cache invalidation is replicated through the journal for consistency.

Use snapshots regularly for garbage collection. Plan upgrades carefully with sufficient notice for hard forks. Test all upgrades in simulation. Monitor snapshot and upgrade cycles to ensure system health.

Implementation References

  • Maintenance Service: aura-sync/src/services/maintenance.rs
  • Snapshot Protocol: aura-sync/src/protocols/snapshots.rs
  • OTA Protocol: aura-sync/src/protocols/ota.rs
  • Cache Management: aura-sync/src/infrastructure/cache_manager.rs
  • Integration Examples: aura-agent/src/handlers/maintenance.rs

Capability Vocabulary Inventory

This Phase 0 artifact inventories the current authorization capability strings, classifies their status, and records the canonical migration targets for the clean-cutover capability-vocabulary refactor.

Scope

This inventory covers authorization capability names used in:

  • product Rust call sites
  • Biscuit token issuance
  • guard snapshots and guard checks
  • choreography .tell files
  • docs and examples that currently teach or exercise capability annotations
  • test fixtures that still exercise legacy naming

This inventory does not treat the following as authorization capability names:

  • aura_core::effects::CapabilityKey runtime-admission keys
  • version-handshake feature flags such as ceremony_supersession
  • explicit negative-test placeholders such as unknown_capability
  • documentation placeholders such as capability_name
  • diagnostic labels such as bundle linking and session delegation
  • unrelated "capability" terminology in ownership or crypto docs

No out-of-tree module manifests currently exist in-tree. The module namespace rules below therefore reserve the future extension path, but no concrete module capabilities are currently admitted.

Reserved Namespaces

First-Party Authorization Namespaces

These namespace roots are reserved for first-party Aura capability families:

| Namespace | Owner crate | Notes |
| --- | --- | --- |
| amp | aura-amp | AMP message-flow capabilities |
| auth | aura-authentication | Authentication and guardian-auth capabilities |
| chat | aura-chat | Chat and channel message capabilities |
| consensus | aura-consensus | Consensus ceremony capabilities |
| dkd | aura-authentication | Distributed key derivation choreography capabilities |
| example | host-owned docs/examples namespace | Reserved for teaching examples and macro tests |
| invitation | aura-invitation | Invitation, guardian, channel, and device flows |
| recovery | aura-recovery | Recovery, guardian setup, membership change |
| relay | aura-rendezvous | Relay-forward subfamily |
| rendezvous | aura-rendezvous | Descriptor and rendezvous exchange |
| sync | aura-sync | Anti-entropy and epoch rotation |

Generic Host-Owned Capabilities

These names stay reserved by the host and are not owned by a feature crate:

  • read
  • write
  • execute
  • delegate
  • moderator
  • flow_charge

Reserved Module Namespace

Out-of-tree module-defined capabilities must use:

module:<module_id>:<capability_path>

Rules:

  • <module_id> is the admitted, host-reviewed module identity, not an arbitrary author-chosen prefix.
  • <capability_path> uses the same validated lower-case segment grammar as first-party names.
  • Modules may not claim first-party namespace roots such as invitation, consensus, or sync.
  • Modules may not claim generic host-owned names such as read or write.
  • Host runtime code must consume admitted descriptors, not hand-written module:<module_id>:... strings.
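
The rules above can be sketched as a validator. validate_module_capability is a hypothetical helper whose reserved-root lists mirror the tables in this section, not an existing API:

```rust
// Reserved roots from this inventory (illustrative copies).
const FIRST_PARTY_ROOTS: &[&str] = &[
    "amp", "auth", "chat", "consensus", "dkd", "example",
    "invitation", "recovery", "relay", "rendezvous", "sync",
];
const HOST_OWNED: &[&str] = &["read", "write", "execute", "delegate", "moderator", "flow_charge"];

fn valid_segment(seg: &str) -> bool {
    !seg.is_empty()
        && seg.chars().all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_')
}

/// Validate a module capability of the form module:<module_id>:<capability_path>.
fn validate_module_capability(name: &str) -> Result<(), String> {
    let mut parts = name.splitn(3, ':');
    match (parts.next(), parts.next(), parts.next()) {
        (Some("module"), Some(module_id), Some(path)) => {
            if !valid_segment(module_id) {
                return Err(format!("invalid module id: {module_id}"));
            }
            let segments: Vec<&str> = path.split(':').collect();
            if !segments.iter().all(|s| valid_segment(s)) {
                return Err(format!("invalid capability path: {path}"));
            }
            // Modules may not claim first-party or host-owned roots.
            let root = segments[0];
            if FIRST_PARTY_ROOTS.contains(&root) || HOST_OWNED.contains(&root) {
                return Err(format!("reserved root claimed by module: {root}"));
            }
            Ok(())
        }
        _ => Err(format!("not a module capability: {name}")),
    }
}

fn main() {
    assert!(validate_module_capability("module:weather:forecast:read_city").is_ok());
    assert!(validate_module_capability("module:weather:sync:push").is_err()); // first-party root
    assert!(validate_module_capability("module:weather:read").is_err()); // host-owned name
    assert!(validate_module_capability("chat:message:send").is_err()); // not module-scoped
}
```

In the real system this check would run against admitted module descriptors rather than raw strings, per the last rule above.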

Canonical First-Party Capability Inventory

These are the canonical migration targets for first-party product code.

Canonical nameOwner crateCurrent sourcesNotes
amp:sendaura-ampcrates/aura-authorization/src/biscuit_token.rs, crates/aura-agent/src/runtime/effects.rs, crates/aura-simulator/tests/guarded_amp_anti_entropy.rsCanonical AMP send capability
amp:receiveaura-ampcrates/aura-amp/src/choreography.tell currently uses cap:amp_recvCanonical AMP receive capability
auth:requestaura-authenticationcrates/aura-authentication/src/guards.rsCanonical authentication request capability
auth:submit_proofaura-authenticationcrates/aura-authentication/src/guards.rsCanonical proof-submission capability
auth:verifyaura-authenticationcrates/aura-authentication/src/guards.rsCanonical proof-verification capability
auth:create_sessionaura-authenticationcrates/aura-authentication/src/guards.rsAuthentication-owned session creation capability
auth:guardian:request_approvalaura-authenticationcrates/aura-authentication/src/guardian_auth_relational.tell, crates/aura-authentication/src/guards.rsCanonical guardian-auth request capability
auth:guardian:coordinateaura-authenticationcrates/aura-authentication/src/guardian_auth_relational.tellCoordinator-side guardian-auth capability
auth:guardian:submit_proofaura-authenticationcrates/aura-authentication/src/guardian_auth_relational.tellGuardian proof submission
auth:guardian:verifyaura-authenticationcrates/aura-authentication/src/guardian_auth_relational.tell, crates/aura-authentication/src/guards.rsCanonical guardian-auth verification capability
chat:channel:createaura-chatcrates/aura-chat/src/guards.rsCanonical chat channel-create capability
chat:message:sendaura-chatcrates/aura-chat/src/guards.rsCanonical chat send capability
consensus:initiateaura-consensuscrates/aura-consensus/src/protocol/guards.rsCanonical start-of-ceremony capability
consensus:witness_nonceaura-consensuscrates/aura-consensus/src/protocol/guards.rsWitness nonce submission
consensus:aggregate_noncesaura-consensuscrates/aura-consensus/src/protocol/guards.rsCoordinator aggregation capability
consensus:witness_signaura-consensuscrates/aura-consensus/src/protocol/guards.rsWitness sign-share submission
consensus:finalizeaura-consensuscrates/aura-consensus/src/protocol/guards.rsFinal consensus completion capability
dkd:initiateaura-authenticationcrates/aura-authentication/src/dkd.tellDKD initiation
dkd:commitaura-authenticationcrates/aura-authentication/src/dkd.tellDKD commitment
dkd:revealaura-authenticationcrates/aura-authentication/src/dkd.tellDKD reveal
dkd:finalizeaura-authenticationcrates/aura-authentication/src/dkd.tellDKD finalize
invitation:sendaura-invitationcrates/aura-invitation/src/guards.rs, crates/aura-invitation/src/protocol.rs, crates/aura-invitation/src/protocol.invitation_exchange.tell, token issuanceCanonical invitation send capability
invitation:acceptaura-invitationcrates/aura-invitation/src/guards.rs, crates/aura-invitation/src/protocol.rs, crates/aura-invitation/src/protocol.invitation_exchange.tell, token issuanceCanonical invitation accept capability
invitation:declineaura-invitationcrates/aura-invitation/src/guards.rs, crates/aura-invitation/src/protocol.rs, token issuanceCanonical invitation decline capability
invitation:cancelaura-invitationcrates/aura-invitation/src/guards.rs, token issuanceCanonical invitation cancel capability
invitation:guardianaura-invitationcrates/aura-invitation/src/guards.rs, crates/aura-invitation/src/protocol.rs, crates/aura-invitation/src/protocol.guardian_invitation.tell, token issuanceGuardian invitation send capability
invitation:guardian:acceptaura-invitationcrates/aura-invitation/src/protocol.rs, crates/aura-invitation/src/protocol.guardian_invitation.tellGuardian invitation accept capability
invitation:channelaura-invitationcrates/aura-invitation/src/guards.rs, token issuanceShared-channel invitation capability
invitation:device:enrollaura-invitationcrates/aura-invitation/src/protocol.rs, crates/aura-invitation/src/protocol.device_enrollment.tellDevice-enrollment send capability
invitation:device:acceptaura-invitationcrates/aura-invitation/src/protocol.rs, crates/aura-invitation/src/protocol.device_enrollment.tellDevice-enrollment accept capability
recovery:initiateaura-recoverycrates/aura-authentication/src/guards.rs, crates/aura-agent/src/handlers/recovery.rs, crates/aura-recovery/src/recovery_protocol.tellRecovery initiation
| recovery:coordinate | aura-recovery | crates/aura-recovery/src/recovery_protocol.tell | Recovery coordination capability |
| recovery:approve | aura-recovery | crates/aura-authentication/src/guards.rs, crates/aura-agent/src/handlers/recovery.rs, crates/aura-recovery/src/recovery_protocol.tell | Guardian approval capability |
| recovery:finalize | aura-recovery | crates/aura-agent/src/handlers/recovery.rs, crates/aura-recovery/src/recovery_protocol.tell | Canonical completion/finalization capability |
| recovery:cancel | aura-recovery | crates/aura-agent/src/handlers/recovery.rs | Recovery cancellation capability |
| recovery:guardian_setup:initiate | aura-recovery | crates/aura-recovery/src/guardian_setup.tell | Guardian setup initiation |
| recovery:guardian_setup:accept_invitation | aura-recovery | crates/aura-recovery/src/guardian_setup.tell | Guardian setup invitation acceptance |
| recovery:guardian_setup:verify_invitation | aura-recovery | crates/aura-recovery/src/guardian_setup.tell | Guardian setup verification |
| recovery:guardian_setup:complete | aura-recovery | crates/aura-recovery/src/guardian_setup.tell | Guardian setup completion |
| recovery:membership_change:initiate | aura-recovery | crates/aura-recovery/src/guardian_membership.tell | Membership-change initiation |
| recovery:membership_change:vote | aura-recovery | crates/aura-recovery/src/guardian_membership.tell | Guardian vote capability |
| recovery:membership_change:verify_proposal | aura-recovery | crates/aura-recovery/src/guardian_membership.tell | Proposal verification |
| recovery:membership_change:complete | aura-recovery | crates/aura-recovery/src/guardian_membership.tell | Membership-change completion |
| relay:forward | aura-rendezvous | crates/aura-rendezvous/src/protocol.rs, crates/aura-rendezvous/src/protocol.relayed_rendezvous.tell, docs/113_rendezvous.md | Relay forwarding subfamily |
| rendezvous:publish | aura-rendezvous | crates/aura-rendezvous/src/protocol.rs, crates/aura-rendezvous/src/protocol.rendezvous_exchange.tell, crates/aura-agent/src/handlers/rendezvous.rs, crates/aura-agent/src/runtime/services/rendezvous_manager.rs, token issuance, docs/113_rendezvous.md | Canonical descriptor publish capability |
| rendezvous:connect | aura-rendezvous | crates/aura-rendezvous/src/protocol.rs, crates/aura-rendezvous/src/protocol.rendezvous_exchange.tell, crates/aura-agent/src/handlers/rendezvous.rs, crates/aura-agent/src/runtime/services/rendezvous_manager.rs, docs/113_rendezvous.md | Canonical direct connect capability |
| rendezvous:relay | aura-rendezvous | crates/aura-rendezvous/src/protocol.rs, crates/aura-rendezvous/src/protocol.relayed_rendezvous.tell, crates/aura-agent/src/handlers/rendezvous.rs, docs/113_rendezvous.md | Canonical relayed connect capability |
| sync:request_digest | aura-sync | crates/aura-authorization/src/biscuit_token.rs, crates/aura-agent/src/runtime/effects.rs | Anti-entropy digest request capability |
| sync:request_ops | aura-sync | crates/aura-authorization/src/biscuit_token.rs, crates/aura-agent/src/runtime/effects.rs | Anti-entropy op request capability |
| sync:push_ops | aura-sync | crates/aura-authorization/src/biscuit_token.rs, crates/aura-agent/src/runtime/effects.rs | Anti-entropy batch push capability |
| sync:announce_op | aura-sync | crates/aura-authorization/src/biscuit_token.rs, crates/aura-agent/src/runtime/effects.rs | Anti-entropy announcement capability |
| sync:push_op | aura-sync | crates/aura-authorization/src/biscuit_token.rs, crates/aura-agent/src/runtime/effects.rs | Anti-entropy single-op push capability |
| sync:epoch:propose_rotation | aura-sync | crates/aura-sync/src/protocols/epochs.tell | Epoch rotation proposal |
| sync:epoch:confirm_readiness | aura-sync | crates/aura-sync/src/protocols/epochs.tell | Epoch rotation readiness confirmation |
| sync:epoch:commit_rotation | aura-sync | crates/aura-sync/src/protocols/epochs.tell | Epoch rotation commit |
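At a call site, each of these capability strings gates its operation through the guard chain before the operation executes. A minimal sketch of that pattern, modeled on the `has_capability("...")` shape the Phase 0 audit greps for (the `Token` struct and `authorize_digest_request` function here are illustrative, not the actual aura-authorization API):

```rust
/// Illustrative token shape: the real biscuit token carries capabilities as
/// attenuated facts, but the check at a call site reduces to a lookup like this.
struct Token {
    capabilities: Vec<String>,
}

impl Token {
    /// Hypothetical helper mirroring the `has_capability("...")` call sites
    /// that the audit commands in this document search for.
    fn has_capability(&self, name: &str) -> bool {
        self.capabilities.iter().any(|c| c == name)
    }
}

/// Guard a single anti-entropy operation behind its operation-specific
/// capability, rather than an umbrella name like the rejected `sync:read`.
fn authorize_digest_request(token: &Token) -> Result<(), String> {
    if token.has_capability("sync:request_digest") {
        Ok(())
    } else {
        Err("missing capability: sync:request_digest".to_string())
    }
}
```

The operation-specific name is the point: a token that can request digests cannot thereby push ops, which is why the inventory rejects umbrella capabilities.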

Legacy Aliases and Invalid Drift

These strings are present today but are not approved as a long-lived capability surface. They exist only as migration or deletion targets.

| Current string | Classification | Canonical target | Current sources | Disposition |
| --- | --- | --- | --- | --- |
| amp:send and cap:amp_send coexist | legacy split-brain naming | amp:send | crates/aura-simulator/tests/guarded_amp_anti_entropy.rs, crates/aura-amp/src/choreography.tell, token issuance | Keep amp:send; delete cap:amp_send |
| cap:amp_recv | legacy alias | amp:receive | crates/aura-amp/src/choreography.tell | Delete alias during Phase 4 |
| auth:request_guardian | legacy alias | auth:guardian:request_approval | crates/aura-authentication/src/guards.rs | Rename in typed family |
| auth:approve_guardian | legacy alias | auth:guardian:verify | crates/aura-authentication/src/guards.rs | Rename in typed family |
| auth:authenticate | invalid drift | auth:verify, or a new explicit auth:status if the owner decides status needs its own capability | crates/aura-agent/src/handlers/auth.rs | Phase 2/5 owner decision, then delete drift |
| initiate_consensus | legacy choreography alias | consensus:initiate | crates/aura-consensus/src/protocol/choreography.tell, crates/aura-consensus/src/protocol/guards.rs | Temporary parse bridge only if needed in Phase 4 |
| witness_nonce | legacy choreography alias | consensus:witness_nonce | same as above | Temporary parse bridge only if needed in Phase 4 |
| aggregate_nonces | legacy choreography alias | consensus:aggregate_nonces | same as above | Temporary parse bridge only if needed in Phase 4 |
| witness_sign | legacy choreography alias | consensus:witness_sign | same as above | Temporary parse bridge only if needed in Phase 4 |
| finalize_consensus | legacy choreography alias | consensus:finalize | same as above | Temporary parse bridge only if needed in Phase 4 |
| invitation:device | legacy umbrella name | split to invitation:device:enroll and invitation:device:accept | crates/aura-invitation/src/guards.rs, token issuance | Remove umbrella capability |
| message:send | legacy unowned namespace | chat:message:send | token issuance, crates/aura-agent/src/runtime/effects.rs, docs/tests in aura-guards, aura-mpst, aura-macros | Migrate examples/tests or move to example:*; product code uses chat:* |
| rendezvous:publish_descriptor | invalid drift | rendezvous:publish | crates/aura-agent/src/handlers/rendezvous.rs | Delete drift |
| rendezvous:initiate_channel | invalid drift | rendezvous:connect | crates/aura-agent/src/handlers/rendezvous.rs | Delete drift |
| rendezvous:relay_request | invalid drift | rendezvous:relay | crates/aura-agent/src/handlers/rendezvous.rs | Delete drift |
| recovery:complete | legacy alias | recovery:finalize | crates/aura-agent/src/handlers/recovery.rs | Rename to finalized vocabulary |
| accept_guardian_invitation,verify_setup_invitation | invalid composite choreography string | split to recovery:guardian_setup:accept_invitation and recovery:guardian_setup:verify_invitation | crates/aura-recovery/src/guardian_setup.tell | Delete comma-joined string syntax on this path |
| vote_membership_change,verify_membership_proposal | invalid composite choreography string | split to recovery:membership_change:vote and recovery:membership_change:verify_proposal | crates/aura-recovery/src/guardian_membership.tell | Delete comma-joined string syntax on this path |
| sync:read | invalid umbrella name | replace with an operation-specific sync:* capability per call site | crates/aura-sync/src/infrastructure/peers.rs | Delete umbrella capability |
| sync_journal | invalid legacy name | replace with an operation-specific sync:* capability per call site | crates/aura-sync/src/protocols/anti_entropy.rs, archived work notes | Delete legacy name |
| recover:device | invalid drift in test payload | owner should replace with a canonical recovery:* capability or a typed role field | crates/aura-invitation/src/protocol.rs test serialization | Do not preserve as compatibility alias |
| invitation:create | invalid test-only drift | delete or replace with a real invitation capability | crates/aura-core/src/ownership.rs test helper | Do not preserve |
| recovery_initiate | legacy test fixture alias | recovery:initiate | crates/aura-testkit/src/fixtures/biscuit.rs | Delete alias in fixture |
| recovery_approve | legacy test fixture alias | recovery:approve | crates/aura-testkit/src/fixtures/biscuit.rs | Delete alias in fixture |
| threshold_sign | invalid/unowned test fixture name | owner must replace with a canonical family or remove the fixture dependency | crates/aura-testkit/src/fixtures/biscuit.rs | Delete or replace |
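While the aliases above are being deleted, a migration pass can mechanically rewrite legacy strings to their canonical targets. A minimal sketch of such a shim, with alias pairs taken from the table above (the function `canonicalize_capability` is hypothetical; it is a migration aid, not an approved compatibility surface):

```rust
/// Map a legacy capability string to its canonical name, if one exists.
/// Returns None for strings that are already canonical or unknown.
/// Hypothetical one-shot migration shim; the pairs mirror the table above.
fn canonicalize_capability(name: &str) -> Option<&'static str> {
    match name {
        "cap:amp_send" => Some("amp:send"),
        "cap:amp_recv" => Some("amp:receive"),
        "auth:request_guardian" => Some("auth:guardian:request_approval"),
        "auth:approve_guardian" => Some("auth:guardian:verify"),
        "recovery:complete" => Some("recovery:finalize"),
        "recovery_initiate" => Some("recovery:initiate"),
        "recovery_approve" => Some("recovery:approve"),
        "initiate_consensus" => Some("consensus:initiate"),
        "finalize_consensus" => Some("consensus:finalize"),
        _ => None,
    }
}
```

Because the aliases are deletion targets rather than compatibility surfaces, a shim like this belongs in the rewrite tooling, not in the token-verification path.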

Choreography and Example Names That Must Become Namespaced

These current names are intentionally not approved as canonical product capabilities. They either move into an owned first-party namespace or into the reserved host-owned example:* namespace for teaching material.

| Current string(s) | Classification | Canonical target |
| --- | --- | --- |
| send_ping, send_pong, send_request, send_response, send_message, send, coordinate, coordinate_signing, participate_signing | docs/examples legacy placeholders | example:* names in docs, examples, macro tests, and MPST tests |
| create_session, join_session, decline_session, activate_session, broadcast_message, check_status, report_status, end_session | example-only session choreography names | example:* names unless the session protocol becomes a real first-party family |
| request_session, invite_participants, respond_session, create_session, notify_participants, reject_session_creation, notify_participants_failure | invalid unnamespaced internal choreography names | future owned session:* family if retained; otherwise delete |
| request_guardian_approval, coordinate_guardians, submit_guardian_proof, verify_guardian | legacy unnamespaced auth choreography names | auth:guardian:* family |
| initiate_recovery, approve_recovery, coordinate_recovery, finalize_recovery, initiate_guardian_setup, accept_guardian_invitation, verify_setup_invitation, complete_guardian_setup, initiate_membership_change, vote_membership_change, verify_membership_proposal, complete_membership_change | legacy unnamespaced recovery choreography names | recovery:* subfamilies |
| propose_epoch_rotation, confirm_epoch_readiness, commit_epoch_rotation | legacy unnamespaced sync choreography names | sync:epoch:* family |
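A lint can enforce the namespacing rule mechanically: canonical names are colon-separated families, so bare words and comma-joined composites both fail a simple structural check. A sketch, assuming segments are lowercase ASCII, digits, or underscores (the exact grammar is an owner decision; `is_namespaced_capability` is a hypothetical helper, not an existing registry API):

```rust
/// Check that a capability name is namespaced: at least two non-empty,
/// colon-separated segments of lowercase ASCII letters, digits, or underscores.
/// Unnamespaced words and comma-joined composite strings both fail.
/// Hypothetical lint; the authoritative grammar is owned by the registry.
fn is_namespaced_capability(name: &str) -> bool {
    let segments: Vec<&str> = name.split(':').collect();
    segments.len() >= 2
        && segments.iter().all(|seg| {
            !seg.is_empty()
                && seg
                    .chars()
                    .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_')
        })
}
```

Run against the strings in the table above, every entry in the "Current string(s)" column fails, which is exactly why they must move into an owned namespace or the example:* namespace.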

Explicit Audit Exclusions

These strings were caught by broad Phase 0 grep passes but are not part of the authorization capability vocabulary:

| String | Reason for exclusion | Current sources |
| --- | --- | --- |
| ceremony_supersession | version-handshake feature flag, not an authorization capability | crates/aura-protocol/src/handlers/version_handshake.rs, crates/aura-core/src/protocol/versions.rs |
| fact_journal | version-handshake feature flag, not an authorization capability | same as above, plus docs |
| unknown_capability | negative-test placeholder for version capability queries | crates/aura-protocol/src/handlers/version_handshake.rs, crates/aura-core/src/protocol/versions.rs |
| capability_name | documentation placeholder in MPST docs | crates/aura-mpst/src/lib.rs |
| bundle linking | diagnostic label passed to a reconfiguration capability check, not a capability name | crates/aura-agent/src/runtime/services/reconfiguration_manager.rs |
| session delegation | diagnostic label passed to a reconfiguration capability check, not a capability name | same as above |

Quarantine Notes

  • Historical scratch notes remain explicitly quarantined as non-authoritative archive material.
  • This file replaces ad hoc capability-name scratch lists for the Phase 0 refactor inventory.
  • Remaining legacy names are recorded here only as migration/deletion targets. They are not approved compatibility surfaces.

Audit Commands

Phase 0 inventory data was gathered with:

rg -n --no-heading 'CAP_[A-Z0-9_]+: &str = "[^"]+"' crates -g'*.rs'
rg -n --no-heading 'CapabilityId::from\("[^"]+"|has_capability\("[^"]+"' crates -g'*.rs'
rg -n --no-heading 'guard_capability = "[^"]+"|#\[guard_capability\("[^"]+"\)\]' crates docs examples -g'*.rs' -g'*.md' -g'*.tell'
rg -n --no-heading 'capability\("[^"]+"\)' crates docs examples -g'*.rs' -g'*.md'

Environment Configuration Guide

This guide is the working registry for Aura environment variables.

Use three buckets:

  • product runtime: user-facing runtime configuration that affects production shells or handlers
  • harness / tooling: local automation, browser harness, CI, or workflow bring-up knobs
  • test-only: fixture generation, compatibility baselines, compile-fail helpers, or artifact capture used only in tests

Do not add new env reads ad hoc. Add the variable to the appropriate bucket first, then expose it through a small typed helper near the owning runtime boundary.
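The "small typed helper" pattern can be sketched with AURA_CLIPBOARD_MODE from the product-runtime bucket. The enum and function names below are illustrative, not the actual aura-terminal::env API, and treating an unset variable as the system default is an assumption of this sketch:

```rust
use std::env;

/// Clipboard behavior for terminal code display (see the product-runtime table).
#[derive(Debug, PartialEq)]
enum ClipboardMode {
    System,
    FileOnly,
    Disabled,
}

/// Pure parser, kept separate from the env read so it is unit-testable.
/// Treating "unset" as System is an assumption for this sketch.
fn parse_clipboard_mode(raw: Option<&str>) -> Result<ClipboardMode, String> {
    match raw {
        None => Ok(ClipboardMode::System),
        Some("system") => Ok(ClipboardMode::System),
        Some("file_only") => Ok(ClipboardMode::FileOnly),
        Some("disabled") => Ok(ClipboardMode::Disabled),
        Some(other) => Err(format!("invalid AURA_CLIPBOARD_MODE: {other}")),
    }
}

/// The single env read, placed at the owning runtime boundary. Handlers
/// consume the typed value; they never call env::var themselves.
fn clipboard_mode_from_env() -> Result<ClipboardMode, String> {
    parse_clipboard_mode(env::var("AURA_CLIPBOARD_MODE").ok().as_deref())
}
```

Splitting the parser from the env read keeps the raw-string handling in one place and makes invalid values fail loudly at startup rather than deep inside a handler.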

Product Runtime

| Variable | Owner | Purpose |
| --- | --- | --- |
| AURA_PATH | aura-agent::core::config | Base root used to resolve the default Aura storage directory |
| AURA_BOOTSTRAP_BROKER_BIND | aura-terminal::env | Override the bootstrap-broker bind address for mixed native/browser startup |
| AURA_BOOTSTRAP_BROKER_URL | aura-terminal::env | Override the externally reachable bootstrap-broker base URL |
| AURA_BOOTSTRAP_BROKER_ALLOW_LAN_BIND | aura-terminal::env | Explicitly allow a bootstrap-broker bind address that is visible off loopback |
| AURA_BOOTSTRAP_BROKER_AUTH_TOKEN | aura-terminal::env | Bearer token required for bootstrap-broker HTTP endpoints |
| AURA_BOOTSTRAP_BROKER_INVITATION_TOKEN | aura-terminal::env | Unguessable one-time token sent as the bootstrap-broker invitation retrieval header |
| AURA_CLIPBOARD_MODE | aura-terminal::env | Select clipboard behavior for terminal code display (system, file_only, disabled) |
| AURA_CLIPBOARD_FILE | aura-terminal::env | Capture clipboard writes to a file for constrained or automated environments |
| AURA_DEMO_DEVICE_ID | aura-terminal::env | Enable demo-only neighborhood assist behavior when a demo device id is staged |
| AURA_TCP_LISTEN_ADDR | aura-effects::transport::env | Override the stateless TCP receive bind address |
| AURA_TUI_ALLOW_STDIO | aura-terminal::env | Disable fullscreen stdio redirection for debugging |
| AURA_TUI_LOG_PATH | aura-terminal::env | Override the storage key/path used for persisted TUI logs |

Harness / Tooling

| Variable family | Owner | Purpose |
| --- | --- | --- |
| AURA_HARNESS_* | aura-harness, aura-app::workflows::runtime, aura-agent::runtime_bridge, aura-ui::app::shell::modal_submit | Harness orchestration, convergence, browser bootstrap, and render-stability controls |
| AURA_WEB_APP_URL | aura-harness | Fallback browser app URL for harness/browser bring-up |
| AURA_ALLOW_FLOW_COVERAGE_SKIP, AURA_FLOW_COVERAGE_* | aura-harness::governance | Governance/coverage policy knobs |
| GITHUB_*, CI | harness/tooling only | CI-aware defaults for harness governance and parity rotation |

Harness env reads may stay close to the owning harness boundary, but they should still be routed through small typed helpers when reused in multiple places.

Test-only

Examples:

  • AURA_PROTOCOL_COMPAT_*
  • AURA_CONSENSUS_ITF_*
  • AURA_CONFORMANCE_*
  • AURA_PROPERTY_MONITOR_*
  • AURA_TELLTALE_*
  • compile-fail helpers that only inspect Cargo-provided environment variables

These should not be mixed into product-runtime or harness-runtime registries. Keep them local to the owning test module unless they become shared fixture infrastructure.

User Flow Coverage Report

This document tracks end-to-end user coverage for Aura's runtime harness scenarios across TUI and web surfaces.

Coverage Boundary Statement

User flow coverage validates user-visible behavior and interaction wiring through runtime harness scenarios. It does not replace protocol conformance, theorem proofs, or differential parity lanes. Use this report for UI/product flow traceability and regression targeting.

The harness coverage model has two explicit lanes:

  • shared semantic lane:
    • parity-critical shared flows execute through the shared semantic command plane
    • this is the primary lane for debugging production workflows
  • frontend-conformance lane:
    • renderer-specific control wiring, DOM/PTy mechanics, and shell integration are validated separately
    • these scenarios are intentionally not the primary shared-flow substrate

Summary Metrics

| Metric | Count |
| --- | --- |
| Harness User Flow Scenarios | 16 |
| Shared Semantic Scenarios | 11 |
| Mixed-Runtime Scenarios (TUI + Web distinct keys) | 3 |
| Frontend-Conformance Scenarios | 5 |
| Core User Flow Domains | 13 |

Coverage Classes

Aura tracks three different coverage classes in this document:

| Class | Meaning | Main Artifact |
| --- | --- | --- |
| Parity-critical shared flow | One semantic flow that must remain portable across TUI and web and execute through the shared semantic command plane | aura-app::ui_contract + canonical harness scenarios |
| Mixed-runtime interoperability | User-visible flow that intentionally spans different frontend/runtime combinations | Canonical mixed-runtime scenarios |
| Frontend-specific or auxiliary coverage | Focused smoke, modal, renderer-specific, or conformance-only coverage that is useful but not the parity contract | Supplementary scenarios |

This report is a traceability document for those classes. It is not a proof of protocol correctness, and it does not replace conformance or verification lanes.

Canonical UX Scenario Set

| Scenario | File | Primary Flow |
| --- | --- | --- |
| Startup Smoke | scenarios/harness/real-runtime-mixed-startup-smoke.toml | Shared runtime startup and onboarding readiness |
| TUI Global Navigation/Help Hotkeys | scenarios/harness/tui-conformance-global-navigation-help-hotkeys.toml | TUI frontend-conformance: global navigation, key mappings, and help modal behavior |
| TUI Neighborhood Keypaths/Detail | scenarios/harness/tui-conformance-neighborhood-keypaths-and-detail.toml | TUI frontend-conformance: neighborhood keypaths, rendered map/detail text, and toast wiring |
| Scenario 12 | scenarios/harness/scenario12-mixed-device-enrollment-removal-e2e.toml | Mixed TUI/Web device enrollment and removal |
| Scenario 13 | scenarios/harness/scenario13-mixed-contact-channel-message-e2e.toml | Shared contact invite and channel messaging |
| Contact Invite Notification Roundtrip | scenarios/harness/mixed-contact-invite-notification-roundtrip.toml | Mixed TUI/Web symmetric contact-invite acceptance notifications |
| Shared Settings | scenarios/harness/shared-settings-parity.toml | Shared semantic settings parity |
| Shared Notifications/Authority | scenarios/harness/shared-notifications-and-authority.toml | Shared semantic notifications navigation and authority-switch handling |
| AMP Normal Transition | scenarios/harness/amp-transition-normal-shared.toml | Shared AMP observed-to-A2-live-to-A3-finalized transition observation |
| AMP Delayed Witness Transition | scenarios/harness/amp-transition-delayed-witness-shared.toml | Shared AMP delayed/offline witness pending and convergence observation |
| AMP Conflict/Subtractive Transition | scenarios/harness/amp-transition-conflict-subtractive-shared.toml | Shared AMP conflicting A2 certificate and subtractive membership observation |
| AMP Emergency Transition | scenarios/harness/amp-transition-emergency-shared.toml | Shared AMP emergency quarantine, cryptoshred, and governance non-removal observation |
| AMP Negative Transition | scenarios/harness/amp-transition-negative-shared.toml | Shared AMP rejected emergency, cooldown duplicate evidence, and recovery replay observation |
| Browser Observation | scenarios/harness/semantic-observation-browser-smoke.toml | Browser semantic observation contract smoke |
| TUI Observation | scenarios/harness/semantic-observation-tui-smoke.toml | TUI semantic observation contract smoke |
| Quint Observation | scenarios/harness/quint-semantic-observation-smoke.toml | Quint-origin semantic observation reference |

The two TUI-only conformance scenarios plus Scenario 13 are retained as frontend-conformance coverage. All harness scenarios in this inventory now use the semantic scenario format.

User Flow Matrix

| Flow Domain | Main Coverage | Secondary Coverage | Runtime Context |
| --- | --- | --- | --- |
| Startup and onboarding readiness | real-runtime-mixed-startup-smoke.toml | quint-semantic-observation-smoke.toml | TUI + Web |
| Navigate neighborhood | real-runtime-mixed-startup-smoke.toml | TUI Neighborhood Keypaths/Detail | TUI + Web |
| Navigate chat | Scenario 13 | semantic-observation-browser-smoke.toml, semantic-observation-tui-smoke.toml | TUI + Web |
| Navigate contacts | Scenario 13 | mixed-contact-invite-notification-roundtrip.toml, semantic-observation-browser-smoke.toml, semantic-observation-tui-smoke.toml | TUI + Web |
| Send friend request | Scenario 13 | semantic-observation-browser-smoke.toml, semantic-observation-tui-smoke.toml | TUI + Web |
| Accept inbound friend request | Scenario 13 | semantic-observation-browser-smoke.toml, semantic-observation-tui-smoke.toml | TUI + Web |
| Decline inbound friend request | Scenario 13 | semantic-observation-browser-smoke.toml, semantic-observation-tui-smoke.toml | TUI + Web |
| Remove friend / revoke outbound friendship | Scenario 13 | semantic-observation-browser-smoke.toml, semantic-observation-tui-smoke.toml | TUI + Web |
| Navigate notifications | shared-notifications-and-authority.toml | mixed-contact-invite-notification-roundtrip.toml, semantic-observation-browser-smoke.toml, semantic-observation-tui-smoke.toml | TUI + Web |
| Navigate settings | shared-settings-parity.toml | shared-notifications-and-authority.toml, semantic-observation-browser-smoke.toml, semantic-observation-tui-smoke.toml, quint-semantic-observation-smoke.toml | TUI + Web |
| Create invitation | Scenario 13 | mixed-contact-invite-notification-roundtrip.toml, semantic-observation-browser-smoke.toml, semantic-observation-tui-smoke.toml | TUI + Web |
| Accept invitation | Scenario 13 | mixed-contact-invite-notification-roundtrip.toml, semantic-observation-browser-smoke.toml, semantic-observation-tui-smoke.toml | TUI + Web |
| Create home | Scenario 13 | semantic-observation-browser-smoke.toml, semantic-observation-tui-smoke.toml | TUI + Web |
| Join channel | Scenario 13 | semantic-observation-browser-smoke.toml, semantic-observation-tui-smoke.toml | TUI + Web |
| Send chat message | Scenario 13 | semantic-observation-browser-smoke.toml, semantic-observation-tui-smoke.toml | TUI + Web |
| Add device | Scenario 12 | shared-settings-parity.toml | Mixed runtime |
| Remove device | Scenario 12 | shared-settings-parity.toml | Mixed runtime |
| Switch authority | shared-notifications-and-authority.toml | shared-settings-parity.toml | TUI + Web |
| Global navigation/help | TUI Global Navigation/Help Hotkeys | None | TUI frontend-conformance |
| Neighborhood keypath navigation | TUI Neighborhood Keypaths/Detail | real-runtime-mixed-startup-smoke.toml | TUI frontend-conformance + shared startup |
| Semantic observation contract | semantic-observation-browser-smoke.toml | semantic-observation-tui-smoke.toml, quint-semantic-observation-smoke.toml | Browser + TUI |
| AMP transition frontend observation | amp-transition-normal-shared.toml, amp-transition-delayed-witness-shared.toml, amp-transition-conflict-subtractive-shared.toml, amp-transition-emergency-shared.toml, amp-transition-negative-shared.toml | Runtime-event parity contract tests; simulator transition scenarios from Phase 9 | TUI + Web observation contract |

Current parity-critical source changes touched the following shared-flow areas and continue to map to the same canonical coverage anchors:

  • Shared flow and scenario contract metadata now live behind facade roots in aura-app::ui_contract and aura-app::scenario_contract, with dedicated module families for parity metadata, harness metadata, shared-flow support, action contracts, expectations, submission, and value types. The canonical public contract and the coverage anchors below do not change with that internal split.
  • Notifications navigation remains anchored by shared-notifications-and-authority.toml, with semantic-observation-browser-smoke.toml and semantic-observation-tui-smoke.toml as secondary observation coverage. The current notifications-screen change is limited to neutral empty-state copy and detail text, and the coverage expectation remains that notifications navigation exercises shared semantic navigation only rather than frontend-specific invitation or recovery actions.
  • The TUI semantic-observation contract keeps the same canonical anchor in semantic-observation-tui-smoke.toml, but the native harness ingress now requires explicit harness mode, the per-run AURA_HARNESS_RUN_TOKEN, and transient-root-scoped AURA_TUI_COMMAND_SOCKET, AURA_TUI_UI_STATE_SOCKET, and AURA_TUI_UI_STATE_FILE values. That change hardens the existing observation scenario; it does not introduce a new shared-flow anchor or a production-only harness surface.
  • Neighborhood navigation stays anchored by real-runtime-mixed-startup-smoke.toml
  • Chat/contact navigation, the contact-to-friend lifecycle, invitation, home creation, channel join, and message-send flows stay anchored by scenario13-mixed-contact-channel-message-e2e.toml
  • Contacts navigation, invitation creation, and invitation acceptance remain mapped to Scenario 13 plus the semantic observation smoke scenarios. The current terminal-side change only removes stale modal-local ownership assumptions and keeps those flows on the same typed dispatch and shared workflow path.
  • Mixed contact-invite acceptance notifications now also have a dedicated symmetric mixed-runtime anchor in mixed-contact-invite-notification-roundtrip.toml. That scenario exercises TUI-create/Web-accept and Web-create/TUI-accept flows and verifies the creator-visible acceptance notification on the notifications screen without replacing Scenario 13 as the canonical contacts/chat lifecycle anchor.
  • Pending channel-invitation acceptance now also requires terminal-status wrappers and *_with_instance entry points to publish a terminal failure for the same operation instance if a browser/TUI shared-flow error escapes before the owned accept path settles. That keeps Scenario 13 authoritative for contacts navigation, invitation create/accept, channel join, and shared-channel receive parity rather than leaving the lifecycle stranded at SemanticOperationPhase::WorkflowDispatched.
  • aura-app splits these same flows across more specific owner modules while preserving the coverage anchors above: workflows/context/neighborhood.rs, workflows/invitation/{create,accept,readiness}.rs, and workflows/messaging/{channel_refs,channels,send}.rs. Shared-flow source metadata continues to publish through the aura-app::ui_contract facade.
  • Settings/device and notifications/authority work keep the same canonical anchors. Device add/remove and shared settings remain bound to scenario12-mixed-device-enrollment-removal-e2e.toml plus shared-settings-parity.toml, while notifications navigation and authority switching remain bound to shared-notifications-and-authority.toml.
  • Scenario 12 now also carries the browser-specific removable-device parity rule: the shared semantic snapshot exports current-device markers without fabricating a selected device row, so mixed browser runs must resolve remove_selected_device from the authoritative removable device in settings state when no explicit ListId::Devices selection exists.
  • Scenario 13 now also carries the mixed-runtime sealed-message receive rule: the current TUI/browser shared-channel receive path may converge on sealed authoritative placeholders, so the canonical receive assertions for the mixed-runtime anchor match the [sealed: prefix instead of renderer-local plaintext recovery.
  • AMP channel transition frontend observation is covered by shared semantic AMP transition scenarios plus runtime-event parity contract tests. TUI and web consume RuntimeFact::AmpChannelTransitionUpdated through UiSnapshot.runtime_events and shared aura-app::ui_contract action/control ids; the shared scenarios drive typed AMP transition fixtures through the semantic command plane and assert RuntimeEventKind::AmpChannelTransitionUpdated without frontend-specific compatibility steps.
  • Browser bootstrap broker credential handling remains frontend-conformance coverage. Bearer and invitation retrieval tokens must come from session-scoped browser storage rather than URL query parameters, while the broker endpoint may still be staged through controlled bootstrap metadata. The native storage boundary test and browser observation smoke coverage protect the bridge/storage contract without creating a new shared semantic user flow.

Scenario 13 remains the canonical anchor for the shared contacts lifecycle because it exercises the parity-critical semantic controls for send friend request, accept friend request, decline friend request, and remove friend while preserving the runtime-projected relationship states contact, pending_outbound, pending_inbound, and friend across both TUI and web.

Frontend-Conformance Coverage

These scenarios are maintained outside the main shared semantic lane:

| Scenario File | Focus |
| --- | --- |
| tui-conformance-global-navigation-help-hotkeys.toml | TUI hotkeys, global navigation, help modal wiring |
| tui-conformance-neighborhood-keypaths-and-detail.toml | TUI neighborhood keypaths, rendered map/detail text, toast wiring |
| semantic-observation-browser-smoke.toml | Browser observation contract smoke |
| semantic-observation-tui-smoke.toml | TUI observation contract smoke |
| quint-semantic-observation-smoke.toml | Reference semantic observation smoke |

Planned Release And Update Validation Matrix

This matrix is the planned coverage target for harness-driven validation of module and OTA distribution/update behavior.

The intended rollout order is:

  1. mechanism validation matrix
  2. candidate-release rehearsal matrix
  3. live-release promotion gates

Mechanism validation proves the lifecycle machinery itself under synthetic releases and controlled failures. Candidate-release rehearsal proves that one specific release works before promotion or real cutover.

| Order | Domain | Mode | Target | Example Coverage Goal | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | OTA | Mechanism validation | Native host shell + runtime payload | Synthetic release publication, staging, bootloader handoff, health confirmation, rollback | Planned |
| 2 | OTA | Mechanism validation | Browser-extension host shell + runtime payload | Synthetic extension-target OTA staging, compatibility block, handoff, recovery | Planned |
| 3 | OTA | Mechanism validation | Mobile host shell + runtime payload | Synthetic mobile-target staging, blocked activation when the host shell is too old | Planned |
| 4 | Module | Mechanism validation | Generic module lifecycle | Synthetic module discovery, verification, staging, admission, cutover, rollback | Planned |
| 5 | Module | Mechanism validation | Cross-host artifact availability | Non-executing hosts serve artifacts to execution-compatible hosts | Planned |
| 6 | OTA | Candidate-release rehearsal | Specific Aura runtime release | Candidate runtime release staged and rehearsed before promotion/cutover | Planned |
| 7 | Module | Candidate-release rehearsal | Specific module release | Candidate module release staged and rehearsed before promotion/cutover | Planned |
| 8 | Example module | Candidate-release rehearsal | Browser-extension host | Candidate Example module release exercised end to end before promotion | Planned |
| 9 | OTA | Promotion gate | Release operation | Real cutover remains blocked until rehearsal passes | Planned |
| 10 | Module | Promotion gate | Release operation | Real publication/activation remains blocked until rehearsal passes | Planned |

This matrix should remain typed-lifecycle driven. It should not be satisfied by log scraping or ad hoc manual release notes.

Typed Release Validation Contract

Each planned release/update row above must map to a typed harness contract before it is counted as implemented.

| Coverage Entry | Typed command/control surface | Typed lifecycle evidence | Primary lane |
| --- | --- | --- | --- |
| OTA mechanism validation | PublishSyntheticOtaRelease, StageOtaCandidate, TriggerBootloaderHandoff, ConfirmCandidateHealth, RollbackOtaCandidate | OtaReleasePublished, OtaArtifactAvailable, OtaStaged, OtaCompatibilityBlocked, OtaCandidateLaunched, OtaHealthConfirmed, OtaRolledBack | Shared semantic lane |
| OTA candidate-release rehearsal | PublishCandidateOtaRelease, StageOtaCandidate, ApproveOtaCutover, ConfirmCandidateHealth, RollbackOtaCandidate | OtaCandidatePublished, OtaStaged, OtaPromotionStateChanged, OtaCandidateLaunched, OtaHealthConfirmed, OtaRehearsalPassed | Shared semantic lane |
| Module mechanism validation | PublishSyntheticModuleRelease, StageModuleCandidate, PrepareModuleAdmission, CommitModuleCutover, RollbackModuleCutover | ModuleReleasePublished, ModuleArtifactAvailable, ModuleVerified, ModuleStaged, ModuleAdmissionPrepared, ModuleCutoverCommitted, ModuleRolledBack | Shared semantic lane |
| Module candidate-release rehearsal | PublishCandidateModuleRelease, StageModuleCandidate, ApproveModuleCutover, CommitModuleCutover, RollbackModuleCutover | ModuleCandidatePublished, ModuleStaged, ModulePromotionStateChanged, ModuleCutoverCommitted, ModuleHealthConfirmed, ModuleRehearsalPassed | Shared semantic lane |

These rows are intentionally semantic-lane requirements. Frontend-conformance coverage may validate renderer wiring for release screens or controls, but it does not satisfy release/update lifecycle coverage on its own.
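The promotion-gate rows can be expressed as a typed predicate over lifecycle evidence rather than log scraping. A sketch using the OTA evidence names from the table (the enum and the `cutover_allowed` gate function are illustrative; the real evidence types live in the harness contract):

```rust
/// Subset of the typed OTA lifecycle evidence named in the contract table.
/// Illustrative only; the real types are part of the harness contract.
#[derive(Debug, PartialEq)]
enum OtaEvidence {
    OtaCandidatePublished,
    OtaStaged,
    OtaCandidateLaunched,
    OtaHealthConfirmed,
    OtaRehearsalPassed,
}

/// Hypothetical promotion gate: real cutover stays blocked until the
/// rehearsal lane has produced both health confirmation and a rehearsal pass.
fn cutover_allowed(evidence: &[OtaEvidence]) -> bool {
    evidence.contains(&OtaEvidence::OtaHealthConfirmed)
        && evidence.contains(&OtaEvidence::OtaRehearsalPassed)
}
```

Because the gate consumes typed evidence values, a release that merely logged success cannot satisfy it, which is the property the matrix is trying to guarantee.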

Coverage Expectations

Shared Flow Contract Expectations

Every parity-critical shared flow should have, in code and metadata:

  • a canonical shared flow identifier in aura-app::ui_contract
  • a typed semantic command path on both TUI and web
  • semantic action contracts with preconditions and terminal success/failure conditions
  • an authoritative readiness, event, or quiescence owner for waits
  • any parity exception recorded as typed metadata in aura-app::ui_contract with a reason code, scope, affected surface, and authoritative doc reference
  • at least one canonical scenario reference in this report

Shared-flow scenarios must not rely on raw PTY keys, raw selector clicks, raw label matching, or incidental focus-stepping as their primary mechanics. Those behaviors belong in frontend-conformance coverage instead.

Frontend-specific flows may still have scenario coverage, but they are not part of the portability contract unless explicitly promoted into the shared contract.
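The typed parity-exception metadata called for above can be pictured as a plain record whose fields mirror the required attributes. The struct below is an illustrative sketch, not the actual aura-app::ui_contract type, and the field names and example values are hypothetical:

```rust
/// Illustrative shape for a recorded parity exception; the real type lives
/// in aura-app::ui_contract and may differ in naming and structure.
#[derive(Debug)]
struct ParityException {
    /// Machine-readable reason code, e.g. "renderer_limitation" (hypothetical).
    reason_code: String,
    /// Which shared flow the exception is scoped to.
    flow_id: String,
    /// Affected surfaces, e.g. "tui", "web", or both.
    affected_surfaces: Vec<String>,
    /// Authoritative doc reference explaining why the exception exists.
    doc_reference: String,
}

impl ParityException {
    /// An exception is only acceptable when every field is populated;
    /// empty metadata should fail review rather than silently pass.
    fn is_complete(&self) -> bool {
        !self.reason_code.is_empty()
            && !self.flow_id.is_empty()
            && !self.affected_surfaces.is_empty()
            && !self.doc_reference.is_empty()
    }
}
```

Keeping exceptions as structured data rather than prose means CI can enumerate them and reject any shared flow whose parity gap lacks a reason code, scope, and doc reference.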

PR Gate Expectations

  1. Changes to global navigation, settings, chat, contacts, neighborhood, or ceremonies should have at least one impacted canonical scenario updated or re-validated.
  2. Changes that affect both TUI and web behavior should be validated against parity-critical scenarios in the shared semantic lane on both runtimes.
  3. Changes to mixed-instance behavior should include scenario 12 and/or 13 coverage.
  4. Contacts-surface changes that alter relationship state or action availability must preserve the shared semantic lifecycle for contact, pending_outbound, pending_inbound, and friend, and they must stay anchored to Scenario 13 rather than shell-specific smoke coverage.
  5. Mixed-runtime code exchange and chat routing changes should preserve the event-driven contract used by scenarios 12 and 13: invitation/device codes come from typed runtime-event payloads, and chat assertions bind to the selected shared channel rather than frontend-specific ordering.
  6. Browser shared-flow bridge changes should preserve the explicit runtime identity staging handoff and the page-owned semantic command queue used by the Playwright lane. Coverage remains anchored in the shared semantic scenarios rather than DOM-driving fallback mechanics.
  7. Changes to renderer-specific control wiring should add or update frontend-conformance coverage rather than weakening the shared semantic lane.
  8. Browser bootstrap broker credential changes must preserve the rule that bearer and invitation retrieval tokens are not URL parameters, and should update frontend-conformance tests or documentation when the storage boundary changes.
  9. Changes to OTA or module release/update architecture should update the planned release/update matrix above when they add, remove, or reorder mechanism-validation or release-rehearsal coverage.

CI Enforcement

Fast CI currently uses three separate gates:

  • just ci-user-flow-coverage enforces traceability heuristics between changed user-flow-facing source files, canonical scenarios, and this report.
    • AURA_ALLOW_FLOW_COVERAGE_SKIP=1 is a local-only escape hatch; CI rejects it.
  • just ci-user-flow-policy enforces documentation and contributor-guidance updates for shared user-flow contract and determinism surfaces via Aura's toolkit/xtask user-flow guidance sync check.
    • OTA and module release/update validation rows in this report are part of that same user-flow guidance surface and must be kept in sync as the release matrix evolves.
    • The release/update rows are expected to land in staged order: mechanism validation first, candidate rehearsal second, and promotion-gate coverage last.
  • just ci-harness-matrix-inventory enforces that scenario classification drives the TUI/web matrix lanes.
    • Shared semantic scenarios and frontend-conformance scenarios must remain distinct classifications; CI policy should reject shared-flow drift back to renderer-driven mechanics.

Current limitations:

  • ci-user-flow-coverage still infers some ownership from filenames and does not yet prove that the correct scenario set changed.
  • Docs updates and coverage traceability are distinct concerns; this report should not claim stronger behavioral enforcement than CI actually provides.

Residual Risk Areas

| Area | Current Risk | Mitigation Direction |
|---|---|---|
| Long-tail modal sequencing | Medium | Add focused scenario fragments for rare wizard branch paths |
| Toast timing/race windows | Medium | Prefer persistent-state assertions over toast-only checks |
| Cross-topology regressions | Medium | Keep mixed-topology smoke scenarios in scheduled lanes |

References

Verification Coverage Report

This document provides an overview of the formal verification, model checking, and conformance testing infrastructure in Aura.

Verification Boundary Statement

Aura keeps consensus and CRDT domain proof ownership in Quint models and Lean theorems. Telltale parity lanes validate runtime conformance behavior from replay artifacts. Telltale parity success does not count as new domain theorem coverage. See Formal Verification Reference for the assurance classification and limits.

Summary Metrics

| Metric | Count |
|---|---|
| Quint Specifications | 42 |
| Quint Invariants | 191 |
| Quint Temporal Properties | 11 |
| Quint Type Definitions | 362 |
| Lean Source Files | 38 |
| Lean Theorems | 118 |
| Conformance Fixtures | 4 |
| ITF Trace Harnesses | 9 |
| Testkit Tests | 118 |
| Bridge Modules | 4 |
| CI Verification Gates | 11 |
| Telltale Parity Modules | 1 |
| Bridge Pipeline Fixtures | 3 |

Verification Layers

Layer 1: Quint Specifications

Formal protocol specifications in verification/quint/ organized by subsystem.

| Subsystem | Files | Contents |
|---|---|---|
| Root | 11 | core, authorization, recovery, invitation, interaction, leakage, sbb, time_system, transport, epochs, cli_recovery_demo |
| consensus/ | 4 | core, liveness, adversary, frost |
| journal/ | 3 | core, counter, anti_entropy |
| keys/ | 3 | dkg, dkd, resharing |
| sessions/ | 4 | core, choreography, groups, locking |
| amp/ | 1 | channel |
| liveness/ | 3 | connectivity, timing, properties |
| harness/ | 9 | amp_channel, counter, dkg, flows, groups, locking, recovery, resharing, semantic_observation_smoke |
| tui/ | 4 | demo_recovery, flows, signals, state |

Harness modules generate ITF traces on demand via just quint-generate-traces. Traces are not checked into the repository. CI runs just ci-conformance-itf to generate traces and replay them through Rust handlers.

Key Specifications

| Specification | Purpose | Key Properties |
|---|---|---|
| consensus/core.qnt | Fast-path consensus protocol | UniqueCommitPerInstance, CommitRequiresThreshold, EquivocatorsExcluded |
| consensus/liveness.qnt | Progress guarantees | ProgressUnderSynchrony, RetryBound, CommitRequiresHonestParticipation |
| consensus/adversary.qnt | Byzantine tolerance | ByzantineThreshold, EquivocationDetected, HonestMajorityCanCommit |
| consensus/frost.qnt | Threshold signatures | Share aggregation, commitment validation |
| journal/core.qnt | CRDT journal semantics | NonceUnique, FactsOrdered, NonceMergeCommutative, LamportMonotonic |
| journal/anti_entropy.qnt | Sync protocol | FactsMonotonic, EventualConvergence, VectorClockConsistent |
| authorization.qnt | Guard chain security | NoCapabilityWidening, ChargeBeforeSend |
| time_system.qnt | Timestamp ordering | TimeStamp domain semantics and comparison |

Layer 2: Rust Integration

Files implementing Quint-Rust correspondence and model-based testing.

Core Integration (aura-core)

  • effects/quint.rs - QuintMappable trait for bidirectional type mapping
  • effects/mod.rs - Effect trait definitions with Quint correspondence

Quint Crate (aura-quint)

  • runner.rs - QuintRunner with property caching and verification statistics
  • properties.rs - PropertySpec, PropertySuite, and property categorization
  • evaluator.rs - QuintEvaluator subprocess wrapper for Quint CLI
  • handler.rs - Effect handler integration

Telltale Bridge (aura-quint)

Cross-validation modules for Lean↔Quint correspondence:

| Module | Purpose |
|---|---|
| bridge_export.rs | Export Quint state to Lean-readable format |
| bridge_import.rs | Import Lean outputs back to Quint structures |
| bridge_format.rs | Shared serialization format definitions |
| bridge_validate.rs | Cross-validation assertions and checks |

Simulator Integration (aura-simulator/src/quint/)

17 modules implementing generative simulation:

| Module | Purpose |
|---|---|
| action_registry.rs | Maps Quint action names to Rust handlers |
| state_mapper.rs | Bidirectional state conversion (Rust <-> Quint JSON) |
| generative_simulator.rs | Orchestrates ITF trace replay with property checking |
| itf_loader.rs | Parses ITF traces from Quint model checking |
| itf_fuzzer.rs | Model-based fuzzing with coverage analysis |
| trace_converter.rs | Converts between trace formats |
| simulation_evaluator.rs | Evaluates properties during simulation |
| properties.rs | Property extraction and classification |
| domain_handlers.rs | Domain-specific action handlers |
| amp_channel_handlers.rs | AMP reliable channel handlers |
| byzantine_mapper.rs | Byzantine fault strategy mapping |
| chaos_generator.rs | Chaos/fault scenario generation |
| aura_state_extractors.rs | Aura-specific state extraction |
| cli_runner.rs | CLI integration for Quint verification |
| ast_parser.rs | Quint AST parsing for analysis |
| mod.rs | Module exports and re-exports |
| types.rs | Shared type definitions |

Differential Verification (aura-simulator)

  • differential_tester.rs - Cross-implementation parity testing between Quint models and Rust handlers
  • telltale_parity.rs - Telltale-backed parity boundary, canonical surface mapping, and report artifact generation

Consensus Verification (aura-consensus)

  • core/verification/mod.rs - Verification module facade
  • core/verification/quint_mapping.rs - Consensus type mappings

Layer 3: Lean Proofs

Lean 4 mathematical proofs in verification/lean/ providing formal guarantees.

Type Modules (10 files)

| Module | Content |
|---|---|
| Types/ByteArray32.lean | 32-byte hash representation (6 theorems) |
| Types/OrderTime.lean | Opaque ordering tokens (4 theorems) |
| Types/TimeStamp.lean | 4-variant time enum |
| Types/FactContent.lean | Structured fact types |
| Types/ProtocolFacts.lean | Protocol-specific fact types |
| Types/AttestedOp.lean | Attested operation types |
| Types/TreeOp.lean | Tree operation types |
| Types/Namespace.lean | Authority/Context namespaces |
| Types/Identifiers.lean | Identifier types |
| Types.lean | Type module aggregation |

Domain Modules (9 files)

| Module | Purpose |
|---|---|
| Domain/Consensus/Types.lean | Consensus message types (8 definitions) |
| Domain/Consensus/Frost.lean | FROST signature types |
| Domain/Journal/Types.lean | Fact and Journal structures |
| Domain/Journal/Operations.lean | merge, reduce, factsEquiv (1 theorem) |
| Domain/FlowBudget.lean | Budget types and charging |
| Domain/GuardChain.lean | Guard types and evaluation |
| Domain/TimeSystem.lean | Timestamp comparison |
| Domain/KeyDerivation.lean | Key derivation types |
| Domain/ContextIsolation.lean | Context isolation model |

Proof Modules (14 files, 118 theorems)

Consensus Proofs

| Module | Theorems | Content |
|---|---|---|
| Proofs/Consensus/Agreement.lean | 3 | No two honest parties commit different values |
| Proofs/Consensus/Validity.lean | 7 | Only valid proposals can be committed |
| Proofs/Consensus/Equivocation.lean | 5 | Detection soundness and completeness |
| Proofs/Consensus/Evidence.lean | 8 | Evidence CRDT semilattice properties |
| Proofs/Consensus/Frost.lean | 12 | FROST share aggregation safety |
| Proofs/Consensus/Liveness.lean | 3 | Progress under timing assumptions (axioms) |
| Proofs/Consensus/Adversary.lean | 7 | Byzantine model bounds |
| Proofs/Consensus/Summary.lean | - | Master consensus claims bundle |

Infrastructure Proofs

| Module | Theorems | Content |
|---|---|---|
| Proofs/Journal.lean | 14 | CRDT semilattice (commutativity, associativity, idempotence) |
| Proofs/FlowBudget.lean | 5 | Charging correctness |
| Proofs/GuardChain.lean | 7 | Guard evaluation determinism |
| Proofs/TimeSystem.lean | 8 | Timestamp ordering properties |
| Proofs/KeyDerivation.lean | 3 | PRF isolation proofs |
| Proofs/ContextIsolation.lean | 16 | Context separation and bridge authorization |

Entry Points (4 files)

  • Aura.lean - Top-level documentation
  • Aura/Proofs.lean - Main reviewer entry with all Claims bundles
  • Aura/Assumptions.lean - Cryptographic axioms (FROST unforgeability, hash collision resistance, PRF security)
  • Aura/Runner.lean - CLI for differential testing

Layer 4: Conformance Testing

Deterministic parity validation infrastructure in aura-testkit.

Conformance Fixtures

| Fixture | Purpose |
|---|---|
| consensus.json | Consensus protocol conformance |
| sync.json | Synchronization protocol conformance |
| recovery.json | Guardian recovery conformance |
| invitation.json | Invitation protocol conformance |

Conformance Modules

| Module | Purpose |
|---|---|
| conformance.rs | Artifact loading, replay, and verification |
| conformance_diff.rs | Law-aware comparison with envelope classifications |

Effect Envelope Classifications

| Class | Effect Kinds | Comparison Rule |
|---|---|---|
| strict | handle_recv, handle_choose, handle_acquire, handle_release | Byte-exact match required |
| commutative | send_decision, invoke_step | Order-insensitive under normalization |
| algebraic | topology_event | Reduced via domain-normal form |
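The three comparison rules can be sketched as follows. This is an illustrative sketch only: the enum, the helper names, and in particular the dedup-based stand-in for the "domain-normal form" reduction are hypothetical, not the aura-testkit API.

```rust
// Hypothetical sketch of the envelope comparison classes; not the real
// conformance_diff.rs implementation.
#[derive(Clone, Copy)]
enum EnvelopeClass {
    Strict,      // byte-exact match required
    Commutative, // order-insensitive under normalization
    Algebraic,   // compared after reduction to a domain-normal form
}

/// Canonical ordering makes comparison order-insensitive.
fn normalize(mut effects: Vec<Vec<u8>>) -> Vec<Vec<u8>> {
    effects.sort();
    effects
}

fn envelopes_match(class: EnvelopeClass, a: Vec<Vec<u8>>, b: Vec<Vec<u8>>) -> bool {
    match class {
        EnvelopeClass::Strict => a == b,
        EnvelopeClass::Commutative => normalize(a) == normalize(b),
        // Stand-in for a real domain-normal form: collapse duplicates.
        EnvelopeClass::Algebraic => {
            let (mut a, mut b) = (normalize(a), normalize(b));
            a.dedup();
            b.dedup();
            a == b
        }
    }
}

fn main() {
    let send_a = vec![b"decision:1".to_vec(), b"decision:2".to_vec()];
    let send_b = vec![b"decision:2".to_vec(), b"decision:1".to_vec()];
    // Reordered send_decision effects compare equal under the commutative rule...
    assert!(envelopes_match(EnvelopeClass::Commutative, send_a.clone(), send_b.clone()));
    // ...but fail the strict byte-exact rule.
    assert!(!envelopes_match(EnvelopeClass::Strict, send_a, send_b));
}
```

The point of the classification is that replay lanes only demand byte-exactness where the protocol itself demands it; effects whose ordering is not semantically meaningful are compared modulo a normal form.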

Verified Invariants

Consensus Invariants

| Invariant | Location |
|---|---|
| InvariantUniqueCommitPerInstance | consensus/core.qnt |
| InvariantCommitRequiresThreshold | consensus/core.qnt |
| InvariantCommittedHasCommitFact | consensus/core.qnt |
| InvariantEquivocatorsExcluded | consensus/core.qnt |
| InvariantProposalsFromWitnesses | consensus/core.qnt |
| InvariantProgressUnderSynchrony | consensus/liveness.qnt |
| InvariantRetryBound | consensus/liveness.qnt |
| InvariantCommitRequiresHonestParticipation | consensus/liveness.qnt |
| InvariantQuorumPossible | consensus/liveness.qnt |
| InvariantByzantineThreshold | consensus/adversary.qnt |
| InvariantEquivocationDetected | consensus/adversary.qnt |
| InvariantCompromisedNoncesExcluded | consensus/adversary.qnt |
| InvariantHonestMajorityCanCommit | consensus/adversary.qnt |

Journal Invariants

| Invariant | Location |
|---|---|
| InvariantNonceUnique | journal/core.qnt |
| InvariantFactsOrdered | journal/core.qnt |
| InvariantFactsMatchNamespace | journal/core.qnt |
| InvariantLifecycleCompletedImpliesStable | journal/core.qnt |
| InvariantNonceMergeCommutative | journal/core.qnt |
| InvariantLamportMonotonic | journal/core.qnt |
| InvariantReduceDeterministic | journal/core.qnt |
| InvariantPhaseRegistered | journal/counter.qnt |
| InvariantCountersRegistered | journal/counter.qnt |
| InvariantLifecycleStatusDefined | journal/counter.qnt |
| InvariantOutcomeWhenCompleted | journal/counter.qnt |
| InvariantFactsMonotonic | journal/anti_entropy.qnt |
| InvariantFactsSubsetOfGlobal | journal/anti_entropy.qnt |
| InvariantVectorClockConsistent | journal/anti_entropy.qnt |
| InvariantEventualConvergence | journal/anti_entropy.qnt |
| InvariantDeltasFromSource | journal/anti_entropy.qnt |
| InvariantCompletedSessionsConverged | journal/anti_entropy.qnt |

Temporal Properties

| Property | Location |
|---|---|
| livenessEventualCommit | consensus/core.qnt |
| safetyImmutableCommit | consensus/core.qnt |
| authorizationSoundness | authorization.qnt |
| budgetMonotonicity | authorization.qnt |
| flowBudgetFairness | authorization.qnt |
| canAlwaysExit | tui/state.qnt |
| modalEventuallyCloses | tui/state.qnt |
| insertModeEventuallyExits | tui/state.qnt |
| InvariantLeakageBounded | leakage.qnt |
| InvariantObserverHierarchyMaintained | leakage.qnt |
| InvariantBudgetsPositive | leakage.qnt |

Contract Coverage Mapping

This section maps the contract clauses in Privacy and Information Flow Contract and Distributed Systems Contract to the current verification and assurance evidence.

Coverage status uses three classes:

  • Verified: directly covered by Quint invariants, Lean proofs, or both
  • Conformance-backed: covered by replay, parity, or deterministic conformance lanes rather than domain theorem proofs
  • Specified only: documented as a contract requirement, but not yet directly mapped to a proof or conformance artifact in this report

Privacy and Information Flow Contract Coverage

| Contract Area | Status | Evidence |
|---|---|---|
| Context-specific identity separation | Verified | Aura.Proofs.KeyDerivation, Aura.Proofs.ContextIsolation |
| Budgeted send invariant | Verified | authorization.qnt, transport.qnt, Aura.Proofs.FlowBudget, Aura.Proofs.GuardChain |
| Epoch-scoped receipt validity | Verified | epochs.qnt |
| Observer-budgeted metadata leakage | Verified | leakage.qnt |
| Cross-context isolation | Verified | Aura.Proofs.ContextIsolation, transport.qnt |
| Physical vs logical time privacy boundary | Verified | time_system.qnt, Aura.Proofs.TimeSystem |
| Error-channel privacy boundary | Specified only | No direct proof or conformance mapping recorded here |
| Retrieval not identity-addressed | Conformance-backed | crates/aura-simulator/tests/adaptive_privacy_phase_six.rs, selector-based Hold retrieval scenarios |
| Custody remains opaque and non-authoritative | Conformance-backed | crates/aura-simulator/tests/adaptive_privacy_phase_six.rs, Hold custody and cache-replica validation |
| Accountability evidence verified before local consequences | Conformance-backed | crates/aura-simulator/tests/adaptive_privacy_phase_six.rs, reply-block accountability control-plane lanes |
| External observer protection level varies by deployment mode | Specified only | No direct proof or conformance mapping recorded here |

Distributed Systems Contract Coverage

| Contract Area | Status | Evidence |
|---|---|---|
| Journal CRDT laws | Verified | journal/core.qnt, Aura.Proofs.Journal |
| Consensus agreement and validity | Verified | consensus/core.qnt, Aura.Proofs.Consensus.Agreement, Aura.Proofs.Consensus.Validity |
| Fault-bound consensus safety assumptions | Verified | consensus/adversary.qnt, Aura.Proofs.Consensus.Adversary |
| Evidence CRDT laws | Verified | Aura.Proofs.Consensus.Evidence |
| Equivocation detection | Verified | consensus/adversary.qnt, Aura.Proofs.Consensus.Equivocation |
| FROST threshold safety | Verified | consensus/frost.qnt, Aura.Proofs.Consensus.Frost |
| Context isolation | Verified | transport.qnt, Aura.Proofs.ContextIsolation |
| Anti-entropy convergence | Verified | journal/anti_entropy.qnt |
| Fast-path and fallback liveness under assumptions | Verified | consensus/liveness.qnt, Aura.Proofs.Consensus.Liveness |
| Invitation lifecycle safety | Verified | invitation.qnt |
| Cross-protocol deadlock freedom | Verified | interaction.qnt |
| Operation-scoped and journal consistency model | Verified | journal/core.qnt, journal/anti_entropy.qnt, consensus/core.qnt |
| Runtime conformance against formal artifacts | Conformance-backed | ITF trace replay, Telltale parity, conformance fixtures |
| Onion accountability witness return and verification | Conformance-backed | Simulator Telltale parity and reply_block_accountability control-plane scenarios |
| Hold availability and custody-failure boundaries | Conformance-backed | crates/aura-simulator/tests/adaptive_privacy_phase_six.rs, weak-connectivity and sparse-sync Hold scenarios |
| Failure-class boundaries and local-only failure | Specified only | No direct proof or conformance mapping recorded here |
| Error-channel privacy requirements | Specified only | No direct proof or conformance mapping recorded here |

CI Verification Gates

Automated verification lanes wired into CI pipelines.

Core Verification

| Gate | Command | Purpose |
|---|---|---|
| Property Monitor | just ci-property-monitor | Runtime property assertion monitoring |
| Simulator Telltale Parity | just ci-simulator-telltale-parity | Artifact-driven telltale vs Aura simulator differential comparison |
| Choreography Parity | just ci-choreo-parity | Session type projection consistency |
| Quint Typecheck | just ci-quint-typecheck | Quint specification type safety |

Conformance Gates

| Gate | Command | Purpose |
|---|---|---|
| Conformance Policy | just ci-conformance-policy | Policy rule validation |
| Conformance Contracts | just ci-conformance-contracts | Contract satisfaction checks |

Formal Methods

| Gate | Command | Purpose |
|---|---|---|
| Lean Build | just ci-lean-build | Compile Lean proofs |
| Lean Completeness | just ci-lean-check-sorry | Check for incomplete proofs (sorry) |
| Telltale Bridge | just ci-telltale-bridge | Cross-validation between Lean and Quint |
| Kani BMC | just ci-kani | Bounded model checking for unsafe code |

CI Artifacts

Conformance artifacts upload to CI for failure triage:

artifacts/conformance/
├── native_coop/
│   └── scenario_seed_artifact.json
├── wasm_coop/
│   └── scenario_seed_artifact.json
└── diff_report.json

The diff report highlights specific mismatches for investigation.

Telltale parity and bridge lanes emit additional artifacts:

artifacts/telltale-parity/
└── report.json

artifacts/telltale-bridge/
├── bridge.log
├── bridge_discrepancy_report.json
└── report.json

artifacts/telltale-parity/report.json uses schema aura.telltale-parity.report.v1. artifacts/telltale-bridge/bridge_discrepancy_report.json uses schema aura.telltale-bridge.discrepancy.v1.

Bridge Pipeline Fixtures

aura-quint bridge pipeline checks use deterministic fixture inputs:

| Fixture | Purpose |
|---|---|
| positive_bundle.json | Expected consistent cross-validation outcome |
| negative_bundle.json | Expected discrepancy detection outcome |
| quint_ir_fixture.json | Export/import pipeline coverage for Quint IR |

Fixtures live in crates/aura-quint/tests/fixtures/bridge/.

Project Structure

This document provides the authoritative reference for Aura's crate organization, dependencies, and development policies.

The primary specifications live in docs/ (e.g., consensus in docs/108_consensus.md, ceremony lifecycles in docs/109_operation_categories.md). Non-authoritative scratch notes may be removed at any time.

The repo-wide ownership taxonomy is defined in Ownership Model. This document specifies how those ownership rules map onto Aura's crate and layer structure.

8-Layer Architecture

Aura's codebase is organized into 8 clean architectural layers. Each layer builds on the layers below without circular dependencies.

┌────────────────────────────────────────────────────┐
│ Layer 8: Testing & Development Tools               │
│   • aura-testkit    • aura-quint    • aura-harness │
├────────────────────────────────────────────────────┤
│ Layer 7: User Interface                            │
│   • aura-terminal    • aura-ui       • aura-web    │
├────────────────────────────────────────────────────┤
│ Layer 6: Runtime Composition                       │
│   • aura-agent    • aura-simulator    • aura-app   │
├────────────────────────────────────────────────────┤
│ Layer 5: Feature/Protocol Implementation           │
│   • aura-authentication    • aura-chat             │
│   • aura-invitation        • aura-recovery         │
│   • aura-relational        • aura-rendezvous       │
│   • aura-sync              • aura-social           │
├────────────────────────────────────────────────────┤
│ Layer 4: Orchestration                             │
│   • aura-protocol          • aura-guards           │
│   • aura-consensus         • aura-amp              │
│   • aura-anti-entropy                              │
├────────────────────────────────────────────────────┤
│ Layer 3: Implementation                            │
│   • aura-effects           • aura-composition      │
├────────────────────────────────────────────────────┤
│ Layer 2: Specification                             │
│   Domain Crates:                                   │
│   • aura-journal           • aura-authorization    │
│   • aura-signature         • aura-store            │
│   • aura-transport         • aura-maintenance      │
│   Choreography:                                    │
│   • aura-mpst              • aura-macros           │
├────────────────────────────────────────────────────┤
│ Layer 1: Foundation                                │
│   • aura-core                                      │
└────────────────────────────────────────────────────┘

Repo-Wide Ownership Categories

Aura uses four ownership categories:

  • Pure
  • MoveOwned
  • ActorOwned
  • Observed

These categories are architectural, not stylistic.

  • Pure code is deterministic and effect-free.
  • MoveOwned code represents exclusive authority through consumed values such as handles, owner tokens, and transfer records.
  • ActorOwned code owns long-lived mutable async state under one live task or supervisor.
  • Observed code reads authoritative state and may submit typed commands, but must not author semantic truth.

Two repo-wide rules apply across every layer:

  1. Parity-critical mutation and publication must be capability-gated.
  2. Parity-critical operations must end in typed terminal success, failure, or cancellation.

For Layer 6 runtime code this means:

  • aura-agent owns the production structured-concurrency model
  • long-lived runtime services use actor-owned bounded ingress
  • session/delegation/fragment transfer stays move-owned even when hosted by runtime services
  • raw task spawn is implementation detail inside the sanctioned supervisor boundary, not a public runtime programming model

If a parity-critical subsystem cannot explain who owns state, how ownership is transferred, and how failure terminates, the architecture is incomplete.
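As a rough illustration of how MoveOwned transfer and typed terminality interact, consider this sketch. All names here are hypothetical, chosen for the example; they are not the aura-core API.

```rust
/// A MoveOwned transfer token: holding the value by itself IS the
/// authority. There is no separate permission check to forget.
struct SessionTransfer {
    session: u64,
}

/// Every parity-critical operation must end in a typed terminal state:
/// success, failure, or cancellation — never an implicit hang.
#[derive(Debug, PartialEq)]
enum Terminal {
    Success,
    Failure(&'static str),
    Cancelled,
}

/// Consuming `self` by value makes double-transfer a compile error:
/// after this call, the caller no longer owns the token.
fn complete_transfer(token: SessionTransfer) -> Terminal {
    if token.session == 0 {
        Terminal::Failure("unknown session")
    } else {
        Terminal::Success
    }
}

fn main() {
    let token = SessionTransfer { session: 7 };
    assert_eq!(complete_transfer(token), Terminal::Success);
    // Calling `complete_transfer(token)` again would not compile:
    // the token was moved, so exclusive authority was spent exactly once.
}
```

This is the sense in which the categories are architectural rather than stylistic: the compiler, not convention, enforces that a transferred authority cannot be reused.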

Layer 1: Foundation (aura-core)

Purpose: Single source of truth for all domain concepts and interfaces.

Contains:

  • Effect traits for core infrastructure, authentication, storage, network, cryptography, privacy, configuration, and testing
  • Domain types: AuthorityId, ContextId, SessionId, FlowBudget, ObserverClass, Capability
  • Cryptographic utilities: key derivation, FROST types, merkle trees, Ed25519 helpers
  • Semantic traits: JoinSemilattice, MeetSemilattice, CvState, MvState
  • Error types: AuraError, error codes, and guard metadata
  • Configuration system with validation and multiple formats
  • Causal context types for CRDT ordering
  • AMP channel lifecycle effect surface: aura-core::effects::amp::AmpChannelEffects (implemented by runtime, simulator, and testkit mocks).

Key principle: Interfaces only, no implementations or business logic.
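The semantic-trait style can be illustrated with a minimal join-semilattice. The trait below is a simplified stand-in for aura-core's JoinSemilattice, and the grow-only set is just the textbook CRDT example; the real trait surface is richer.

```rust
use std::collections::BTreeSet;

// Simplified stand-in for a join-semilattice trait like aura-core's
// JoinSemilattice; illustrative only.
trait JoinSemilattice {
    fn join(&self, other: &Self) -> Self;
}

// A grow-only set (e.g. of fact nonces) joins by set union.
impl JoinSemilattice for BTreeSet<u64> {
    fn join(&self, other: &Self) -> Self {
        self.union(other).copied().collect()
    }
}

fn main() {
    let a: BTreeSet<u64> = [1, 2].into_iter().collect();
    let b: BTreeSet<u64> = [2, 3].into_iter().collect();
    let c: BTreeSet<u64> = [3, 4].into_iter().collect();
    // The semilattice laws the Lean journal proofs establish:
    assert_eq!(a.join(&b), b.join(&a));                   // commutativity
    assert_eq!(a.join(&b).join(&c), a.join(&b.join(&c))); // associativity
    assert_eq!(a.join(&a), a);                            // idempotence
}
```

These three laws are exactly why replicas can merge journals in any order, any grouping, and any number of times and still converge.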

Ownership expectation:

  • primarily Pure
  • defines the shared ownership, capability, and terminality vocabulary used by higher layers
  • does not own long-lived mutable runtime state
  • provides capability-gated boundaries rather than bypassing them
  • the canonical ownership/runtime API lives in aura-core::ownership and is explicitly split into actor_owned::*, move_owned::*, and capability_gated::*

Exceptions:

  1. Extension traits providing convenience methods are allowed (e.g., LeakageChoreographyExt, SimulationEffects, AuthorityRelationalEffects). These blanket implementations extend existing effect traits with domain-specific convenience methods while maintaining interface-only semantics.

  2. Arc blanket implementations for effect traits are required in aura-core due to Rust's orphan rules. These are not "runtime instantiations" - they are purely mechanical delegations that enable Arc<AuraEffectSystem> to satisfy trait bounds. Example:

    impl<T: CryptoEffects + ?Sized> CryptoEffects for std::sync::Arc<T> {
        async fn ed25519_sign(&self, msg: &[u8], key: &[u8]) -> Result<Vec<u8>, CryptoError> {
            (**self).ed25519_sign(msg, key).await  // Pure delegation
        }
    }

    Arc is a language-level primitive (like Vec, Box, or &), not a "runtime" in the architectural sense. These implementations add no behavior or state. They simply say "if T can do X, then Arc can too by asking T." Without these, any handler wrapped in Arc would fail to satisfy effect trait bounds, breaking the entire dependency injection pattern.

Architectural compliance: aura-core maintains strict interface-only semantics. Test utilities like MockEffects are provided in aura-testkit (Layer 8) where they architecturally belong.

Dependencies: None (foundation crate).

Commitment Tree Types and Functions

Location: aura-core/src/tree/

Contains:

  • Core tree types: TreeOp, AttestedOp, Policy, LeafNode, BranchNode, BranchSigningKey, TreeCommitment, Epoch
  • Commitment functions: commit_branch(), commit_leaf(), policy_hash(), compute_root_commitment()
  • Policy meet-semilattice implementation for threshold refinement
  • Snapshot types: Snapshot, Cut, ProposalId, Partial
  • Verification module: verify_attested_op() (cryptographic), check_attested_op() (state consistency), compute_binding_message()
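Threshold refinement under the policy meet-semilattice can be sketched as follows. The Policy type here is a hypothetical reduction of the real one to a single threshold field, purely to show the meet laws.

```rust
// Hypothetical, simplified Policy: the real aura-core Policy carries
// more than a bare threshold.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Policy {
    threshold: u8, // minimum signers required
}

impl Policy {
    /// Meet = the more restrictive of the two policies. Under this
    /// ordering, refinement can only raise the threshold, never weaken it.
    fn meet(self, other: Policy) -> Policy {
        Policy {
            threshold: self.threshold.max(other.threshold),
        }
    }
}

fn main() {
    let parent = Policy { threshold: 2 };
    let child = Policy { threshold: 3 };
    // Meet-semilattice laws: commutative and idempotent.
    assert_eq!(parent.meet(child), child.meet(parent));
    assert_eq!(parent.meet(parent), parent);
    // Refinement never drops below either operand's threshold.
    assert_eq!(parent.meet(child).threshold, 3);
}
```

The design consequence is that composing policies along a tree path can only tighten signing requirements, which is what makes threshold refinement safe to evaluate in any order.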

Why Layer 1?

Commitment tree types MUST remain in aura-core because:

  1. Effect traits require them: TreeEffects and SyncEffects in aura-core/src/effects/ use these types in their signatures
  2. FROST primitives depend on them: aura-core/src/crypto/tree_signing.rs implements threshold signing over tree operations
  3. Authority abstraction needs them: aura-core/src/authority.rs uses Policy, AttestedOp, and TreeOpKind
  4. Foundational cryptographic structures: Commitment trees are merkle trees with threshold policies - core cryptographic primitives, not domain logic

Layer 2 separation (aura-journal) contains:

  • Tree state machine: Full TreeState with branches, leaves, topology, and path validation
  • Reduction logic: Deterministic state derivation from OpLog<AttestedOp>
  • Domain validation: Business rules for tree operations (e.g., policy monotonicity, leaf lifecycle)
  • Application logic: apply_verified(), compaction, garbage collection
  • Re-exports: pub use aura_core::tree::* for convenience via aura_journal::commitment_tree

Key architectural distinction:

  • Layer 1 (aura-core): Tree types and cryptographic commitment functions (pure primitives)
  • Layer 2 (aura-journal): Tree state machine, CRDT semantics, and validation rules (domain implementation)

This separation allows effect traits in Layer 1 to reference tree types without creating circular dependencies, while keeping the stateful CRDT logic in the appropriate domain crate.

Layer 2: Specification (Domain Crates and Choreography)

Purpose: Define domain semantics and protocol specifications.

Ownership expectation:

  • default to Pure
  • use MoveOwned only when transfer semantics are part of the domain model
  • avoid ActorOwned runtime-style state
  • expose typed results and typed domain failures rather than implicit failure

Layer 2 Architecture Diff (Invariants)

Layer 2 is the specification layer: pure domain semantics with zero runtime coupling.

Must hold:

  • No handler composition, runtime assembly, or UI dependencies.
  • Domain facts are versioned and encoded via canonical DAG-CBOR.
  • Fact reducers register through FactRegistry. No direct wiring in aura-journal.
  • Authorization scopes use aura-core ResourceScope and typed operations.
  • No in-memory production state. Stateful test handlers live in aura-testkit.

Forbidden:

  • Direct OS access (time, fs, network) outside effect traits.
  • Tokio/async-std usage in domain protocols.
  • State-bearing singletons or process-wide caches.

Domain Crates

| Crate | Domain | Responsibility |
|---|---|---|
| aura-journal | Fact-based journal | CRDT semantics, tree state machine, reduction logic, validation (re-exports tree types from aura-core) |
| aura-authorization | Trust and authorization | Capability refinement, Biscuit token helpers |
| aura-signature | Identity semantics | Signature verification, device lifecycle |
| aura-store | Storage domain | Storage types, capabilities, domain logic |
| aura-transport | Transport semantics | P2P communication abstractions |
| aura-maintenance | Maintenance facts | Snapshot, cache invalidation, OTA activation, admin replacement facts + reducer |

Key characteristics: Implement domain logic without effect handlers or coordination.

Extensible Fact Types (aura-journal)

The journal provides generic fact infrastructure that higher-level crates extend with domain-specific fact types. This follows the Open/Closed Principle: the journal is open for extension but closed for modification.

Protocol-Level vs Domain-Level Facts

The RelationalFact enum in aura-journal/src/fact.rs contains two categories:

Protocol-Level Facts (stay in aura-journal):

These are core protocol constructs with complex reduction logic in reduce_context(). They have interdependencies and specialized state derivation that cannot be delegated to simple domain reducers:

| Fact | Purpose | Why Protocol-Level |
|---|---|---|
| Protocol(GuardianBinding) | Guardian relationship | Core recovery protocol |
| Protocol(RecoveryGrant) | Recovery capability | Core recovery protocol |
| Protocol(Consensus) | Aura Consensus results | Core agreement mechanism |
| Protocol(AmpChannelCheckpoint) | Ratchet window anchoring | Complex epoch state computation |
| Protocol(AmpProposedChannelEpochBump) | Optimistic epoch transitions | Spacing rules, bump selection |
| Protocol(AmpCommittedChannelEpochBump) | Finalized epoch transitions | Epoch chain validation |
| Protocol(AmpChannelPolicy) | Channel policy overrides | Skip window derivation |

Domain-Level Facts (via Generic + FactRegistry):

Application-specific facts use RelationalFact::Generic and are reduced by registered FactReducer implementations.

| Domain Crate | Fact Type | Purpose |
|---|---|---|
| aura-chat | ChatFact | Channels, messages |
| aura-invitation | InvitationFact | Invitation lifecycle |
| aura-relational | ContactFact | Contact management |
| aura-social/moderation | Block*Fact | Block, mute, ban, kick |

Design Pattern:

  1. aura-journal provides:

    • DomainFact trait for fact type identity and serialization
    • FactReducer trait for domain-specific reduction logic
    • FactRegistry for runtime fact type registration
    • RelationalFact::Generic as the extensibility mechanism
  2. Domain crates implement:

    • Their own typed fact enums (e.g., ChatFact, InvitationFact)
    • DomainFact trait with to_generic() for storage
    • FactReducer for reduction to RelationalBinding
  3. aura-agent/src/fact_registry.rs registers all domain reducers:

    pub fn build_fact_registry() -> FactRegistry {
        let mut registry = FactRegistry::new();
        registry.register::<ChatFact>(CHAT_FACT_TYPE_ID, Box::new(ChatFactReducer));
        registry.register::<InvitationFact>(INVITATION_FACT_TYPE_ID, Box::new(InvitationFactReducer));
        registry.register::<ContactFact>(CONTACT_FACT_TYPE_ID, Box::new(ContactFactReducer));
        register_moderation_facts(&mut registry);
        registry
    }

Why This Architecture:

  • Open/Closed Principle: New domain facts don't require modifying aura-journal
  • Domain Isolation: Each crate owns its fact semantics
  • Protocol Integrity: Core protocol facts with complex reduction stay in aura-journal
  • Testability: Domain facts can be tested independently
  • Type Safety: Compile-time guarantees within each domain

Core Fact Types in aura-journal: Only facts fundamental to journal operation remain as direct enum variants:

  • AttestedOp: Commitment tree operations (cryptographic primitives)
  • Snapshot: Journal compaction checkpoints
  • RendezvousReceipt: Cross-authority coordination receipts
  • Protocol-level RelationalFact variants listed above

Fact Implementation Patterns by Layer

Aura uses two distinct fact patterns based on architectural layer to prevent circular dependencies:

Layer 2 Pattern (Domain Crates: aura-maintenance, aura-authorization, aura-signature, aura-store, aura-transport):

These crates use the aura-core::types::facts pattern with no dependency on aura-journal:

use aura_core::types::facts::{FactTypeId, FactError, FactEnvelope, FactDeltaReducer};

pub static MY_FACT_TYPE_ID: FactTypeId = FactTypeId::new("my_domain");
pub const MY_FACT_SCHEMA_VERSION: u16 = 1;

impl MyFact {
    pub fn try_encode(&self) -> Result<Vec<u8>, FactError> {
        aura_core::types::facts::try_encode_fact(
            &MY_FACT_TYPE_ID,
            MY_FACT_SCHEMA_VERSION,
            self,
        )
    }

    pub fn to_envelope(&self) -> Result<FactEnvelope, FactError> {
        // Create envelope manually for Generic wrapping
    }
}

impl FactDeltaReducer<MyFact, MyFactDelta> for MyFactReducer {
    fn apply(&self, fact: &MyFact) -> MyFactDelta { /* ... */ }
}

Layer 4/5 Pattern (Feature Crates: aura-chat, aura-invitation, aura-relational, aura-recovery, aura-social):

These crates use the aura-journal::extensibility::DomainFact pattern and register with FactRegistry:

#![allow(unused)]
fn main() {
use aura_journal::extensibility::{DomainFact, FactReducer};
use aura_macros::DomainFact;

#[derive(DomainFact)]
#[domain_fact(type_id = "my_domain", schema_version = 1, context_fn = "context_id")]
pub enum MyFact { /* ... */ }

impl FactReducer for MyFactReducer {
    fn handles_type(&self) -> &'static str { /* ... */ }
    fn reduce_envelope(...) -> Option<RelationalBinding> { /* ... */ }
}
}

Why Two Patterns?

  • Layer 2 → Layer 2 dependencies create circular risk: aura-journal is itself a Layer 2 crate. If other Layer 2 crates depend on aura-journal for the DomainFact trait, we risk circular dependencies.
  • Layer 4/5 can safely depend on Layer 2: Higher layers depend on lower layers by design, so feature crates can use the DomainFact trait from aura-journal.
  • Registration location differs: Layer 2 facts are wrapped manually in RelationalFact::Generic. Layer 4/5 facts register with FactRegistry in aura-agent/src/fact_registry.rs.

For a quick decision tree on pattern selection, see CLAUDE.md under "Agent Decision Aids".

Choreography Specification

aura-mpst: Aura-facing boundary crate over Telltale's public language, runtime, and type surfaces. Re-exports choreography/runtime surfaces and the Aura extension traits used by generated protocols, VM artifacts, and testing utilities. It also provides canonical choreography capability parsing into validated CapabilityName values.

aura-macros: Compile-time choreography frontend. Parses Aura annotations (guard_capability, flow_cost, journal_facts, leak), generates typed first-party capability-family surfaces, and emits Telltale-backed generated modules plus Aura effect-bridge helpers.

Layer 3: Implementation (aura-effects and aura-composition)

Purpose: Effect implementation and handler composition.

Ownership expectation:

  • aura-effects handlers remain stateless by default
  • aura-composition may assemble and configure systems, but should not grow ad hoc authority ownership or hidden semantic lifecycle
  • capability checks should flow through shared contracts, not handler-local shortcuts

aura-effects: Stateless Effect Handlers

Purpose: Stateless, single-party effect implementations.

aura-effects is the single designated point of interaction with non-deterministic operating system services (entropy, wall-clock time, network I/O, file system). Centralizing impure operations in one crate makes the architectural boundary explicit.

Contains:

  • Production handlers: RealCryptoHandler, TcpTransportHandler, FilesystemStorageHandler, PhysicalTimeHandler
  • OS integration adapters that delegate to system services
  • Pure functions that transform inputs to outputs without state

What doesn't go here:

  • Handler composition or registries
  • Multi-handler coordination
  • Stateful implementations
  • Mock/test handlers

Key characteristics: Each handler should be independently testable and reusable. No handler should know about other handlers. This enables clean dependency injection and modular testing.
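
The stateless rule can be sketched with a toy handler. This is an illustrative simplification, not the real aura-effects API: the trait below is an invented, synchronous stand-in for PhysicalTimeEffects, and the signatures are assumptions made for the example.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Simplified stand-in for a time effect trait (real trait lives in aura-core).
pub trait PhysicalTimeEffects {
    fn now_unix_millis(&self) -> u128;
}

/// Stateless production handler: no fields, no shared state.
/// It only delegates to the OS clock, so it is trivially reusable
/// and needs no knowledge of any other handler.
pub struct PhysicalTimeHandler;

impl PhysicalTimeEffects for PhysicalTimeHandler {
    fn now_unix_millis(&self) -> u128 {
        SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .expect("system clock before Unix epoch")
            .as_millis()
    }
}

fn main() {
    // Dependency injection: callers hold the trait, not the concrete type.
    let t: &dyn PhysicalTimeEffects = &PhysicalTimeHandler;
    assert!(t.now_unix_millis() > 0);
}
```

Because the handler owns no state, tests can swap in a fixed-time implementation of the same trait without touching any other handler.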

Dependencies: aura-core and external libraries.

Note: Mock and test handlers are located in aura-testkit (Layer 8) to maintain clean separation between production and testing concerns.

aura-composition: Handler Composition

Purpose: Assemble individual handlers into cohesive effect systems.

Contains:

  • Effect registry and builder patterns
  • Handler composition utilities
  • Effect system configuration
  • Handler lifecycle management (start/stop/configure)
  • Reactive infrastructure: Dynamic<T> FRP primitives for composing view updates over effect changes

What doesn't go here:

  • Individual handler implementations
  • Multi-party protocol logic
  • Runtime-specific concerns
  • Application lifecycle

Key characteristics: Feature crates need to compose handlers without pulling in full runtime infrastructure. This is about "how do I assemble handlers?" not "how do I coordinate distributed protocols?"

Dependencies: aura-core, aura-effects.
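
A minimal sketch of what "assembling handlers" might look like, assuming invented trait and builder names (the real aura-composition surfaces differ). The typestate builder lets feature crates compose exactly the handlers they need without any runtime infrastructure:

```rust
/// Simplified effect traits (stand-ins for the aura-core definitions).
pub trait TimeEffects { fn now(&self) -> u64; }
pub trait RandomEffects { fn next_u64(&mut self) -> u64; }

pub struct FixedTime(pub u64);
impl TimeEffects for FixedTime { fn now(&self) -> u64 { self.0 } }

pub struct CountingRandom(pub u64);
impl RandomEffects for CountingRandom {
    fn next_u64(&mut self) -> u64 { self.0 += 1; self.0 }
}

/// Hypothetical composed effect system: independent handlers
/// assembled into one value that callers take as a single parameter.
pub struct EffectSystem<T: TimeEffects, R: RandomEffects> {
    pub time: T,
    pub random: R,
}

pub struct EffectSystemBuilder<T, R> { time: T, random: R }

impl EffectSystemBuilder<(), ()> {
    pub fn new() -> Self { EffectSystemBuilder { time: (), random: () } }
}

impl<T, R> EffectSystemBuilder<T, R> {
    pub fn with_time<T2: TimeEffects>(self, time: T2) -> EffectSystemBuilder<T2, R> {
        EffectSystemBuilder { time, random: self.random }
    }
    pub fn with_random<R2: RandomEffects>(self, random: R2) -> EffectSystemBuilder<T, R2> {
        EffectSystemBuilder { time: self.time, random }
    }
}

impl<T: TimeEffects, R: RandomEffects> EffectSystemBuilder<T, R> {
    /// build() only exists once every slot holds a real handler.
    pub fn build(self) -> EffectSystem<T, R> {
        EffectSystem { time: self.time, random: self.random }
    }
}

fn main() {
    let mut sys = EffectSystemBuilder::new()
        .with_time(FixedTime(42))
        .with_random(CountingRandom(0))
        .build();
    assert_eq!(sys.time.now(), 42);
    assert_eq!(sys.random.next_u64(), 1);
}
```

The typestate trick means forgetting a handler is a compile error rather than a runtime panic, which matches the "assemble handlers, not coordinate protocols" scope of this layer.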

Layer 4: Orchestration (aura-protocol and subcrates)

Purpose: Multi-party coordination and distributed protocol orchestration.

Ownership expectation:

  • use MoveOwned for delegation, session ownership, endpoint transfer, and stale-owner invalidation
  • use ActorOwned only for justified long-lived orchestration state
  • coordination authority and semantic publication should be capability-gated
  • async orchestration flows should have typed terminal outcomes

Contains:

  • Guard chain coordination (CapGuard → FlowGuard → JournalCoupler) in aura-guards
  • Multi-party protocol orchestration (consensus in aura-consensus, anti-entropy in aura-anti-entropy)
  • Quorum-driven DKG orchestration and transcript handling in aura-consensus/src/dkg/
  • Cross-handler transport and storage coordination logic
  • Distributed state management
  • Stateful coordinators for multi-party protocols
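
The guard chain (CapGuard → FlowGuard → JournalCoupler) can be sketched as a sequence of checks that either all succeed or deny the send before any effect occurs. All names and signatures below are invented stand-ins; the real guards in aura-guards are effectful and asynchronous:

```rust
#[derive(Debug, PartialEq)]
pub enum GuardError { MissingCapability, BudgetExhausted }

pub struct Outbound {
    pub capability: &'static str,
    pub flow_cost: u32,
}

pub struct GuardChain {
    granted: Vec<&'static str>, // capabilities held by the sender
    flow_budget: u32,           // remaining flow budget
    journal: Vec<String>,       // committed facts
}

impl GuardChain {
    /// CapGuard: verify the capability; FlowGuard: charge the budget;
    /// JournalCoupler: commit the fact. Nothing is sent unless all
    /// three succeed, and budget + journal are updated together.
    pub fn authorize(&mut self, msg: &Outbound) -> Result<(), GuardError> {
        if !self.granted.contains(&msg.capability) {
            return Err(GuardError::MissingCapability);
        }
        if self.flow_budget < msg.flow_cost {
            return Err(GuardError::BudgetExhausted);
        }
        self.flow_budget -= msg.flow_cost;
        self.journal.push(format!("sent under {}", msg.capability));
        Ok(())
    }
}

fn main() {
    let mut chain = GuardChain {
        granted: vec!["chat.send"],
        flow_budget: 10,
        journal: Vec::new(),
    };
    assert_eq!(chain.authorize(&Outbound { capability: "chat.send", flow_cost: 4 }), Ok(()));
    assert_eq!(chain.flow_budget, 6);

    // An ungranted capability is rejected before any budget is charged.
    let denied = chain.authorize(&Outbound { capability: "recovery.start", flow_cost: 1 });
    assert_eq!(denied, Err(GuardError::MissingCapability));
    assert_eq!(chain.flow_budget, 6);
}
```

The ordering matters: the cheap capability check runs first, the budget is only charged once authorization is certain, and the journal commit happens in the same step as the charge.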

What doesn't go here:

  • Effect trait definitions (all traits belong in aura-core)
  • Handler composition infrastructure (belongs in aura-composition)
  • Single-party effect implementations (belongs in aura-effects)
  • Test/mock handlers (belong in aura-testkit)
  • Runtime assembly (belongs in aura-agent)
  • Application-specific business logic (belongs in domain crates)

Key characteristics: This layer coordinates multiple handlers working together across network boundaries. It implements the "choreography conductor" pattern, ensuring distributed protocols execute correctly with proper authorization, flow control, and state consistency. All handlers here manage multi-party coordination, not single-party operations.

Dependencies: aura-core, aura-effects, aura-composition, aura-mpst, domain crates, and Layer 4 subcrates (aura-guards, aura-consensus, aura-amp, aura-anti-entropy). Performance-critical protocol operations may require carefully documented exceptions for direct cryptographic library usage.

Layer 5: Feature/Protocol Implementation

Purpose: Complete end-to-end protocol implementations.

Ownership expectation:

  • feature workflows should have one authoritative semantic owner
  • wrappers, adapters, and compatibility helpers must not publish stronger semantics than canonical workflows
  • long-running feature operations need typed lifecycle and typed terminal failure
  • parity-critical mutation/publication must be capability-gated

Crates:

| Crate | Protocol | Purpose |
|-------|----------|---------|
| aura-authentication | Authentication | Device, threshold, and guardian auth flows |
| aura-chat | Chat | Chat domain facts and view reducers. Local chat prototype. |
| aura-invitation | Invitations | Peer onboarding and relational facts |
| aura-recovery | Guardian recovery | Recovery grants and dispute escalation |
| aura-relational | Cross-authority relationships | RelationalContext protocols (domain types in aura-core) |
| aura-rendezvous | Peer discovery | Context-scoped rendezvous and routing |
| aura-social | Social topology | Block/neighborhood materialized views, relay selection, progressive discovery layers, role/access semantics (Member/Participant, Full/Partial/Limited) |
| aura-sync | Synchronization | Journal sync and anti-entropy protocols |

Key characteristics: Reusable building blocks with no UI or binary entry points.

Notes:

  • Layer 5 crates include ARCHITECTURE.md files describing facts, invariants, and operation categories.
  • OPERATION_CATEGORIES constants in each Layer 5 crate map operations to A/B/C classes.
  • Runtime-owned caches (e.g., invitation/rendezvous descriptors) live in Layer 6 handlers, not in Layer 5 services.
  • Layer 5 facts use versioned binary encoding (bincode) with JSON fallback for debug and compatibility.
  • FactKey helper types are required for reducers/views to keep binding key derivation consistent.
  • Ceremony facts carry optional trace_id values to support cross-protocol traceability.
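
To illustrate why the schema version travels with the encoded bytes, here is a hand-rolled envelope roundtrip. It deliberately avoids bincode and serde so it stays self-contained; the real Layer 5 encoding differs, and the byte layout here is an assumption made purely for illustration:

```rust
/// Minimal fact envelope sketch: type id + schema version + payload.
pub struct Envelope {
    pub type_id: String,
    pub schema_version: u16,
    pub payload: Vec<u8>,
}

impl Envelope {
    /// Length-prefixed layout: [id_len][id][version][payload_len][payload].
    pub fn encode(&self) -> Vec<u8> {
        let mut out = Vec::new();
        let id = self.type_id.as_bytes();
        out.extend_from_slice(&(id.len() as u16).to_be_bytes());
        out.extend_from_slice(id);
        out.extend_from_slice(&self.schema_version.to_be_bytes());
        out.extend_from_slice(&(self.payload.len() as u32).to_be_bytes());
        out.extend_from_slice(&self.payload);
        out
    }

    /// Decoding reads the version before touching the payload, so a
    /// reducer can dispatch on schema_version (or fall back) safely.
    pub fn decode(bytes: &[u8]) -> Option<Envelope> {
        let id_len = u16::from_be_bytes(bytes.get(0..2)?.try_into().ok()?) as usize;
        let type_id = String::from_utf8(bytes.get(2..2 + id_len)?.to_vec()).ok()?;
        let v_at = 2 + id_len;
        let schema_version = u16::from_be_bytes(bytes.get(v_at..v_at + 2)?.try_into().ok()?);
        let p_at = v_at + 2;
        let p_len = u32::from_be_bytes(bytes.get(p_at..p_at + 4)?.try_into().ok()?) as usize;
        let payload = bytes.get(p_at + 4..p_at + 4 + p_len)?.to_vec();
        Some(Envelope { type_id, schema_version, payload })
    }
}

fn main() {
    let env = Envelope {
        type_id: "chat".into(),
        schema_version: 1,
        payload: b"hello".to_vec(),
    };
    let back = Envelope::decode(&env.encode()).unwrap();
    assert_eq!(back.type_id, "chat");
    assert_eq!(back.schema_version, 1);
    assert_eq!(back.payload, b"hello");
}
```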

Dependencies: aura-core, aura-effects, aura-composition, aura-mpst, plus Layer 4 orchestration crates (aura-protocol, aura-guards, aura-consensus, aura-amp, aura-anti-entropy).

Layer 6: Runtime Composition (aura-agent, aura-simulator, and aura-app)

Purpose: Assemble complete running systems for production deployment.

Ownership expectation:

  • this is the primary ActorOwned layer
  • runtime services, supervisors, readiness coordinators, and caches should be actor-owned
  • ownership transfer still uses MoveOwned handoff surfaces rather than direct shared mutable rewrites
  • runtime mutation and publication should require explicit capabilities
  • long-running operations and services must have typed terminal lifecycle

aura-agent: Production runtime for deployment with application lifecycle management, runtime-specific configuration, production deployment concerns, and system integration.

aura-app: Portable headless application core providing the business logic and state management layer for all platforms. Exposes a platform-agnostic API consumed by terminal, iOS, Android, and web frontends. Contains intent processing, view derivation, and platform feature flags (native, ios, android, web-js, web-dominator).

aura-simulator: Deterministic simulation runtime with virtual time, transport shims, failure injection, and generative testing via Quint integration (see aura-simulator/src/quint/ for generative simulation bridge).

Contains:

  • Application lifecycle management (startup, shutdown, signals)
  • Runtime-specific configuration and policies
  • Production deployment concerns
  • System integration and monitoring hooks
  • Reactive event loop: ReactiveScheduler (Tokio task) that orchestrates fact ingestion, journal updates, and view propagation

What doesn't go here:

  • Effect handler implementations
  • Handler composition utilities
  • Protocol coordination logic
  • CLI or UI concerns

Key characteristics: This is about "how do I deploy and run this as a production system?" It's the bridge between composed handlers/protocols and actual running applications.

Dependencies: All domain crates, aura-effects, aura-composition, and Layer 4 orchestration crates (aura-protocol, aura-guards, aura-consensus, aura-amp, aura-anti-entropy).

Layer 7: User Interface (aura-terminal, aura-ui, and aura-web)

Purpose: User-facing applications with main entry points.

Ownership expectation:

  • primarily Observed
  • command ingress mechanics may be ActorOwned at the shell boundary
  • UI must not co-author parity-critical semantic lifecycle or readiness truth
  • render and projection layers consume authoritative state rather than inventing it

aura-terminal: Terminal-based interface combining CLI commands and an interactive TUI (Terminal User Interface). Provides account and device management, recovery status visualization, chat interfaces, and scenario execution. Consumes aura-app for all business logic and state management.

aura-ui: Shared Dioxus UI core. Owns cross-frontend semantic snapshot shaping, deterministic keyboard routing, and shared presentation state, while remaining platform agnostic and downstream of aura-app's authoritative semantic contract.

aura-web: Browser/WASM shell. Mounts aura-ui, integrates browser-specific adapters and harness bridge surfaces, and remains an observed shell rather than a semantic owner for parity-critical workflows.

Key characteristics:

  • Layer 7 includes both shell binaries (aura-terminal, aura-web) and the shared UI core (aura-ui) they consume.
  • aura-terminal and aura-web contain user-facing entry points and platform interop.
  • aura-ui stays platform neutral and expresses shared UI structure over aura-app contracts.

Dependencies:

  • aura-ui: aura-app, aura-core
  • aura-terminal: aura-app, aura-agent, aura-core, aura-recovery, and selected Layer 4/5 crates needed for terminal shell integration
  • aura-web: aura-ui, aura-app, aura-agent, aura-core, aura-effects, and browser-only ecosystem crates for WASM interop

Layer 8: Testing and Development Tools

Purpose: Cross-cutting test utilities, formal verification bridges, and generative testing infrastructure.

Ownership expectation:

  • test infrastructure may simulate ActorOwned and MoveOwned systems
  • parity-critical lanes must still respect production ownership boundaries
  • harness and diagnostics are primarily Observed
  • test-only mutation shortcuts should be narrow, documented, and capability-aware

aura-testkit: Comprehensive testing infrastructure including:

  • Shared test fixtures and scenario builders
  • Property test helpers and deterministic utilities
  • Mock effect handlers: MockCryptoHandler, SimulatedTimeHandler, MemoryStorageHandler, etc.
  • Stateful test handlers that maintain controllable state for deterministic testing

aura-quint: Formal verification bridge to Quint model checker including:

  • Native Quint subprocess interface for parsing and type checking
  • Property specification management with classification (authorization, budget, integrity)
  • Verification runner with caching and counterexample generation
  • Effect trait implementations for property evaluation during simulation

aura-harness: Multi-instance runtime harness for orchestrating test scenarios including:

  • Coordinator and executor for managing multiple Aura instances
  • Scenario definition and replay capabilities
  • Artifact synchronization and determinism validation
  • Screen normalization and VT100 terminal emulation for TUI testing
  • Shared semantic contracts from aura-app for scenario actions and UI snapshots
  • Backend selection for mock, patchbay, and patchbay-vm holepunch lanes
  • Resource guards and capability checking

Shared UX contract ownership is split intentionally:

  • aura-app::ui_contract owns parity-critical ids, action contracts, support declarations, and typed observation shapes
  • Layer 7 frontends consume that contract and export observed semantic projections
  • aura-harness consumes those observed projections and must treat them as side-effect-free authoritative observation surfaces

Deterministic observation policy:

  • parity-critical waits bind to declared readiness, event, or quiescence conditions
  • harness mode may change instrumentation or rendering stability, but not business-flow semantics
  • placeholder IDs, exporter override caches, and heuristic success/event synthesis are not valid parity-critical correctness paths

Key characteristics: Mock handlers in aura-testkit are allowed to be stateful (using Arc<Mutex<>>, etc.) since they need controllable, deterministic state for testing. This maintains the stateless principle for production handlers in aura-effects while enabling comprehensive testing.
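
A sketch of the stateful-mock pattern, assuming a simplified synchronous storage trait (the real StorageEffects and MemoryStorageHandler signatures differ):

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Simplified storage trait (stand-in for StorageEffects in aura-core).
pub trait StorageEffects {
    fn write_chunk(&self, key: &str, data: &[u8]);
    fn read_chunk(&self, key: &str) -> Option<Vec<u8>>;
}

/// Test-only handler: unlike production handlers in aura-effects,
/// mocks may hold state behind Arc<Mutex<...>> so tests can inspect
/// and control it deterministically.
#[derive(Clone, Default)]
pub struct MemoryStorageHandler {
    chunks: Arc<Mutex<HashMap<String, Vec<u8>>>>,
}

impl StorageEffects for MemoryStorageHandler {
    fn write_chunk(&self, key: &str, data: &[u8]) {
        self.chunks.lock().unwrap().insert(key.to_string(), data.to_vec());
    }
    fn read_chunk(&self, key: &str) -> Option<Vec<u8>> {
        self.chunks.lock().unwrap().get(key).cloned()
    }
}

fn main() {
    let storage = MemoryStorageHandler::default();
    let handle = storage.clone(); // clones share the same underlying map
    storage.write_chunk("journal/0", b"fact");
    assert_eq!(handle.read_chunk("journal/0"), Some(b"fact".to_vec()));
    assert_eq!(handle.read_chunk("missing"), None);
}
```

Cloning the handler shares the Arc, so a test can hand one clone to the system under test and keep another to assert on what was written.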

Deterministic seed policy: Test construction of AuraEffectSystem must use seeded helper constructors (simulation_for_test*). Do not call legacy testing* or raw simulation* constructors from test code. For multiple instances from the same callsite, use simulation_for_named_test_with_salt(...) so each seed is unique and replayable.

Dependencies: aura-harness depends only on aura-core. aura-testkit and aura-quint depend on aura-agent, aura-composition, aura-journal, aura-transport, aura-core, aura-protocol, aura-guards, aura-consensus, aura-amp, and aura-anti-entropy.

Workspace Structure

crates/
├── aura-agent           Runtime composition and agent lifecycle
├── aura-app             Portable headless application core (multi-platform)
├── aura-authentication  Authentication protocols
├── aura-anti-entropy    Anti-entropy sync and reconciliation
├── aura-amp             Authenticated messaging protocol (AMP)
├── aura-chat            Chat facts + local prototype service
├── aura-composition     Handler composition and effect system assembly
├── aura-consensus       Consensus protocol implementation
├── aura-core            Foundation types and effect traits
├── aura-effects         Effect handler implementations
├── aura-guards          Guard chain enforcement
├── aura-harness         Multi-instance runtime harness
├── aura-invitation      Invitation choreographies
├── aura-journal         Fact-based journal domain
├── aura-macros          Choreography DSL compiler
├── aura-maintenance     Maintenance facts and reducers
├── aura-mpst            Session types and choreography specs
├── aura-protocol        Orchestration and coordination
├── aura-quint           Quint formal verification
├── aura-recovery        Guardian recovery protocols
├── aura-relational      Cross-authority relationships
├── aura-rendezvous      Peer discovery and routing
├── aura-simulator       Deterministic simulation engine
├── aura-social          Social topology and progressive disclosure
├── aura-store           Storage domain types
├── aura-sync            Synchronization protocols
├── aura-terminal        Terminal UI (CLI + TUI)
├── aura-testkit         Testing utilities and fixtures
├── aura-transport       P2P communication layer
├── aura-ui              Shared Dioxus UI core
├── aura-web             Browser frontend and harness bridge
├── aura-signature       Identity verification
└── aura-authorization   Web-of-trust authorization

Dependency Graph

graph TD
    %% Layer 1: Foundation
    core[aura-core]

    %% Layer 2: Specification
    signature[aura-signature]
    journal[aura-journal]
    authorization[aura-authorization]
    store[aura-store]
    transport[aura-transport]
    mpst[aura-mpst]
    macros[aura-macros]
    maintenance[aura-maintenance]

    %% Layer 3: Implementation
    effects[aura-effects]
    composition[aura-composition]

    %% Layer 4: Orchestration
    guards[aura-guards]
    anti_entropy[aura-anti-entropy]
    consensus[aura-consensus]
    amp[aura-amp]
    protocol[aura-protocol]

    %% Layer 5: Feature
    social[aura-social]
    chat[aura-chat]
    relational[aura-relational]
    auth[aura-authentication]
    rendezvous[aura-rendezvous]
    invitation[aura-invitation]
    recovery[aura-recovery]
    sync[aura-sync]

    %% Layer 6: Runtime
    app[aura-app]
    agent[aura-agent]
    simulator[aura-simulator]

    %% Layer 7: Application
    ui[aura-ui]
    terminal[aura-terminal]
    web[aura-web]

    %% Layer 8: Testing
    testkit[aura-testkit]
    quint[aura-quint]
    harness[aura-harness]

    %% Layer 2 dependencies
    signature --> core
    journal --> core
    authorization --> core
    store --> core
    transport --> core
    mpst --> core
    macros --> core
    maintenance --> core

    %% Layer 3 dependencies
    effects --> core
    composition --> core
    composition --> effects
    composition --> mpst

    %% Layer 4 dependencies
    guards --> core
    guards --> authorization
    guards --> mpst
    guards --> journal
    anti_entropy --> core
    anti_entropy --> guards
    anti_entropy --> journal
    consensus --> core
    consensus --> macros
    consensus --> journal
    consensus --> mpst
    consensus --> guards
    amp --> core
    amp --> effects
    amp --> journal
    amp --> transport
    amp --> consensus
    amp --> guards
    protocol --> core
    protocol --> effects
    protocol --> composition
    protocol --> journal
    protocol --> guards
    protocol --> consensus
    protocol --> amp
    protocol --> anti_entropy
    protocol --> authorization
    protocol --> transport
    protocol --> mpst
    protocol --> store

    %% Layer 5 dependencies
    social --> core
    social --> journal
    chat --> core
    chat --> journal
    chat --> composition
    chat --> guards
    relational --> core
    relational --> journal
    relational --> consensus
    relational --> effects
    auth --> core
    auth --> effects
    auth --> journal
    auth --> protocol
    auth --> guards
    auth --> relational
    auth --> signature
    auth --> authorization
    rendezvous --> core
    rendezvous --> journal
    rendezvous --> guards
    rendezvous --> social
    invitation --> core
    invitation --> effects
    invitation --> guards
    invitation --> authorization
    invitation --> auth
    invitation --> journal
    invitation --> composition
    recovery --> core
    recovery --> journal
    recovery --> composition
    recovery --> signature
    recovery --> auth
    recovery --> authorization
    recovery --> effects
    recovery --> protocol
    recovery --> relational
    sync --> core
    sync --> protocol
    sync --> guards
    sync --> journal
    sync --> authorization
    sync --> maintenance
    sync --> rendezvous
    sync --> effects
    sync --> anti_entropy

    %% Layer 6 dependencies
    app --> core
    app --> effects
    app --> journal
    app --> relational
    app --> chat
    app --> social
    app --> maintenance
    app --> protocol
    app --> recovery
    agent --> core
    agent --> app
    agent --> effects
    agent --> composition
    agent --> protocol
    agent --> guards
    agent --> consensus
    agent --> journal
    agent --> relational
    agent --> chat
    agent --> auth
    agent --> invitation
    agent --> rendezvous
    agent --> social
    agent --> sync
    agent --> maintenance
    agent --> transport
    agent --> recovery
    agent --> authorization
    agent --> signature
    agent --> store
    simulator --> core
    simulator --> agent
    simulator --> effects
    simulator --> journal
    simulator --> amp
    simulator --> consensus
    simulator --> protocol
    simulator --> testkit
    simulator --> sync
    simulator --> quint
    simulator --> guards

    %% Layer 7 dependencies
    ui --> app
    ui --> core
    terminal --> app
    terminal --> core
    terminal --> agent
    terminal --> protocol
    terminal --> recovery
    terminal --> invitation
    terminal --> auth
    terminal --> sync
    terminal --> effects
    terminal --> authorization
    terminal --> maintenance
    terminal --> chat
    terminal --> journal
    terminal --> relational
    web --> ui
    web --> app
    web --> core
    web --> agent
    web --> effects

    %% Layer 8 dependencies
    testkit --> core
    testkit --> effects
    testkit --> mpst
    testkit --> journal
    testkit --> relational
    testkit --> social
    testkit --> transport
    testkit --> authorization
    testkit --> consensus
    testkit --> anti_entropy
    testkit --> amp
    testkit --> protocol
    testkit --> app
    quint --> core
    quint --> effects
    harness --> core

    %% Styling
    classDef foundation fill:#e1f5fe
    classDef spec fill:#f3e5f5
    classDef impl fill:#e8f5e9
    classDef orch fill:#fff3e0
    classDef feature fill:#fce4ec
    classDef runtime fill:#f1f8e9
    classDef application fill:#e0f2f1
    classDef test fill:#ede7f6

    class core foundation
    class signature,journal,authorization,store,transport,mpst,macros,maintenance spec
    class effects,composition impl
    class guards,anti_entropy,consensus,amp,protocol orch
    class social,chat,relational,auth,rendezvous,invitation,recovery,sync feature
    class app,agent,simulator runtime
    class ui,terminal,web application
    class testkit,quint,harness test

Effect Trait Classification

Not all effect traits are created equal. Aura organizes effect traits into three categories that determine where their implementations should live:

All effect trait definitions belong in aura-core (Layer 1) to maintain a single source of truth for interfaces. This includes infrastructure effects (OS integration), application effects (domain-specific), and protocol coordination effects (multi-party orchestration).

Infrastructure Effects (Implemented in aura-effects)

Infrastructure effects are truly foundational capabilities that every Aura system needs. These traits define OS-level operations that are universal across all Aura use cases.

Characteristics:

  • OS integration (file system, network, cryptographic primitives)
  • No Aura-specific semantics
  • Reusable across any distributed system
  • Required for basic system operation

Examples:

  • CryptoEffects: Ed25519 signing, key generation, hashing
  • NetworkEffects: TCP connections, message sending/receiving
  • StorageEffects: File read/write, directory operations
  • PhysicalTimeEffects, LogicalClockEffects, OrderClockEffects: Unified time system
  • RandomEffects: Cryptographically secure random generation
  • ConfigurationEffects: Configuration file parsing
  • ConsoleEffects: Terminal input/output
  • LeakageEffects: Cross-cutting metadata leakage tracking (composable infrastructure)
  • ReactiveEffects: Type-safe signal-based state management for UI and inter-component communication

Implementation Location: These traits have stateless handlers in aura-effects that delegate to OS services.

Application Effects (Implemented in Domain Crates)

Application effects encode Aura-specific abstractions and business logic. These traits capture domain concepts that are meaningful only within Aura's architecture.

Characteristics:

  • Aura-specific semantics and domain knowledge
  • Built on top of infrastructure effects
  • Implement business logic and domain rules
  • May have multiple implementations for different contexts

Examples:

  • JournalEffects: Fact-based journal operations, specific to Aura's CRDT design (aura-journal)
  • AuthorityEffects: Authority-specific operations, central to Aura's identity model
  • FlowBudgetEffects: Privacy budget management, unique to Aura's information flow control (aura-authorization)
  • AuthorizationEffects: Biscuit token evaluation, tied to Aura's capability system (aura-authorization)
  • RelationalEffects: Cross-authority relationship management
  • GuardianEffects: Recovery protocol operations

Protocol Coordination Effects (new category):

  • ChoreographicEffects: Multi-party protocol coordination
  • EffectApiEffects: Event sourcing and audit for protocols
  • SyncEffects: Anti-entropy synchronization operations

Application effects are implemented in their respective domain crates (aura-journal, aura-authorization, etc.). Protocol coordination effects are implemented in Layer 4 orchestration crates (aura-protocol, aura-guards, aura-consensus, aura-amp, aura-anti-entropy) as they manage multi-party state.

Moving these to aura-effects would create circular dependencies. Domain crates need to implement these effects using their own domain logic, but aura-effects cannot depend on domain crates due to the layered architecture.

Domain crates implement application effects by creating domain-specific handler structs that compose infrastructure effects for OS operations while encoding Aura-specific business logic.

#![allow(unused)]
fn main() {
// Example: aura-journal implements JournalEffects
pub struct JournalHandler<C: CryptoEffects, S: StorageEffects> {
    crypto: C,
    storage: S,
    // Domain-specific state
}

impl<C: CryptoEffects, S: StorageEffects> JournalEffects for JournalHandler<C, S> {
    async fn append_fact(&self, fact: Fact) -> Result<(), AuraError> {
        // 1. Domain validation using Aura-specific rules
        self.validate_fact_semantics(&fact)?;
        
        // 2. Cryptographic operations via infrastructure effects
        let signature = self.crypto.sign(&fact.hash()).await?;
        
        // 3. Storage operations via infrastructure effects  
        let entry = JournalEntry { fact, signature };
        self.storage.write_chunk(&entry.id(), &entry.encode()).await?;

        // 4. Domain-specific post-processing (`fact` was moved into `entry`)
        self.update_fact_indices(&entry.fact).await?;
        Ok(())
    }
}
}

Common Effect Placement Mistakes

Here are examples of incorrect effect placement and how to fix them:

#![allow(unused)]
fn main() {
// WRONG: Domain handler using OS operations directly
// File: aura-journal/src/effects.rs
impl JournalEffects for BadJournalHandler {
    async fn read_facts(&self, namespace: Namespace) -> Result<Vec<Fact>, AuraError> {
        // VIOLATION: Direct file system access in domain handler
        let data = std::fs::read("journal.dat")?;
        Ok(serde_json::from_slice(&data)?)
    }
}

// CORRECT: Inject StorageEffects for OS operations
impl<S: StorageEffects> JournalEffects for GoodJournalHandler<S> {
    async fn read_facts(&self, namespace: Namespace) -> Result<Vec<Fact>, AuraError> {
        // Use injected storage effects
        let data = self.storage.read_chunk(&namespace.to_path()).await?;
        self.deserialize_facts(data)
    }
}
}
#![allow(unused)]
fn main() {
// WRONG: Application effect implementation in aura-effects
// File: aura-effects/src/journal_handler.rs
pub struct JournalHandler { }

impl JournalEffects for JournalHandler {
    // VIOLATION: Domain logic in infrastructure crate
    async fn validate_fact(&self, fact: &Fact) -> bool {
        match fact {
            Fact::TreeOp(op) => self.validate_tree_semantics(op),
            Fact::Commit(c) => self.validate_commit_rules(c),
            _ => false,
        }
    }
}

// CORRECT: Application effects belong in domain crates
// File: aura-journal/src/effects.rs
impl<C, S> JournalEffects for JournalHandler<C, S> {
    // Domain validation logic belongs here
}
}
#![allow(unused)]
fn main() {
// WRONG: Infrastructure effect in domain crate
// File: aura-journal/src/network_handler.rs
pub struct CustomNetworkHandler { }

impl NetworkEffects for CustomNetworkHandler {
    // VIOLATION: OS-level networking in domain crate
    async fn connect(&self, addr: &str) -> Result<TcpStream, AuraError> {
        Ok(TcpStream::connect(addr).await?)
    }
}

// CORRECT: Use existing NetworkEffects from aura-effects
impl<N: NetworkEffects> MyDomainHandler<N> {
    async fn send_fact(&self, fact: Fact) -> Result<(), AuraError> {
        // Compose with injected network effects
        self.network.send(fact.encode()).await
    }
}
}

Key principles for domain effect implementations:

  • Domain logic first: Encode business rules and validation specific to the domain
  • Infrastructure composition: Use infrastructure effects for OS operations, never direct syscalls
  • Clean separation: Domain handlers should not contain OS integration code
  • Testability: Mock infrastructure effects for unit testing domain logic

Fallback Handlers and the Null Object Pattern

Infrastructure effects sometimes require fallback implementations for platforms or environments where the underlying capability is unavailable (e.g., biometric hardware on servers, secure enclaves in CI, HSMs in development).

When fallback handlers are appropriate:

  • The effect trait represents optional hardware/OS capabilities
  • Code must run on platforms without the capability
  • Graceful degradation is preferable to compile-time feature flags everywhere

Naming conventions:

  • Good: FallbackBiometricHandler, NoOpSecureEnclaveHandler, UnsupportedHsmHandler
  • Avoid: RealBiometricHandler (misleading: it implies a real implementation exists)

Fallback handler behavior:

  • Return false for capability checks (is_available(), supports_feature())
  • Return descriptive errors for operations (Err(NotSupported))
  • Never panic or silently succeed when the capability is unavailable

For a checklist on removing stub handlers, see CLAUDE.md under "Agent Decision Aids".

A fallback handler is not dead code if its trait is actively used. It represents the Null Object Pattern providing safe defaults. The architectural violation is a misleading name, not the existence of the fallback.
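
A minimal sketch of such a fallback handler, with invented trait signatures (the real biometric effect trait differs):

```rust
#[derive(Debug, PartialEq)]
pub enum EffectError { NotSupported(&'static str) }

/// Simplified biometric trait (signatures invented for illustration).
pub trait BiometricEffects {
    fn is_available(&self) -> bool;
    fn authenticate(&self, prompt: &str) -> Result<bool, EffectError>;
}

/// Null Object Pattern: the name says exactly what it is, capability
/// checks report false, and operations fail loudly instead of
/// panicking or silently succeeding.
pub struct FallbackBiometricHandler;

impl BiometricEffects for FallbackBiometricHandler {
    fn is_available(&self) -> bool {
        false
    }
    fn authenticate(&self, _prompt: &str) -> Result<bool, EffectError> {
        Err(EffectError::NotSupported("biometric hardware unavailable"))
    }
}

fn main() {
    let bio = FallbackBiometricHandler;
    assert!(!bio.is_available());
    // Callers that check is_available() first never hit this error;
    // callers that don't still get a descriptive failure, not a panic.
    assert!(bio.authenticate("unlock").is_err());
}
```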

Composite Effects (Convenience Extensions)

Composite effects provide convenience methods that combine multiple lower-level operations. These are typically extension traits that add domain-specific convenience to infrastructure effects.

Characteristics:

  • Convenience wrappers around other effects
  • Domain-specific combinations of operations
  • Often implemented as blanket implementations
  • Improve developer ergonomics

Examples:

  • TreeEffects: Combines CryptoEffects and StorageEffects for merkle tree operations
  • SimulationEffects: Testing-specific combinations for deterministic simulation
  • LeakageChoreographyExt: Combines leakage tracking with choreography operations

Implementation Location: Usually implemented as extension traits in aura-core or as blanket implementations in domain crates.
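The extension-trait-plus-blanket-implementation shape can be sketched as below. `CryptoEffects` and `StorageEffects` are simplified synchronous stand-ins for the real async traits, and the in-memory handler exists only to demonstrate the blanket impl:

```rust
use std::cell::RefCell;
use std::collections::HashMap;

pub trait CryptoEffects {
    fn hash(&self, data: &[u8]) -> [u8; 32];
}

pub trait StorageEffects {
    fn write_chunk(&self, id: [u8; 32], data: &[u8]);
}

/// Composite convenience extension: content-addressed writes, available on
/// any type that already provides both crypto and storage.
pub trait TreeEffectsExt: CryptoEffects + StorageEffects {
    fn store_content_addressed(&self, data: &[u8]) -> [u8; 32] {
        let id = self.hash(data); // content address = hash of the bytes
        self.write_chunk(id, data);
        id
    }
}

// Blanket implementation: every qualifying handler gets the convenience
// method with no per-handler boilerplate.
impl<T: CryptoEffects + StorageEffects> TreeEffectsExt for T {}

/// In-memory handler used to demonstrate the blanket impl.
pub struct MemoryEffects {
    pub chunks: RefCell<HashMap<[u8; 32], Vec<u8>>>,
}

impl CryptoEffects for MemoryEffects {
    fn hash(&self, data: &[u8]) -> [u8; 32] {
        // Toy fold, not a real hash; stands in for e.g. BLAKE3.
        let mut out = [0u8; 32];
        for (i, b) in data.iter().enumerate() {
            out[i % 32] ^= b.wrapping_add(i as u8);
        }
        out
    }
}

impl StorageEffects for MemoryEffects {
    fn write_chunk(&self, id: [u8; 32], data: &[u8]) {
        self.chunks.borrow_mut().insert(id, data.to_vec());
    }
}
```

The blanket impl is what keeps composite effects free of architectural violations: no new handler type is introduced, so no new dependency edge appears in the layer graph.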

Effect Classification

For quick decision aids (decision matrix, decision tree), see CLAUDE.md under "Agent Decision Aids".

Examples:

  • CryptoEffects → Infrastructure (OS crypto, no Aura semantics, reusable)
  • JournalEffects → Application (Aura facts, domain validation, not reusable)
  • NetworkEffects → Infrastructure (TCP/UDP, no domain logic, reusable)
  • FlowBudgetEffects → Application (Aura privacy model, domain rules)

This classification ensures that:

  • Infrastructure effects have reliable, stateless implementations available in aura-effects
  • Application effects can evolve with their domain logic in domain crates
  • Composite effects provide ergonomic interfaces without architectural violations
  • The dependency graph remains acyclic
  • Domain knowledge stays in domain crates, OS knowledge stays in infrastructure
  • Clean composition enables testing domain logic independently of OS integration

Architecture Principles

No Circular Dependencies

Each layer builds on lower layers without reaching back down. This enables independent testing, reusability, and clear responsibility boundaries.

The layered architecture means that Layer 1 has no dependencies on any other Aura crate. Layer 2 depends only on Layer 1. Layer 3 depends on Layers 1 and 2. This pattern continues through all 8 layers.

Code Location Policy

The 8-layer architecture enforces strict placement rules. Violating these rules creates circular dependencies or breaks architectural invariants.

For a quick reference table of layer rules, see CLAUDE.md under "Agent Decision Aids".

For practical guidance on effects and handlers, see Effects Guide. For choreography development, see Choreography Guide.

Pure Mathematical Utilities

Some effect traits in aura-core (e.g., BloomEffects) represent pure mathematical operations without OS integration. These follow the standard trait/handler pattern for consistency, but are technically not "effects" in the algebraic sense (no side effects).

This is acceptable technical debt: pattern consistency outweighs the semantic impurity. A future refactoring may move pure math to methods on types in aura-core.
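A sketch of what such a "pure effect" looks like. The trait/handler shape matches the rest of the effect system, but the implementation is deterministic math with no OS access (the trait name echoes BloomEffects; the filter itself is a toy, not production quality):

```rust
// Trait/handler pattern kept for consistency, even though nothing here
// is an algebraic effect: same inputs always produce the same outputs.
pub trait BloomEffects {
    fn insert(&self, bits: &mut [u64; 4], item: &[u8]);
    fn contains(&self, bits: &[u64; 4], item: &[u8]) -> bool;
}

pub struct BloomHandler;

impl BloomHandler {
    /// Toy double hashing over a 256-bit filter (illustrative only).
    fn indices(item: &[u8]) -> [usize; 2] {
        let h1 = item
            .iter()
            .fold(17usize, |a, &b| a.wrapping_mul(31).wrapping_add(b as usize));
        let h2 = item
            .iter()
            .fold(23usize, |a, &b| a.wrapping_mul(37).wrapping_add(b as usize));
        [h1 % 256, h2 % 256]
    }
}

impl BloomEffects for BloomHandler {
    fn insert(&self, bits: &mut [u64; 4], item: &[u8]) {
        for i in Self::indices(item) {
            bits[i / 64] |= 1 << (i % 64);
        }
    }

    fn contains(&self, bits: &[u64; 4], item: &[u8]) -> bool {
        Self::indices(item)
            .iter()
            .all(|&i| bits[i / 64] & (1 << (i % 64)) != 0)
    }
}
```

Because the handler is stateless and deterministic, it needs no mocking in tests, which is exactly why the semantic impurity of calling it an "effect" is tolerable.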

Architectural Compliance Checking

The project includes an automated architectural compliance checker to enforce these layering principles:

Command: just check-arch
Implementation: toolkit/xtask orchestration, plus thin checks retained in the repository

What it validates:

  • Layer boundary violations (no upward dependencies)
  • Dependency direction (Lx→Ly where y≤x only)
  • Effect trait classification and placement
  • Domain effect implementation patterns
  • Stateless handler requirements in aura-effects (no Arc<Mutex>, Arc<RwLock>)
  • Mock handler location in aura-testkit
  • Guard chain integrity (no bypass of CapGuard → FlowGuard → JournalCoupler)
  • Impure function routing through effects (SystemTime::now, thread_rng, etc.)
  • Physical time guardrails (tokio::time::sleep confinement)
  • Handler composition patterns (no direct instantiation)
  • Placeholder/TODO detection
  • Invariants documentation schema validation

The checker reports violations that must be fixed and warnings for review. Run it before submitting changes to ensure architectural compliance.

Feature Flags

Aura uses a minimal set of deliberate feature flags organized into three tiers.

Tier 1: Workspace-Wide Features

| Feature | Crate | Purpose |
|---|---|---|
| simulation | aura-core, aura-effects | Enables simulation/testing effect traits and handlers. Required by aura-simulator, aura-quint, aura-testkit. |
| proptest | aura-core | Property-based testing support via the proptest crate. |

Tier 2: Platform Features (aura-app)

| Feature | Purpose |
|---|---|
| native | Rust consumers (aura-terminal, tests). Enables futures-signals API. |
| ios | iOS via UniFFI → Swift bindings. |
| android | Android via UniFFI → Kotlin bindings. |
| wasm | Web via wasm-bindgen → JavaScript/TypeScript. |
| web-dominator | Pure Rust WASM apps using futures-signals + dominator. |

Development features: instrumented (tracing), debug-serialize (JSON debug output), host (binding stub).

Tier 3: Crate-Specific Features

| Crate | Feature | Purpose |
|---|---|---|
| aura-terminal | terminal | TUI mode (default). |
| aura-terminal | development | Includes simulator, testkit, debug features. |
| aura-testkit | full-effect-system | Optional aura-agent integration for higher-layer tests. |
| aura-testkit | lean | Lean oracle for differential testing against formal models. |
| aura-agent | dev-console | Optional development console server. |
| aura-agent | real-android-keystore | Real Android Keystore implementation. |
| aura-mpst | debug | Choreography debugging output. |
| aura-macros | proc-macro | Proc-macro compilation (default). |

Feature Usage Examples

# Standard development (default features)
cargo build

# With simulation support
cargo build -p aura-core --features simulation

# Terminal with development tools
cargo build -p aura-terminal --features development

# Lean differential testing (requires Lean toolchain)
just lean-oracle-build
cargo test -p aura-testkit --features lean --test lean_differential

# iOS build
cargo build -p aura-app --features ios

Effect System and Impure Function Guidelines

Core Principle: Deterministic Simulation

Aura's effect system ensures fully deterministic simulation by requiring all impure operations (time, randomness, filesystem, network) to flow through effect traits. This enables:

  • Predictable testing: Mock all external dependencies for unit tests
  • WASM compatibility: No blocking operations or OS thread assumptions
  • Cross-platform support: Same code runs in browsers and native environments
  • Simulation fidelity: Virtual time and controlled randomness for property testing
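A deterministic time handler illustrates the idea. The names below are simplified stand-ins (Aura's actual time traits are async); the point is that tests control the clock explicitly instead of reading wall-clock time:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical simplified time effect trait.
pub trait TimeEffects {
    fn now_ms(&self) -> u64;
}

/// Controllable clock: time moves only when the test advances it.
pub struct SimulatedTime {
    now: AtomicU64,
}

impl SimulatedTime {
    pub fn new(start_ms: u64) -> Self {
        Self { now: AtomicU64::new(start_ms) }
    }

    pub fn advance(&self, delta_ms: u64) {
        self.now.fetch_add(delta_ms, Ordering::SeqCst);
    }
}

impl TimeEffects for SimulatedTime {
    fn now_ms(&self) -> u64 {
        self.now.load(Ordering::SeqCst)
    }
}
```

Any code written against `TimeEffects` now produces identical traces on every run, which is the property the simulator and property tests rely on.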

Impure Function Classification

FORBIDDEN: Direct impure function usage

// VIOLATION: Direct system calls
let now = SystemTime::now();
let random = thread_rng().gen::<u64>();
let file = File::open("data.txt")?;
let socket = TcpStream::connect("127.0.0.1:8080").await?;

// VIOLATION: Global state
static CACHE: Mutex<HashMap<String, String>> = Mutex::new(HashMap::new());

REQUIRED: Effect trait usage

// CORRECT: Via effect traits with explicit context
async fn my_operation<T: TimeEffects + RandomEffects + StorageEffects>(
    ctx: &EffectContext,
    effects: &T,
) -> Result<ProcessedData> {
    let timestamp = effects.current_time().await;
    let nonce = effects.random_bytes(32).await?;
    let data = effects.read_chunk(&chunk_id).await?;

    // ... business logic with pure functions
    Ok(ProcessedData { timestamp, nonce, data })
}

Legitimate Effect Injection Sites

The architectural compliance checker allows direct impure function usage only in these specific locations:

1. Effect Handler Implementations (aura-effects)

// ALLOWED: Production effect implementations
impl PhysicalTimeEffects for PhysicalTimeHandler {
    async fn physical_time(&self) -> Result<PhysicalTime, TimeError> {
        // OK: This IS the effect implementation
        let now = SystemTime::now().duration_since(UNIX_EPOCH)?;
        Ok(PhysicalTime::from_ms(now.as_millis() as u64))
    }
}

impl RandomCoreEffects for RealRandomHandler {
    async fn random_bytes(&self, len: usize) -> Vec<u8> {
        let mut bytes = vec![0u8; len];
        rand::thread_rng().fill_bytes(&mut bytes); // OK: Legitimate OS randomness source
        bytes
    }
}

2. Runtime Effect Assembly (runtime/effects.rs)

// ALLOWED: Effect system bootstrapping
pub fn create_production_effects() -> AuraEffectSystem {
    AuraEffectSystemBuilder::new()
        .with_handler(Arc::new(PhysicalTimeHandler::new()))
        .with_handler(Arc::new(RealRandomHandler::new())) // OK: Assembly point
        .build()
}

3. Pure Functions (aura-core::hash)

// ALLOWED: Deterministic, pure operations
pub fn hash(data: &[u8]) -> [u8; 32] {
    blake3::hash(data).into() // OK: Pure function, no external state
}

Exemption Rationale

Why these exemptions are architecturally sound:

  1. Effect implementations MUST access the actual system - that's their purpose
  2. Runtime assembly is the controlled injection point where production vs. mock effects are chosen
  3. Pure functions are deterministic regardless of when/where they're called

Why broad exemptions are dangerous:

  • Crate-level exemptions (aura-agent, aura-protocol, aura-guards, aura-consensus, aura-amp, aura-anti-entropy) would allow business logic to bypass effects
  • This breaks simulation determinism and WASM compatibility
  • Makes testing unreliable by introducing hidden external dependencies

Effect System Usage Patterns

Correct: Infrastructure Effects in aura-effects

// File: crates/aura-effects/src/transport/tcp.rs
pub struct TcpTransportHandler {
    config: TransportConfig,
}

impl TcpTransportHandler {
    pub async fn connect(&self, addr: TransportSocketAddr) -> TransportResult<TransportConnection> {
        let stream = TcpStream::connect(*addr.as_socket_addr()).await?; // OK: Implementation
        // ... connection setup
        Ok(connection)
    }
}

Correct: Domain Effects in Domain Crates

// File: crates/aura-journal/src/effects.rs
pub struct JournalHandler<C: CryptoEffects, S: StorageEffects> {
    crypto: C,
    storage: S,
}

impl<C: CryptoEffects, S: StorageEffects> JournalEffects for JournalHandler<C, S> {
    async fn append_fact(&self, ctx: &EffectContext, fact: Fact) -> Result<()> {
        // Domain validation (pure)
        self.validate_fact_semantics(&fact)?;

        // Infrastructure effects for impure operations
        let signature = self.crypto.sign(&fact.hash()).await?;
        let entry = JournalEntry::new(fact, signature); // bind the fact to its signature
        self.storage.write_chunk(&entry.id(), &entry.encode()).await?;

        Ok(())
    }
}

Violation: Direct impure access in domain logic

// File: crates/aura-core/src/crypto/tree_signing.rs
pub async fn start_frost_ceremony() -> Result<()> {
    let start_time = SystemTime::now(); // VIOLATION: Should use TimeEffects
    let session_id = Uuid::new_v4();    // VIOLATION: Should use RandomEffects

    // This breaks deterministic simulation!
    ceremony_with_timing(start_time, session_id).await
}

Context Propagation Requirements

All async operations must propagate EffectContext:

// CORRECT: Explicit context propagation
async fn process_request<T: AllEffects>(
    ctx: &EffectContext, // Required for tracing/correlation
    effects: &T,
    request: Request,
) -> Result<Response> {
    let start = effects.current_time().await;

    // Context flows through the call chain
    let result = process_business_logic(ctx, effects, request.data).await?;

    let duration = effects.current_time().await.duration_since(start)?;
    tracing::info!(
        request_id = %ctx.request_id,
        duration_ms = duration.as_millis(),
        "Request processed"
    );

    Ok(result)
}

Mock Testing Pattern

Tests use controllable mock effects:

// File: tests/integration/frost_test.rs
#[tokio::test]
async fn test_frost_ceremony_timing() {
    // Controllable time for deterministic tests
    let mock_time = SimulatedTimeHandler::new();
    mock_time.set_time(PhysicalTime::from_ms(1_000_000));

    let effects = TestEffectSystem::new()
        .with_time(mock_time)
        .with_random(MockRandomHandler::deterministic())
        .build();

    let ctx = EffectContext::test();

    // Test runs deterministically regardless of wall-clock time
    let result = start_frost_ceremony(&ctx, &effects).await;
    assert!(result.is_ok());
}

WASM Compatibility Guidelines

Forbidden in all crates (except effect implementations):

  • std::thread - No OS threads in WASM
  • std::fs - No filesystem in browsers
  • SystemTime::now() - Time must be injected
  • rand::thread_rng() - Randomness must be controllable
  • Blocking operations - Everything must be async

Required patterns:

  • Async/await for all I/O operations
  • Effect trait injection for all impure operations
  • Explicit context propagation through call chains
  • Builder patterns for initialization with async setup

Compliance Checking

The just check-arch command validates these principles by:

  1. Scanning for direct impure usage: Detects SystemTime::now, thread_rng(), std::fs::, etc.
  2. Enforcing precise exemptions: Only allows usage in impl.*Effects, runtime/effects.rs
  3. Context propagation validation: Warns about async functions without EffectContext
  4. Global state detection: Catches lazy_static, Mutex<static> anti-patterns

Run before every commit to maintain architectural compliance and simulation determinism.

Serialization Policy

Aura uses DAG-CBOR as the canonical serialization format for:

  • Wire protocols: Network messages between peers
  • Facts: CRDT state, journal entries, attestations
  • Cryptographic commitments: Content-addressable hashes (determinism required)

Canonical Module

All serialization should use aura_core::util::serialization:

use aura_core::util::serialization::{to_vec, from_slice, hash_canonical};
use aura_core::util::serialization::{VersionedMessage, SemanticVersion};

// Serialize to DAG-CBOR
let bytes = to_vec(&value)?;

// Deserialize from DAG-CBOR
let value: MyType = from_slice(&bytes)?;

// Content-addressable hash (deterministic)
let hash = hash_canonical(&value)?;

Why DAG-CBOR?

  1. Deterministic canonical encoding: Required for FROST threshold signatures where all parties must produce identical bytes
  2. Content-addressable: IPLD compatibility for content hashing and Merkle trees
  3. Forward/backward compatible: Semantic versioning support via VersionedMessage<T>
  4. Efficient binary encoding: Better than JSON, comparable to bincode
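The determinism requirement can be illustrated without a CBOR library. Canonical encodings fix a single byte representation per value, typically by sorting map keys; a toy sketch of that property (not DAG-CBOR itself, and not a real wire format):

```rust
use std::collections::BTreeMap;

/// Toy canonical encoder: because BTreeMap iterates in sorted key order,
/// logically equal maps always serialize to identical bytes, regardless
/// of insertion order. DAG-CBOR guarantees the same property for real data.
fn encode_canonical(map: &BTreeMap<String, u64>) -> Vec<u8> {
    let mut out = Vec::new();
    for (key, value) in map {
        out.extend_from_slice(key.as_bytes());
        out.push(0); // key/value separator for this toy format
        out.extend_from_slice(&value.to_be_bytes());
    }
    out
}
```

This byte-level stability is exactly what FROST needs: every signer hashes the same bytes, so a non-canonical format (e.g., JSON with unordered keys) would break threshold signing.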

Allowed Alternatives

| Format | Use Case | Example |
|---|---|---|
| serde_json | User-facing config files | .aura/config.json |
| serde_json | Debug output and logging | tracing spans |
| serde_json | Dynamic metadata | HashMap<String, Value> |

Versioned Facts Pattern

All fact types should use the versioned serialization pattern:

use aura_core::util::serialization::{to_vec, from_slice, SerializationError};

const CURRENT_VERSION: u32 = 1;

impl MyFact {
    pub fn to_bytes(&self) -> Result<Vec<u8>, SerializationError> {
        to_vec(self)
    }

    pub fn from_bytes(bytes: &[u8]) -> Result<Self, SerializationError> {
        // Try DAG-CBOR first, fall back to JSON for compatibility
        from_slice(bytes).or_else(|_| {
            serde_json::from_slice(bytes)
                .map_err(|e| SerializationError::Deserialization(e.to_string()))
        })
    }
}

Enforcement

The just check-arch --serialization command validates:

  • Wire protocol files use canonical serialization
  • Facts files use versioned serialization

Invariant Traceability

This section indexes invariants across Aura and maps them to enforcement loci. Invariant specifications live in crate ARCHITECTURE.md files. Contracts in Theoretical Model, Privacy and Information Flow Contract, and Distributed Systems Contract define the cross-crate safety model.

Canonical Naming

Use InvariantXxx names in proofs and tests. Use prose aliases for readability when needed. When both forms appear, introduce the alias once and then reference the canonical name.

Examples:

  • Charge-Before-Send maps to InvariantSentMessagesHaveFacts and InvariantFlowBudgetNonNegative.
  • Context Isolation maps to InvariantContextIsolation.
  • Secure Channel Lifecycle maps to InvariantReceiptValidityWindow and InvariantCrossEpochReplayPrevention.

Use shared terminology from Theoretical Model:

  • Role terms: Member, Participant, Moderator
  • Access terms: AccessLevel with Full, Partial, Limited
  • Storage/pinning terms: Shared Storage, allocation, and pinned facts

Core Invariant Index

| Alias | Canonical Name(s) | Primary Enforcement | Related Contracts |
|---|---|---|---|
| Charge-Before-Send | InvariantSentMessagesHaveFacts, InvariantFlowBudgetNonNegative | crates/aura-guards/ARCHITECTURE.md | Privacy and Information Flow Contract, Distributed Systems Contract |
| CRDT Convergence | InvariantCRDTConvergence | crates/aura-journal/ARCHITECTURE.md | Theoretical Model, Distributed Systems Contract |
| Context Isolation | InvariantContextIsolation | crates/aura-core/ARCHITECTURE.md | Theoretical Model, Privacy and Information Flow Contract, Distributed Systems Contract |
| Secure Channel Lifecycle | InvariantSecureChannelLifecycle, InvariantReceiptValidityWindow, InvariantCrossEpochReplayPrevention | crates/aura-rendezvous/ARCHITECTURE.md | Privacy and Information Flow Contract, Distributed Systems Contract |
| Authority Tree Topology and Commitment Coherence | InvariantAuthorityTreeTopologyCommitmentCoherence | crates/aura-journal/ARCHITECTURE.md | Theoretical Model, Distributed Systems Contract |

Distributed Contract Invariants

The distributed and privacy contracts define additional canonical names used by proofs and conformance tests:

  • InvariantUniqueCommitPerInstance
  • InvariantCommitRequiresThreshold
  • InvariantEquivocatorsExcluded
  • InvariantNonceUnique
  • InvariantSequenceMonotonic
  • InvariantReceiptValidityWindow
  • InvariantCrossEpochReplayPrevention
  • InvariantVectorClockConsistent
  • InvariantHonestMajorityCanCommit
  • InvariantCompromisedNoncesExcluded

When a crate enforces one of these invariants, record the same canonical name in that crate's ARCHITECTURE.md.

Traceability Matrix

This matrix provides a single cross-reference for contract names, owning crate docs, and proof/test artifacts.

| Canonical Name | Crate Architecture Spec | Proof/Test Artifact |
|---|---|---|
| InvariantSentMessagesHaveFacts | crates/aura-guards/ARCHITECTURE.md | verification/quint/transport.qnt |
| InvariantFlowBudgetNonNegative | crates/aura-guards/ARCHITECTURE.md | verification/quint/transport.qnt |
| InvariantContextIsolation | crates/aura-core/ARCHITECTURE.md, crates/aura-transport/ARCHITECTURE.md | verification/quint/transport.qnt |
| InvariantSequenceMonotonic | crates/aura-transport/ARCHITECTURE.md | verification/quint/transport.qnt |
| InvariantReceiptValidityWindow | crates/aura-rendezvous/ARCHITECTURE.md | verification/quint/epochs.qnt |
| InvariantCrossEpochReplayPrevention | crates/aura-rendezvous/ARCHITECTURE.md | verification/quint/epochs.qnt |
| InvariantNonceUnique | crates/aura-journal/ARCHITECTURE.md | verification/quint/journal/core.qnt |
| InvariantVectorClockConsistent | crates/aura-anti-entropy/ARCHITECTURE.md | verification/quint/journal/anti_entropy.qnt |
| InvariantUniqueCommitPerInstance | crates/aura-consensus/ARCHITECTURE.md | verification/quint/consensus/core.qnt, verification/lean/Aura/Proofs/Consensus/Agreement.lean |
| InvariantCommitRequiresThreshold | crates/aura-consensus/ARCHITECTURE.md | verification/quint/consensus/core.qnt, verification/lean/Aura/Proofs/Consensus/Validity.lean |
| InvariantEquivocatorsExcluded | crates/aura-consensus/ARCHITECTURE.md | verification/quint/consensus/core.qnt, verification/lean/Aura/Proofs/Consensus/Adversary.lean |
| InvariantHonestMajorityCanCommit | crates/aura-consensus/ARCHITECTURE.md | verification/quint/consensus/adversary.qnt, verification/lean/Aura/Proofs/Consensus/Adversary.lean |
| InvariantCompromisedNoncesExcluded | crates/aura-consensus/ARCHITECTURE.md | verification/quint/consensus/adversary.qnt |

Use just check-invariants to validate system invariants across the workspace.