AI Operations & Orchestration Ecosystem

The hub-and-spoke multi-agent system I built and run my own work on — RAG, typed agents, and hook-enforced governance

7 agents: typed roles coordinating across 5 project spokes, in daily use

Architecture

Problem

Working across multiple personal projects with AI agents kept hitting the same anti-patterns: context was lost between sessions, knowledge that lived in one project couldn't be reused in another without manual transfer, and ad-hoc decisions skipped the steps that prevent expensive mistakes (verify before edit, run tests before claiming done, audit a change for side effects). Each project was an island, and the coordination overhead of running multi-agent work across them was eating real time.

Discovery

Tracked recurring pain across sessions: context loss, duplicate work, unverified decisions, RAG-vs-direct-read inefficiency. Researched hub-and-spoke patterns and current multi-agent orchestration literature. Mapped which protocols belonged at the hub (cross-project knowledge, decision frameworks, agent role definitions, tools) versus which belonged at each spoke (project-specific code, sessions, intelligence). The core problem turned out to be knowledge persistence and protocol enforcement, not raw agent capability.

Solution

Built a hub-and-spoke architecture: a central knowledge hub stores patterns, tools, ADRs, and typed-role definitions; spoke projects sync from the hub and contribute back. Production RAG v2 pipeline with hybrid retrieval (vector + BM25 keyword fusion), cross-encoder reranking (BAAI/bge-reranker-base), semantic code chunking via tree-sitter, and configurable index categories per spoke. FastAPI endpoints for cross-project search. Hook-enforcement layer uses file-system-verifiable markers (not advisory reminders) to catch missed startup protocols, missing RAG-first lookups, and absent session documentation. Native agent teams with 7 typed roles (orchestrator, implementer, qa-agent, security-agent, researcher, reviewer, data-engineer), each scoped to the work it does. OODA decision framework wraps every significant change so agents gather context before acting.
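
To make the "vector + BM25 keyword fusion" step concrete, here is a minimal sketch of one common way to merge two ranked result lists: reciprocal rank fusion (RRF). The page does not say which fusion method the pipeline uses, so RRF, the function name, the constant k=60, and the sample document IDs are all assumptions for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores 1 / (k + rank) per list it appears in, so items
    ranked well by either retriever rise to the top of the fused list.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; doc_id breaks ties deterministically.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results: the vector retriever misses the exact task ID,
# while the keyword (BM25) retriever ranks it first.
vector_hits = ["adr-012", "notes-auth", "config-ports"]
bm25_hits = ["task-4821", "config-ports", "adr-012"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

Documents found by both retrievers rank highest, and the exact-match hit that only the keyword side found still survives into the fused list, where a reranker could then reorder it.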

Decisions & Tradeoffs

Hub-and-spoke over monolith — each project stays autonomous while sharing knowledge through a central sync mechanism; no spoke becomes a critical path for another
Hybrid retrieval (vector + BM25) over pure vector — vector search missed exact matches like task IDs, port numbers, and config keys; BM25 fusion catches them
Cross-encoder reranking on top of fusion — reorders combined results for semantic quality with configurable top-N and similarity cutoff guardrails
RAG-first policy enforced by hook over advisory reminder — agents that try to read large files without a prior RAG query get blocked at the tool layer; cut token usage on lookup work by an order of magnitude
Typed agent roles with scoped permissions over a single general-purpose agent — separation of concerns at the agent level mirrors the team structure I'd build in any organization
OODA framework over ad-hoc decisions — mandatory Observe-Orient-Decide-Act loop keeps agents from acting before they understand the system they're touching
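
The RAG-first decision above can be sketched as a small gate that a tool-layer hook might call before allowing a file read. This is an illustrative sketch, not the actual hook: the marker filename, the size threshold, and both function names are hypothetical.

```python
import tempfile
from pathlib import Path

LARGE_FILE_BYTES = 50_000  # hypothetical threshold for a "large" file
MARKER_NAME = "rag_query.marker"  # hypothetical marker filename


def record_rag_query(session_dir: Path) -> None:
    """Drop a marker file showing that a RAG query ran in this session."""
    session_dir.mkdir(parents=True, exist_ok=True)
    (session_dir / MARKER_NAME).touch()


def allow_read(target: Path, session_dir: Path) -> bool:
    """Return True if the read may proceed, False if the hook should block it."""
    if target.stat().st_size < LARGE_FILE_BYTES:
        return True  # small files are exempt from the RAG-first rule
    # Large files require evidence on disk that a RAG query came first.
    return (session_dir / MARKER_NAME).exists()


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    big_file = root / "big.log"
    big_file.write_bytes(b"x" * (LARGE_FILE_BYTES + 1))
    session = root / "session"

    blocked_before = not allow_read(big_file, session)  # no marker yet
    record_rag_query(session)
    allowed_after = allow_read(big_file, session)  # marker now exists

print(blocked_before, allowed_after)
```

Because the check depends on a file that either exists or does not, compliance can be verified after the fact, which is the difference between an enforced gate and an advisory reminder.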

Outcomes

Production RAG v2 pipeline with hybrid retrieval, cross-encoder reranking, and semantic code chunking, in daily use
Cross-project search enables querying any spoke's indexes from any other spoke
Hook-enforcement system ensures protocol compliance with file-system-verifiable markers
RAG-first lookups cut token usage on lookup work by an order of magnitude compared to reading full files
7 typed agent roles via native agent teams: orchestrator, implementer, qa-agent, security-agent, researcher, reviewer, data-engineer
OODA decision framework ensures all significant changes follow the Observe-Orient-Decide-Act loop
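
One way to picture typed roles with scoped permissions is a role table checked before any tool call. The seven role names come from this page; the tool names and the permission sets assigned to each role are hypothetical, chosen only to show the shape of the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    name: str
    allowed_tools: frozenset


# Role names are from the case study; tool sets are illustrative assumptions.
ROLES = {
    role.name: role
    for role in [
        Role("orchestrator", frozenset({"delegate", "read"})),
        Role("implementer", frozenset({"read", "edit", "run_tests"})),
        Role("qa-agent", frozenset({"read", "run_tests"})),
        Role("security-agent", frozenset({"read", "scan"})),
        Role("researcher", frozenset({"read", "search"})),
        Role("reviewer", frozenset({"read", "comment"})),
        Role("data-engineer", frozenset({"read", "edit", "query_data"})),
    ]
}


def can_use(role_name: str, tool: str) -> bool:
    """Return True only if the role exists and the tool is in its scope."""
    role = ROLES.get(role_name)
    return role is not None and tool in role.allowed_tools


print(can_use("implementer", "edit"))  # implementer may edit in this sketch
print(can_use("reviewer", "edit"))     # reviewer is read/comment only here
```

Unknown roles are denied by default, so adding a new agent type requires an explicit entry in the table.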

Built with

Python, TypeScript, LlamaIndex, FastAPI, n8n, Docker