AI Operations & Orchestration Ecosystem
The hub-and-spoke multi-agent system I built and run my own work on — RAG, typed agents, and hook-enforced governance
Working across multiple personal projects with AI agents kept surfacing the same anti-patterns: context was lost between sessions, knowledge that lived in one project couldn't be reused in another without manual transfer, and ad-hoc decisions skipped the steps that prevent expensive mistakes (verify before editing, run tests before claiming done, audit a change for side effects). Each project was an island, and the coordination overhead of running multi-agent work across them was eating real time.
Tracked recurring pain across sessions: context loss, duplicate work, unverified decisions, RAG-vs-direct-read inefficiency. Researched hub-and-spoke patterns and current multi-agent orchestration literature. Mapped which protocols belonged at the hub (cross-project knowledge, decision frameworks, agent role definitions, tools) versus which belonged at each spoke (project-specific code, sessions, intelligence). The core problem turned out to be knowledge persistence and protocol enforcement, not raw agent capability.
Built a hub-and-spoke architecture: a central knowledge hub stores patterns, tools, ADRs, and typed-role definitions; spoke projects sync from the hub and contribute back. The production RAG v2 pipeline combines hybrid retrieval (vector + BM25 keyword fusion), cross-encoder reranking (BAAI/bge-reranker-base), semantic code chunking via tree-sitter, and configurable index categories per spoke, with FastAPI endpoints for cross-project search. A hook-enforcement layer catches skipped startup protocols, missing RAG-first lookups, and absent session documentation by checking file-system-verifiable markers rather than issuing advisory reminders. Native agent teams provide 7 typed roles (orchestrator, implementer, qa-agent, security-agent, researcher, reviewer, data-engineer), each scoped to the work it does. An OODA decision framework wraps every significant change so agents gather context before acting.
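The fusion step in the hybrid retrieval described above could look roughly like the sketch below. The write-up does not say which fusion method the pipeline uses, so this assumes reciprocal rank fusion (RRF) as a stand-in. The function name, the constant `k=60`, and the document IDs are all illustrative, not taken from the actual system.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default (assumption).
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical hits for a query containing an exact task ID.
# The vector search misses "task-142"; the BM25 keyword search finds it.
vector_hits = ["adr-007", "notes-12", "readme"]
bm25_hits = ["task-142", "adr-007", "config-yaml"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

A document that appears in both lists ("adr-007" here) rises to the top, while the exact-match hit that only BM25 found still lands near the top of the fused list. A cross-encoder reranker would then reorder the head of this list.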
- Hub-and-spoke over monolith — each project stays autonomous while sharing knowledge through a central sync mechanism; no spoke becomes a critical path for another
- Hybrid retrieval (vector + BM25) over pure vector — vector search missed exact matches like task IDs, port numbers, and config keys; BM25 fusion catches them
- Cross-encoder reranking on top of fusion — reorders combined results for semantic quality with configurable top-N and similarity cutoff guardrails
- RAG-first policy enforced by hook over advisory reminder — agents that try to read large files without a prior RAG query get blocked at the tool layer; cut token usage on lookup work by an order of magnitude
- Typed agent roles with scoped permissions over a single general-purpose agent — separation of concerns at the agent level mirrors the team structure I'd build in any organization
- OODA framework over ad-hoc decisions — mandatory Observe-Orient-Decide-Act loop keeps agents from acting before they understand the system they're touching
- Production RAG v2 pipeline with hybrid retrieval, cross-encoder reranking, and semantic code chunking, in daily use
- Cross-project search enables querying any spoke's indexes from any other spoke
- Hook-enforcement system ensures protocol compliance with file-system-verifiable markers
- RAG-first lookups cut token usage on lookup work by an order of magnitude compared to reading full files
- 7 typed agent roles via native agent teams: orchestrator, implementer, qa-agent, security-agent, researcher, reviewer, data-engineer
- OODA decision framework ensures all significant changes follow the Observe-Orient-Decide-Act loop
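The RAG-first gate and its file-system-verifiable markers can be illustrated with a minimal sketch. Everything named here is hypothetical: the marker file name, the size threshold, and the two helper functions are assumptions for illustration, and a real hook would be wired into whatever tool layer the agents run on.

```python
import tempfile
from pathlib import Path

LARGE_FILE_BYTES = 20_000         # hypothetical threshold for a "large" file
MARKER_NAME = "rag_query.marker"  # hypothetical marker written after a RAG query


def record_rag_query(session_dir: Path, query: str) -> None:
    """Write a marker file proving that a RAG query ran in this session."""
    session_dir.mkdir(parents=True, exist_ok=True)
    (session_dir / MARKER_NAME).write_text(query, encoding="utf-8")


def may_read(target: Path, session_dir: Path) -> bool:
    """Allow small reads always; allow large reads only if the marker exists."""
    if target.stat().st_size < LARGE_FILE_BYTES:
        return True
    return (session_dir / MARKER_NAME).is_file()


# Demonstration with throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    session = root / "session"
    small = root / "small.txt"
    small.write_text("ok", encoding="utf-8")
    big = root / "big.log"
    big.write_text("x" * (LARGE_FILE_BYTES + 1), encoding="utf-8")

    small_allowed = may_read(small, session)   # small file: no marker needed
    blocked_before = not may_read(big, session)  # large file, no RAG query yet
    record_rag_query(session, "where is the port configured?")
    allowed_after = may_read(big, session)     # marker now exists on disk
```

Because the check depends only on a file that either exists or does not, compliance can be verified after the fact by inspecting the session directory, which is what separates this kind of gate from an advisory reminder.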