Supervisor Agent Architecture
The Supervisor is the single entry point for all natural language messages. It routes, plans, coordinates, and synthesizes multi-agent workflows.
Request flow:
User Message ("Plan my week") → AgentRouter (backend entry point) → Supervisor v3 (orchestrator) → Fast Mode (~500ms, conversational) or Full Mode (multi-agent orchestration)
Fast Mode Latency: ~500ms
Parallel Classifiers: 3
Evaluation Dimensions: 5
Confidence Threshold: 0.8
💡 LLM-First, No Heuristics
The Supervisor uses LLM-based semantic analysis instead of keyword matching. This avoids brittle rules that break on edge cases, such as "hi how are you doing" being misrouted to task-agent because "doing" contains "do".
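To make the failure mode concrete, here is a minimal sketch of the kind of keyword rule the Supervisor avoids. The function `naiveKeywordRoute` and the keyword list are hypothetical and exist only for illustration; they are not part of the Supervisor.

```typescript
// Hypothetical keyword router, shown only to illustrate the failure mode.
const ACTION_KEYWORDS = ["do", "create", "schedule"];

function naiveKeywordRoute(message: string): "task-agent" | "general-agent" {
  const text = message.toLowerCase();
  // Substring matching: "doing" contains "do", so a greeting looks like an action.
  const looksLikeAction = ACTION_KEYWORDS.some((kw) => text.includes(kw));
  return looksLikeAction ? "task-agent" : "general-agent";
}

console.log(naiveKeywordRoute("hi how are you doing")); // task-agent (misrouted)
console.log(naiveKeywordRoute("hi, thanks"));           // general-agent
```

Tightening the rule (word boundaries, stop lists) fixes this one case but tends to introduce others, which is the motivation for semantic classification.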
🗂
What the Supervisor Does
End-to-end request lifecycle
🎯 Routes Requests
Fast mode vs full orchestration
🧠 Classifies Intent
Task, tracking, routine detection
🎭 Coordinates Agents
Parallel and sequential workflows
✅ Evaluates Plans
Quality gates before execution
💬 Synthesizes Responses
Combines multi-agent outputs
📊 Manages Context
Persona, memory, conversation
Fast Mode vs Full Mode
The FastModeDecisionService uses an LLM to semantically classify messages and route them to the appropriate execution path.
Fast Mode Conversational
~500-800ms
Direct to general-agent
  • Greetings ("hi", "thanks")
  • Chitchat and small talk
  • Simple questions
  • Follow-up clarifications
  • Skip planning phase
  • Skip evaluation
  • Skip multi-agent coordination
Full Mode Orchestration
~2-10s
Multi-agent workflows
  • Action requests ("create", "schedule")
  • Multi-step tasks
  • Cross-domain queries
  • Complex analysis
  • Intent classification
  • Plan evaluation
  • Response synthesis
LLM Decision: FastModeDecisionService.decide(message)
Analyzes: message type, tone, action verbs, complexity signals
  • mode = 'fast' → Fast Path: direct dispatch to general-agent (no planning, no evaluation, immediate response)
  • mode = 'full' → Full Path: run parallel intent classifiers (Task, Tracking, Routine), then orchestrate if needed
interface FastModeDecision {
  mode: 'fast' | 'full';
  confidence: number;           // 0.0 - 1.0
  reasoning: string;            // Brief explanation
  signals: {
    messageType: 'greeting' | 'chitchat' | 'query' | 'action' | 'multi_step';
    tone: 'casual' | 'formal' | 'urgent';
    hasActionVerbs: boolean;
  };
}
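A decision of this shape could be consumed roughly as follows. This is a sketch under assumptions: `choosePath` and `RoutableDecision` are hypothetical names, and the rule that a low-confidence 'fast' decision falls back to the full path (reusing the 0.8 threshold) is an assumption, not something the source states.

```typescript
// Subset of FastModeDecision needed for routing (hypothetical helper type).
type RoutableDecision = { mode: "fast" | "full"; confidence: number };

// Assumption: a low-confidence 'fast' decision falls back to the full path.
const FAST_MODE_MIN_CONFIDENCE = 0.8;

function choosePath(decision: RoutableDecision): "fast" | "full" {
  if (decision.mode === "fast" && decision.confidence >= FAST_MODE_MIN_CONFIDENCE) {
    return "fast"; // direct dispatch to general-agent
  }
  return "full"; // run the intent classifiers, then orchestrate if needed
}

console.log(choosePath({ mode: "fast", confidence: 0.95 })); // fast
console.log(choosePath({ mode: "fast", confidence: 0.55 })); // full
console.log(choosePath({ mode: "full", confidence: 0.9 }));  // full
```

Falling back to the slower path when unsure trades latency for safety: a misclassified action request would otherwise skip planning and evaluation.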
Intent Classification
When full mode is triggered, three LLM-based classifiers run in parallel (~600ms) to detect high-confidence intents for fast-track handling.
📋
TaskIntentClassifier
Detects decomposable goals and simple tasks
≥ 0.8 confidence
"Learn Spanish" → decompose_task
"Add milk to list" → simple_task
💰
TrackingIntentClassifier
Detects expense, investment, guest tracking
≥ 0.8 confidence
"Track my Airbnb expenses" → tracking_create
"I spent $150 on supplies" → tracking_entry
🔄
RoutineIntentClassifier
Detects routine completion, skip, query
≥ 0.7 confidence
"Got my hair cut today, $32" → routine_completion
Auto-extracts: cost=32
Parallel Execution
All three classifiers run concurrently using Promise.allSettled(), so total latency is set by the slowest classifier (~600ms) rather than the sum of all three, and a failure in one classifier does not discard the results of the others. High-confidence matches bypass the full orchestration pipeline.
🎯
Confidence Thresholds
When to fast-track vs confirm vs fallback
Task Decomposition: ≥ 0.8
Simple Task: ≥ 0.8
Tracking Intent: ≥ 0.8
Routine (auto-complete): ≥ 0.85
Routine (confirm): 0.7-0.84
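The routine thresholds above can be read as a three-way decision. A minimal sketch, assuming the bands are applied exactly as listed; the function name `routineActionFor` and the action labels are hypothetical:

```typescript
type RoutineAction = "auto_complete" | "confirm" | "fallback";

// Maps a RoutineIntentClassifier confidence to an action using the listed bands.
function routineActionFor(confidence: number): RoutineAction {
  if (confidence >= 0.85) return "auto_complete"; // high confidence: complete silently
  if (confidence >= 0.7) return "confirm";        // 0.7-0.84: ask the user first
  return "fallback";                              // below 0.7: no fast-track
}

console.log(routineActionFor(0.92)); // auto_complete
console.log(routineActionFor(0.75)); // confirm
console.log(routineActionFor(0.4));  // fallback
```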
// Parallel execution - all classifiers run concurrently
const [taskResult, trackingResult, routineResult] = await Promise.allSettled([
  this.taskIntentClassifierService.classify(message, context),
  this.trackingIntentClassifierService.classify(message, context),
  this.routineIntentClassifierService.classify(message, context),
]);

// Promise.allSettled() wraps each outcome; a rejected classifier has no `value`
const task = taskResult.status === 'fulfilled' ? taskResult.value : undefined;

// Check for high-confidence matches
if (task?.intent === 'decompose_task' && task.confidence >= 0.8) {
  return showTaskConfirmationCard(task.echo);
}
// trackingResult and routineResult are unwrapped and checked the same way
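The snippet above depends on services that are not shown. The following self-contained sketch uses stub classifiers to show how `Promise.allSettled()` isolates a failing classifier from the others. The names `IntentResult`, `unwrap`, and `classifyAll` are hypothetical, and the stubs stand in for the real LLM-backed services.

```typescript
interface IntentResult {
  intent: string;
  confidence: number;
}

// Returns the fulfilled value, or null when the classifier rejected.
function unwrap<T>(settled: PromiseSettledResult<T>): T | null {
  return settled.status === "fulfilled" ? settled.value : null;
}

async function classifyAll(
  classifiers: Array<() => Promise<IntentResult>>,
): Promise<Array<IntentResult | null>> {
  // All classifiers start immediately; allSettled never rejects as a whole.
  const settled = await Promise.allSettled(classifiers.map((run) => run()));
  return settled.map((s) => unwrap(s));
}

// Stub classifiers (hypothetical): one succeeds, one fails.
classifyAll([
  async () => ({ intent: "decompose_task", confidence: 0.91 }),
  async () => {
    throw new Error("classifier unavailable");
  },
]).then((results) => console.log(results));
```

The failed classifier simply yields `null`, so the remaining results can still be checked against their thresholds.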
Multi-Agent Orchestration
For complex requests, the Supervisor coordinates multiple agents through sequential and parallel workflow phases.
👥
Agent Pool
55+ specialists available for orchestration
🎯 Core & Orchestration
supervisor general-agent planner planning-agent reviewer goal-contract self-awareness
✅ Task & Project
task epic-agent task-ordering project-splitter decomposition-evaluator complexity-classifier smartlist
🌅 Briefs & Rituals
daily-brief morning-brief nightly-brief weekly-brief signature studio-today-hero
🧠 Memory & Context
memory memory-opportunity journal-synthesizer
📅 Calendar & Scheduling
calendar scheduling-agent
📧 Email
email email-extraction email-triage
✈ Travel & Mobility
travel-integrator mobility-agent disney-planner weather-agent
💚 Health & Wellness
health healthcare-agent wellness balance food-nutrition
🏠 Lifestyle & Services
entertainment-agent food-delivery-agent home-services-agent shopping-specialist finance-manager
💡 Productivity & Creative
productivity brainstorm-agent advisor ambient-designer
📚 Learning
language-learning spanish-beginner
💬 UI & Communication
hey-umber-planner card-ui table-card-titler
🗄 Data & Integration
analytics data-validator integration external-api oauth tool-platform
Workflow Phases
Example: "Plan my week with tasks and meetings"
  1. Context Gathering (Sequential)
     Fetch relevant memories and user preferences
     Agents: memory-agent
  2. Domain Analysis (Parallel)
     Query calendar and task systems simultaneously
     Agents: calendar-agent, task-agent
  3. Integration (Sequential)
     Synthesize results into cohesive response
     Agents: general-agent
🔗 Collaboration Criteria
Collaborate when: Request spans multiple domains requiring shared context, results from one agent depend on another, user wants synthesized view across sources, or coordinated actions must happen together.

Don't collaborate when: Single domain request, independent lookups, or simple Q&A exchanges.
// Workflow phase execution
for (const phase of phases) {
  const tasks = await this.buildPhaseTasks(request, phase, analysis);

  // Emit Canvas deltas for real-time visualization (fire-and-forget, one per task)
  for (const task of tasks) {
    void this.emitCanvasDelta(correlationId, 'HANDOFF',
      `Bringing in ${task.agentType}...`);
  }

  // Execute parallel or sequential based on phase.type
  const results = await this.executionCoordinator.executeTasks(
    tasks,
    phase.type === 'parallel'  // Run in parallel?
  );
}
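The coordinator's internals are not shown in the source. As a rough illustration of what a parallel flag could mean, here is a minimal sketch; this `executeTasks` is a hypothetical stand-in, not the actual `executionCoordinator` implementation.

```typescript
type AgentTask<T> = () => Promise<T>;

// Hypothetical stand-in for executionCoordinator.executeTasks.
async function executeTasks<T>(tasks: AgentTask<T>[], parallel: boolean): Promise<T[]> {
  if (parallel) {
    // Start every task at once and wait for all of them.
    return Promise.all(tasks.map((task) => task()));
  }
  // Run one task at a time, preserving order.
  const results: T[] = [];
  for (const task of tasks) {
    results.push(await task());
  }
  return results;
}

executeTasks(
  [async () => "calendar: 3 meetings", async () => "tasks: 5 due"],
  true,
).then((results) => console.log(results));
```

In both modes the results come back in task order, so later phases can rely on positions regardless of how the phase was executed.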
Plan Evaluation
Before executing collaborative workflows, plans are scored on 5 dimensions. Low scores trigger replanning with reflexion-style feedback.
Example scorecard:
Overall Score: 0.87
Feasibility: 0.92 (25% weight)
Completeness: 0.88 (25% weight)
Safety: 0.95 (25% weight)
Efficiency: 0.78 (15% weight)
Clarity: 0.85 (10% weight)
Decision: APPROVE (proceed with execution)
🚦
Evaluation Thresholds
Decision outcomes based on score
≥ 0.8 APPROVE — Execute plan immediately
0.7-0.79 APPROVE_WITH_WARNINGS — Log warnings, proceed
< 0.7 REJECT — Trigger replanning (max 3 attempts)
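The weights and thresholds can be combined in a few lines. This sketch assumes the overall score is a plain weighted average of the five dimensions, which the source does not confirm. With the example scores it yields about 0.89 rather than the 0.87 shown, so the real aggregation may apply additional adjustments. The names `overallScore` and `decide` are hypothetical.

```typescript
type Dimension = "feasibility" | "completeness" | "safety" | "efficiency" | "clarity";
type Decision = "APPROVE" | "APPROVE_WITH_WARNINGS" | "REJECT";

// Weights as listed in the example scorecard; they sum to 1.0.
const WEIGHTS: Record<Dimension, number> = {
  feasibility: 0.25,
  completeness: 0.25,
  safety: 0.25,
  efficiency: 0.15,
  clarity: 0.1,
};

// Assumption: the overall score is a plain weighted average.
function overallScore(scores: Record<Dimension, number>): number {
  return (Object.keys(WEIGHTS) as Dimension[]).reduce(
    (sum, dim) => sum + WEIGHTS[dim] * scores[dim],
    0,
  );
}

function decide(score: number): Decision {
  if (score >= 0.8) return "APPROVE";
  if (score >= 0.7) return "APPROVE_WITH_WARNINGS";
  return "REJECT"; // triggers replanning
}

const example = { feasibility: 0.92, completeness: 0.88, safety: 0.95, efficiency: 0.78, clarity: 0.85 };
const score = overallScore(example);
console.log(score.toFixed(2), decide(score)); // 0.89 APPROVE
```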
🔄 Replanning with Reflexion
If a plan is rejected, the ReplanningService uses reflexion-style feedback:
  1. Pass evaluation feedback to the planner
  2. Generate revised plan addressing issues
  3. Re-evaluate (max 3 attempts, 30s timeout)
Historical insights from past evaluations inform future planning — low-performing agents are filtered out.
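The replanning steps above can be sketched as a bounded loop. This is an illustration under assumptions, not the ReplanningService itself: `planWithReflexion`, `Planner`, and `Evaluator` are hypothetical names, "max 3 attempts" is read as three planning attempts in total, and the 30s timeout and historical agent filtering are omitted for brevity.

```typescript
interface Evaluation {
  decision: "APPROVE" | "APPROVE_WITH_WARNINGS" | "REJECT";
  suggestions: string[];
}
type Planner = (request: string, feedback: string[]) => Promise<string>;
type Evaluator = (plan: string) => Promise<Evaluation>;

// Assumption: three planning attempts in total.
const MAX_ATTEMPTS = 3;

async function planWithReflexion(
  request: string,
  planner: Planner,
  evaluator: Evaluator,
): Promise<{ plan: string; attempts: number } | null> {
  let feedback: string[] = [];
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    const plan = await planner(request, feedback);   // steps 1-2: plan using prior feedback
    const evaluation = await evaluator(plan);        // step 3: re-evaluate
    if (evaluation.decision !== "REJECT") {
      return { plan, attempts: attempt };
    }
    feedback = evaluation.suggestions;               // carry the critique into the next attempt
  }
  return null; // every attempt was rejected; the caller decides how to fall back
}
```

Returning `null` instead of throwing leaves the fallback policy (for example, a simpler single-agent response) to the caller.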
interface PlanEvaluationResult {
  decision: 'APPROVE' | 'APPROVE_WITH_WARNINGS' | 'REJECT';
  overallScore: number;  // 0-1
  dimensions: {
    feasibility: number;   // Can it succeed?
    completeness: number;  // All steps present?
    safety: number;        // Any risky actions?
    efficiency: number;    // Appropriate complexity?
    clarity: number;       // Unambiguous steps?
  };
  warnings: string[];
  blockers: string[];
  suggestions: string[];
}
Response Synthesis
After agents complete their tasks, the Supervisor synthesizes outputs into a coherent, user-facing response.
📅 calendar-agent: "3 meetings tomorrow"
task-agent: "5 tasks due this week"
🧠 memory-agent: "Prefers morning focus time"
Synthesized Response
Cohesive, personalized answer
📦
Synthesis Strategies
Different approaches for different scenarios
Sequential
Combine results in execution order
Parallel Synthesis
LLM merges concurrent results
Conversation Continuation
Context-aware follow-up
Hierarchical
Parent-child agent results
🏆
Goal Contract Assessment
Did we achieve what the user wanted?
Before execution, the Supervisor captures a "goal contract" with the user's intent. After synthesis, the ReviewerAgent assesses whether the goal was achieved.
Goal Statement
"Plan my week with tasks and meetings"
Assessment
Goal achieved: calendar + tasks integrated
interface SynthesizedResponse {
  answer: string;              // Main response text
  followUps?: string[];        // Suggested next actions
  artifacts?: Record<string, unknown>;  // Canvas, tables, etc.
  metadata?: {
    agentsInvolved: string[];
    executionTimeMs: number;
    phases: PhaseExecutionSummary[];
  };
}
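As an illustration of the Sequential strategy ("combine results in execution order"), here is a minimal sketch that fills a reduced version of the response shape. The types `AgentOutput` and `SynthesisSketch` and the function `synthesizeSequential` are hypothetical; plain string joining stands in for whatever merging the Supervisor actually performs.

```typescript
interface AgentOutput {
  agent: string;
  text: string;
  elapsedMs: number;
}

// Reduced, hypothetical version of SynthesizedResponse for this sketch.
interface SynthesisSketch {
  answer: string;
  metadata: { agentsInvolved: string[]; executionTimeMs: number };
}

function synthesizeSequential(outputs: AgentOutput[]): SynthesisSketch {
  return {
    // Concatenate agent outputs in the order they were executed.
    answer: outputs.map((o) => o.text).join(" "),
    metadata: {
      agentsInvolved: outputs.map((o) => o.agent),
      // Sequential phases run back to back, so elapsed times add up.
      executionTimeMs: outputs.reduce((total, o) => total + o.elapsedMs, 0),
    },
  };
}

const sketch = synthesizeSequential([
  { agent: "memory-agent", text: "Prefers morning focus time.", elapsedMs: 120 },
  { agent: "calendar-agent", text: "3 meetings tomorrow.", elapsedMs: 300 },
]);
console.log(sketch.answer); // Prefers morning focus time. 3 meetings tomorrow.
```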
🛡 Quality Controller
Before returning responses, the Supervisor validates quality. If validation fails, it triggers retry or fallback chains. Circuit breakers protect against cascading failures when agents are overloaded (10 failures to open, 2 successes to close).