Supervisor System - Umber Architecture Demo

Supervisor Agent Architecture

The Supervisor is the single entry point for all natural language messages. It routes, plans, coordinates, and synthesizes multi-agent workflows.

Request flow: User Message ("Plan my week") → AgentRouter (backend entry point) → Supervisor v3 (orchestrator) → Fast Mode (~500ms conversational) or Full Mode (multi-agent orchestration)

Fast Mode Latency: ~500ms
Parallel Classifiers: 3
Eval Dimensions: 5
Confidence Threshold: 0.8

💡 LLM-First, No Heuristics

The Supervisor uses LLM-based semantic analysis instead of keyword matching. This avoids brittle rules that break on edge cases, such as "hi how are you doing" being misrouted to task-agent because "doing" contains "do".
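To make the failure mode concrete, here is a minimal sketch of the kind of keyword rule the Supervisor avoids. The function `naiveKeywordRoute` and its keyword list are hypothetical, written only for illustration, and are not taken from the Umber codebase.

```typescript
// Hypothetical keyword router, shown only to illustrate the failure mode.
const ACTION_KEYWORDS = ['do', 'create', 'schedule'];

function naiveKeywordRoute(message: string): 'task-agent' | 'general-agent' {
  const text = message.toLowerCase();
  // Substring matching: "doing" contains "do", so a greeting can match.
  const hit = ACTION_KEYWORDS.some((kw) => text.includes(kw));
  return hit ? 'task-agent' : 'general-agent';
}

console.log(naiveKeywordRoute('hi how are you doing')); // "task-agent" (misrouted)
console.log(naiveKeywordRoute('thanks!'));              // "general-agent"
```

Tightening the rule to whole-word matching would fix this one case but not the general problem, which is why the page describes a semantic classifier instead.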

🗂

What the Supervisor Does

End-to-end request lifecycle

🎯 Routes Requests

Fast mode vs full orchestration

🧠 Classifies Intent

Task, tracking, routine detection

🎭 Coordinates Agents

Parallel and sequential workflows

✅ Evaluates Plans

Quality gates before execution

💬 Synthesizes Responses

Combines multi-agent outputs

📊 Manages Context

Persona, memory, conversation

Fast Mode vs Full Mode

The FastModeDecisionService uses an LLM to semantically classify messages and route them to the appropriate execution path.

Fast Mode Conversational

~500-800ms

Direct to general-agent

✓ Greetings ("hi", "thanks")
✓ Chitchat and small talk
✓ Simple questions
✓ Follow-up clarifications
✗ Skip planning phase
✗ Skip evaluation
✗ Skip multi-agent coordination

Full Mode Orchestration

~2-10s

Multi-agent workflows

✓ Action requests ("create", "schedule")
✓ Multi-step tasks
✓ Cross-domain queries
✓ Complex analysis
✓ Intent classification
✓ Plan evaluation
✓ Response synthesis

LLM Decision

FastModeDecisionService.decide(message)

Analyzes: message type, tone, action verbs, complexity signals

mode = 'fast'

Fast Path

Direct dispatch to general-agent

No planning, no evaluation, immediate response

mode = 'full'

Full Path

Run parallel intent classifiers

Task, Tracking, Routine → then orchestrate if needed

interface FastModeDecision {
  mode: 'fast' | 'full';
  confidence: number;           // 0.0 - 1.0
  reasoning: string;            // Brief explanation
  signals: {
    messageType: 'greeting' | 'chitchat' | 'query' | 'action' | 'multi_step';
    tone: 'casual' | 'formal' | 'urgent';
    hasActionVerbs: boolean;
  };
}
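A caller might act on this decision roughly as follows. This is a sketch under assumptions: `selectPath`, the `ExecutionPath` type, and the rule that a low-confidence 'fast' verdict falls back to the full path are all hypothetical, and the 0.8 default simply reuses the confidence threshold quoted elsewhere on this page.

```typescript
interface FastModeDecision {
  mode: 'fast' | 'full';
  confidence: number;
  reasoning: string;
  signals: {
    messageType: 'greeting' | 'chitchat' | 'query' | 'action' | 'multi_step';
    tone: 'casual' | 'formal' | 'urgent';
    hasActionVerbs: boolean;
  };
}

type ExecutionPath = 'general-agent' | 'intent-classifiers';

// Assumption: a 'fast' verdict is trusted only at or above minConfidence;
// anything less certain takes the safer full path.
function selectPath(decision: FastModeDecision, minConfidence = 0.8): ExecutionPath {
  if (decision.mode === 'fast' && decision.confidence >= minConfidence) {
    return 'general-agent';
  }
  return 'intent-classifiers';
}

const hello: FastModeDecision = {
  mode: 'fast',
  confidence: 0.95,
  reasoning: 'Plain greeting, no action requested',
  signals: { messageType: 'greeting', tone: 'casual', hasActionVerbs: false },
};

console.log(selectPath(hello)); // "general-agent"
```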

Intent Classification

When full mode is triggered, three LLM-based classifiers run in parallel (~600ms) to detect high-confidence intents for fast-track handling.

📋

TaskIntentClassifier

Detects decomposable goals and simple tasks

≥ 0.8 confidence

"Learn Spanish" → decompose_task
"Add milk to list" → simple_task

💰

TrackingIntentClassifier

Detects expense, investment, guest tracking

≥ 0.8 confidence

"Track my Airbnb expenses" → tracking_create
"I spent $150 on supplies" → tracking_entry

🔄

RoutineIntentClassifier

Detects routine completion, skip, query

≥ 0.7 confidence

"Got my hair cut today, $32" → routine_completion
Auto-extracts: cost=32

⚡ Parallel Execution

All three classifiers run concurrently using Promise.allSettled(), so total latency is roughly that of the slowest classifier (~600ms) rather than the sum of all three. A classifier that fails does not block the others. High-confidence matches bypass the full orchestration pipeline.

🎯

Confidence Thresholds

When to fast-track vs confirm vs fallback

Task Decomposition: ≥ 0.8
Simple Task: ≥ 0.8
Tracking Intent: ≥ 0.8
Routine (auto-complete): ≥ 0.85
Routine (confirm): 0.7-0.84
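The routine thresholds can be read as a three-way decision. A minimal sketch, assuming that confidence below 0.7 falls through to normal orchestration (the page names a fallback but does not say where it leads); `resolveRoutineAction` and `RoutineAction` are hypothetical names:

```typescript
type RoutineAction = 'auto_complete' | 'confirm' | 'fallback';

function resolveRoutineAction(confidence: number): RoutineAction {
  if (confidence >= 0.85) return 'auto_complete'; // act without asking
  if (confidence >= 0.7) return 'confirm';        // ask the user first
  return 'fallback';                              // assumption: continue to full orchestration
}

console.log(resolveRoutineAction(0.9));  // "auto_complete"
console.log(resolveRoutineAction(0.84)); // "confirm"
console.log(resolveRoutineAction(0.5));  // "fallback"
```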

// Parallel execution - all classifiers run concurrently
const [taskResult, trackingResult, routineResult] = await Promise.allSettled([
  this.taskIntentClassifierService.classify(message, context),
  this.trackingIntentClassifierService.classify(message, context),
  this.routineIntentClassifierService.classify(message, context),
]);

// allSettled yields { status, value | reason } wrappers, so unwrap before reading fields
const task = taskResult.status === 'fulfilled' ? taskResult.value : null;

// Check for high-confidence matches
if (task?.intent === 'decompose_task' && task.confidence >= 0.8) {
  return showTaskConfirmationCard(task.echo);
}
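For experimenting with this pattern outside the Supervisor, here is a self-contained sketch with stub classifiers. The stubs, their delays, and `classifyAll` are hypothetical stand-ins for the LLM-backed services; one stub rejects on purpose to show that a failing classifier is skipped rather than aborting the batch.

```typescript
interface IntentResult {
  intent: string;
  confidence: number;
}

const delay = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Stub classifiers (hypothetical): real ones would call an LLM.
async function classifyTask(_message: string): Promise<IntentResult> {
  await delay(30);
  return { intent: 'decompose_task', confidence: 0.91 };
}

async function classifyTracking(_message: string): Promise<IntentResult> {
  await delay(10);
  throw new Error('classifier unavailable'); // simulate a failing classifier
}

async function classifyRoutine(_message: string): Promise<IntentResult> {
  await delay(20);
  return { intent: 'none', confidence: 0.2 };
}

// Runs all classifiers concurrently and returns only the fulfilled results.
async function classifyAll(message: string): Promise<IntentResult[]> {
  const settled = await Promise.allSettled([
    classifyTask(message),
    classifyTracking(message),
    classifyRoutine(message),
  ]);
  const results: IntentResult[] = [];
  for (const entry of settled) {
    if (entry.status === 'fulfilled') results.push(entry.value);
  }
  return results;
}
```

Because Promise.allSettled never rejects, the caller sees two results here instead of an exception, and the wait is governed by the slowest stub (30ms) rather than the sum of the delays.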

Multi-Agent Orchestration

For complex requests, the Supervisor coordinates multiple agents through sequential and parallel workflow phases.

👥

Agent Pool

55+ specialists available for orchestration

🎯 Core & Orchestration

supervisor general-agent planner planning-agent reviewer goal-contract self-awareness

✅ Task & Project

task epic-agent task-ordering project-splitter decomposition-evaluator complexity-classifier smartlist

🌅 Briefs & Rituals

daily-brief morning-brief nightly-brief weekly-brief signature studio-today-hero

🧠 Memory & Context

memory memory-opportunity journal-synthesizer

📅 Calendar & Scheduling

calendar scheduling-agent

📧 Email

email email-extraction email-triage

✈ Travel & Mobility

travel-integrator mobility-agent disney-planner weather-agent

💚 Health & Wellness

health healthcare-agent wellness balance food-nutrition

🏠 Lifestyle & Services

entertainment-agent food-delivery-agent home-services-agent shopping-specialist finance-manager

💡 Productivity & Creative

productivity brainstorm-agent advisor ambient-designer

📚 Learning

language-learning spanish-beginner

💬 UI & Communication

hey-umber-planner card-ui table-card-titler

🗄 Data & Integration

analytics data-validator integration external-api oauth tool-platform

Workflow Phases

Example: "Plan my week with tasks and meetings"

1 Context Gathering Sequential

Fetch relevant memories and user preferences

memory-agent

2 Domain Analysis Parallel

Query calendar and task systems simultaneously

calendar-agent task-agent

3 Integration Sequential

Synthesize results into cohesive response

general-agent

🔗 Collaboration Criteria

Collaborate when: the request spans multiple domains that need shared context, results from one agent depend on another, the user wants a synthesized view across sources, or coordinated actions must happen together.

Don't collaborate when: the request covers a single domain, involves independent lookups, or is a simple Q&A exchange.

// Workflow phase execution
for (const phase of phases) {
  const tasks = await this.buildPhaseTasks(request, phase, analysis);

  // Execute parallel or sequential based on phase.type
  const results = await this.executionCoordinator.executeTasks(
    tasks,
    phase.type === 'parallel'  // Run in parallel?
  );

  // Emit Canvas deltas for real-time visualization, one per task in the phase
  for (const task of tasks) {
    void this.emitCanvasDelta(correlationId, 'HANDOFF',
      `Bringing in ${task.agentType}...`);
  }
}
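The difference between the two phase types can be sketched in isolation. This is an illustration under assumptions, not the Supervisor's coordinator: `Phase`, `runAgent`, and `executePhases` are hypothetical, and the stub agent simply echoes its name.

```typescript
type PhaseType = 'sequential' | 'parallel';

interface Phase {
  name: string;
  type: PhaseType;
  agents: string[];
}

// Hypothetical stand-in for dispatching work to an agent.
async function runAgent(agent: string): Promise<string> {
  return `${agent}: done`;
}

async function executePhases(phases: Phase[]): Promise<string[]> {
  const outputs: string[] = [];
  for (const phase of phases) {
    if (phase.type === 'parallel') {
      // Start every agent in the phase at once and wait for the whole group.
      const group = await Promise.all(phase.agents.map((agent) => runAgent(agent)));
      outputs.push(...group);
    } else {
      // Run one agent at a time, in the listed order.
      for (const agent of phase.agents) {
        outputs.push(await runAgent(agent));
      }
    }
  }
  return outputs;
}

// The "Plan my week with tasks and meetings" example from this page.
const weekPlan: Phase[] = [
  { name: 'Context Gathering', type: 'sequential', agents: ['memory-agent'] },
  { name: 'Domain Analysis', type: 'parallel', agents: ['calendar-agent', 'task-agent'] },
  { name: 'Integration', type: 'sequential', agents: ['general-agent'] },
];
```

Phases themselves always run in order, so the integration step only starts once both domain agents have finished.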

Plan Evaluation

Before executing collaborative workflows, plans are scored on 5 dimensions. Low scores trigger replanning with reflexion-style feedback.

Overall Score: 0.87

Feasibility: 0.92 (25% weight)
Completeness: 0.88 (25% weight)
Safety: 0.95 (25% weight)
Efficiency: 0.78 (15% weight)
Clarity: 0.85 (10% weight)

Decision: APPROVE (proceed with execution)

🚦

Evaluation Thresholds

Decision outcomes based on score

≥ 0.8 APPROVE — Execute plan immediately

0.7-0.79 APPROVE_WITH_WARNINGS — Log warnings, proceed

< 0.7 REJECT — Trigger replanning (max 3 attempts)

🔄 Replanning with Reflexion

If a plan is rejected, the ReplanningService uses reflexion-style feedback:

1 Pass evaluation feedback to the planner
2 Generate revised plan addressing issues
3 Re-evaluate (max 3 attempts, 30s timeout)

Historical insights from past evaluations inform future planning — low-performing agents are filtered out.
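The replanning loop might look roughly like this. It is a simplified, synchronous sketch: the `Planner` and `Evaluator` signatures, the `replan` function, and the injectable clock are hypothetical, and the real services are presumably asynchronous LLM calls. The 0.7 acceptance bar, 3 attempts, and 30s budget come from this page.

```typescript
interface Evaluation {
  score: number;
  feedback: string[];
}

interface ReplanOutcome {
  plan: string | null; // null when no acceptable plan was produced
  attempts: number;
}

// Hypothetical signatures for the planner and evaluator.
type Planner = (goal: string, feedback: string[]) => string;
type Evaluator = (plan: string) => Evaluation;

function replan(
  goal: string,
  initialFeedback: string[],
  planner: Planner,
  evaluator: Evaluator,
  maxAttempts = 3,
  timeoutMs = 30_000,
  now: () => number = Date.now,
): ReplanOutcome {
  const deadline = now() + timeoutMs;
  let feedback = initialFeedback;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // Stop early once the time budget is used up.
    if (now() > deadline) return { plan: null, attempts: attempt - 1 };
    const plan = planner(goal, feedback); // revise using the previous critique
    const evaluation = evaluator(plan);   // re-score the revised plan
    if (evaluation.score >= 0.7) return { plan, attempts: attempt };
    feedback = evaluation.feedback;       // carry the critique into the next round
  }
  return { plan: null, attempts: maxAttempts };
}
```

Each round feeds the evaluator's critique back into the planner, which is the reflexion-style part; the attempt cap and deadline keep a stubborn request from looping indefinitely.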

interface PlanEvaluationResult {
  decision: 'APPROVE' | 'APPROVE_WITH_WARNINGS' | 'REJECT';
  overallScore: number;  // 0-1
  dimensions: {
    feasibility: number;   // Can it succeed?
    completeness: number;  // All steps present?
    safety: number;        // Any risky actions?
    efficiency: number;    // Appropriate complexity?
    clarity: number;       // Unambiguous steps?
  };
  warnings: string[];
  blockers: string[];
  suggestions: string[];
}
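The weights and thresholds listed on this page can be combined in a few lines. A minimal sketch, assuming the overall score is a plain weighted sum of the five dimensions, which the page does not confirm; `overallScore` and `decide` are hypothetical names.

```typescript
type Dimension = 'feasibility' | 'completeness' | 'safety' | 'efficiency' | 'clarity';
type Decision = 'APPROVE' | 'APPROVE_WITH_WARNINGS' | 'REJECT';

// Weights as listed on this page.
const WEIGHTS: Record<Dimension, number> = {
  feasibility: 0.25,
  completeness: 0.25,
  safety: 0.25,
  efficiency: 0.15,
  clarity: 0.10,
};

// Assumption: overall score = sum of weight * dimension score.
function overallScore(scores: Record<Dimension, number>): number {
  let total = 0;
  for (const dim of Object.keys(WEIGHTS) as Dimension[]) {
    total += WEIGHTS[dim] * scores[dim];
  }
  return total;
}

function decide(score: number): Decision {
  if (score >= 0.8) return 'APPROVE';
  if (score >= 0.7) return 'APPROVE_WITH_WARNINGS';
  return 'REJECT';
}

// Dimension scores from the example evaluation on this page.
const demoScores: Record<Dimension, number> = {
  feasibility: 0.92,
  completeness: 0.88,
  safety: 0.95,
  efficiency: 0.78,
  clarity: 0.85,
};

console.log(overallScore(demoScores).toFixed(2)); // "0.89"
console.log(decide(overallScore(demoScores)));    // "APPROVE"
```

Under this assumption the example's dimension scores give about 0.89, slightly above the 0.87 shown as the overall score, so the real aggregation may differ (or the displayed figures may be illustrative). The decision is APPROVE either way.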

Response Synthesis

After agents complete their tasks, the Supervisor synthesizes outputs into a coherent, user-facing response.

📅 calendar-agent: "3 meetings tomorrow"

✅ task-agent: "5 tasks due this week"

🧠 memory-agent: "Prefers morning focus time"

→

✨

Synthesized Response

Cohesive, personalized answer

📦

Synthesis Strategies

Different approaches for different scenarios

Sequential

Combine results in execution order

Parallel Synthesis

LLM merges concurrent results

Conversation Continuation

Context-aware follow-up

Hierarchical

Parent-child agent results

🏆

Goal Contract Assessment

Did we achieve what the user wanted?

Before execution, the Supervisor captures a "goal contract" with the user's intent. After synthesis, the ReviewerAgent assesses whether the goal was achieved.

Goal Statement

"Plan my week with tasks and meetings"

Assessment

Goal achieved: calendar + tasks integrated

interface SynthesizedResponse {
  answer: string;              // Main response text
  followUps?: string[];        // Suggested next actions
  artifacts?: Record<string, unknown>;  // Canvas, tables, etc.
  metadata?: {
    agentsInvolved: string[];
    executionTimeMs: number;
    phases: PhaseExecutionSummary[];
  };
}

🛡 Quality Controller

Before returning responses, the Supervisor validates quality. If validation fails, it triggers retry or fallback chains. Circuit breakers protect against cascading failures when agents are overloaded (10 failures to open, 2 successes to close).
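The breaker thresholds quoted here (10 failures to open, 2 successes to close) fit a standard closed / open / half-open state machine. The sketch below is a generic version under assumptions, not Umber's implementation: the class, its method names, counting failures consecutively, and the explicit `allowProbe` step (standing in for a cooldown timer) are all hypothetical.

```typescript
type BreakerState = 'closed' | 'open' | 'half_open';

class CircuitBreaker {
  private state: BreakerState = 'closed';
  private failures = 0;
  private successes = 0;
  private readonly failureThreshold: number;
  private readonly successThreshold: number;

  constructor(failureThreshold = 10, successThreshold = 2) {
    this.failureThreshold = failureThreshold;
    this.successThreshold = successThreshold;
  }

  getState(): BreakerState {
    return this.state;
  }

  // Whether a call to the agent may be attempted right now.
  canRequest(): boolean {
    return this.state !== 'open';
  }

  // Assumption: something external (e.g. a cooldown) decides when to probe again.
  allowProbe(): void {
    if (this.state === 'open') {
      this.state = 'half_open';
      this.successes = 0;
    }
  }

  recordSuccess(): void {
    if (this.state === 'half_open') {
      this.successes += 1;
      if (this.successes >= this.successThreshold) {
        this.state = 'closed';
        this.failures = 0;
      }
    } else if (this.state === 'closed') {
      this.failures = 0; // assumption: only consecutive failures count
    }
  }

  recordFailure(): void {
    if (this.state === 'open') return;
    if (this.state === 'half_open') {
      this.state = 'open'; // a failed probe reopens the breaker
      return;
    }
    this.failures += 1;
    if (this.failures >= this.failureThreshold) this.state = 'open';
  }
}
```

While the breaker is open, `canRequest()` returns false, which is where a fallback chain would take over instead of sending more load to an overloaded agent.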