The Observe section is the operational heart of Burgundy. It combines real-time event monitoring with an operations center for triage, execution traces for debugging, AI-powered analysis for diagnostics, and operator notes for institutional context.

Operations Center

The Operations Center (/ops) surfaces items requiring operator action, prioritized by severity. Failed runs, budget breaches, gateway outages, and escalated approvals all appear here.
  • Triage — Items are sorted by severity (critical first, then warning, then info)
  • Acknowledge — Click to suppress an item temporarily; it reappears if the condition persists
  • Add notes — Attach context for other operators investigating the same issue
  • Navigate — Each item links directly to the source resource (run, agent, gateway)
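The severity ordering used for triage can be sketched as a simple comparator. This is an illustrative sketch only — `AttentionItem`, `Severity`, and `triageOrder` are hypothetical names, not the Burgundy API — assuming newer items rank first within a severity tier:

```typescript
// Illustrative only: Burgundy's internal types are not public.
type Severity = "critical" | "warning" | "info";

interface AttentionItem {
  id: string;
  severity: Severity;
  createdAt: number; // epoch ms
}

const SEVERITY_RANK: Record<Severity, number> = {
  critical: 0,
  warning: 1,
  info: 2,
};

// Critical first, then warning, then info; newest first within a tier.
function triageOrder(items: AttentionItem[]): AttentionItem[] {
  return [...items].sort(
    (a, b) =>
      SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity] ||
      b.createdAt - a.createdAt,
  );
}
```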

Attention Item Types

  • Critical — Gateway offline, infrastructure failures; these affect all agents on the gateway
  • Warning — Run failures, budget pauses, gateway degradation, policy violations
  • Info — Aging approvals, unclassified failures

Correlated Changes

A 6-hour lookback surfaces deployments, config changes, and policy updates that may explain the issue.
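The lookback amounts to a window filter over recent changes. A minimal sketch, assuming hypothetical names (`ChangeEvent`, `correlatedChanges`) and epoch-millisecond timestamps:

```typescript
// Illustrative sketch of the correlated-changes lookback; not the platform API.
interface ChangeEvent {
  kind: "deployment" | "config" | "policy";
  at: number; // epoch ms
}

const LOOKBACK_MS = 6 * 60 * 60 * 1000; // 6 hours

// Keep only changes that landed within the lookback window before the incident.
function correlatedChanges(
  changes: ChangeEvent[],
  incidentAt: number,
): ChangeEvent[] {
  return changes.filter(
    (c) => c.at <= incidentAt && incidentAt - c.at <= LOOKBACK_MS,
  );
}
```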

Event Timeline

The event timeline (/observe) provides a unified view of all platform events. It merges Convex platform events with Bridge SSE events into a single filterable stream.
  • Filter by category — executions, governance decisions, lifecycle transitions, security events
  • Filter by time range — 5m, 15m, 1h, 24h, or all
  • Filter by resource — scope to a specific run, agent, or gateway
  • Filter by actor — see all actions by a specific user or agent
  • Search — free-text search across event payloads
Events link directly to their source resource. Click any event to see the full detail payload.
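The filters above compose into a single predicate over the merged stream. A sketch under assumed names — the real Convex and Bridge payload shapes will differ:

```typescript
// Hypothetical event shape; the actual Convex/Bridge payloads will differ.
interface PlatformEvent {
  category: "execution" | "governance" | "lifecycle" | "security";
  at: number;        // epoch ms
  resourceId: string;
  actorId: string;
  payload: string;   // serialized detail payload
}

interface TimelineFilter {
  category?: PlatformEvent["category"];
  sinceMs?: number;   // window back from now: 5m, 15m, 1h, 24h; omit for "all"
  resourceId?: string;
  actorId?: string;
  search?: string;    // free-text match against the payload
}

// An event passes only if it satisfies every filter that is set.
function matches(e: PlatformEvent, f: TimelineFilter, now: number): boolean {
  if (f.category && e.category !== f.category) return false;
  if (f.sinceMs !== undefined && now - e.at > f.sinceMs) return false;
  if (f.resourceId && e.resourceId !== f.resourceId) return false;
  if (f.actorId && e.actorId !== f.actorId) return false;
  if (f.search && !e.payload.toLowerCase().includes(f.search.toLowerCase()))
    return false;
  return true;
}
```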

Traces

Step-level execution traces for debugging workflow runs. For each step, the trace shows:
  • Timing breakdown — Queued, started, and completed timestamps with duration
  • Conversation log — The agent’s full message transcript for that step
  • Tool calls — Every tool invocation with arguments and results
  • Outputs — The step’s produced outputs and artifacts
  • Failure detail — Error message and stack trace for failed steps
Traces are essential for diagnosing why a step failed, took too long, or produced unexpected results.
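The timing breakdown follows directly from the three timestamps. A minimal sketch with illustrative names (`StepTrace`, `timingBreakdown` are not the Burgundy API):

```typescript
// Sketch of deriving the timing breakdown from step timestamps.
interface StepTrace {
  queuedAt: number;    // epoch ms
  startedAt: number;
  completedAt: number;
}

interface TimingBreakdown {
  queueMs: number; // time spent waiting before execution
  runMs: number;   // time spent actually executing
  totalMs: number; // queued-to-completed duration
}

function timingBreakdown(step: StepTrace): TimingBreakdown {
  return {
    queueMs: step.startedAt - step.queuedAt,
    runMs: step.completedAt - step.startedAt,
    totalMs: step.completedAt - step.queuedAt,
  };
}
```

A large `queueMs` relative to `runMs` points at scheduling pressure rather than a slow step.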

Copilot Analysis

AI-powered run analysis for failed or anomalous runs. When a run fails or exhibits unusual behavior, trigger Copilot analysis from the run detail page. Copilot examines the execution trace — step sequence, tool calls, errors, timing — and suggests root causes. Analysis runs in a dedicated thread and produces a structured report with findings and suggested next steps. Useful when the failure isn’t obvious from the trace alone.

Operator Notes

Attach notes to any resource — a run, an agent, a factory. Notes are shared across the team for institutional context and persist across sessions. Use notes for:
  • Post-mortem annotations on failed runs
  • Configuration rationale on agent settings
  • Handoff context when transferring operational responsibility
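Conceptually, notes are a team-shared map from resource id to a list of annotations. A hedged sketch — `NoteStore` and its methods are hypothetical, not the Burgundy API:

```typescript
// Illustrative note store keyed by resource id; not the Burgundy API.
interface OperatorNote {
  author: string;
  body: string;
  createdAt: number; // epoch ms
}

// Notes attach to any resource id (run, agent, factory) and are
// visible to the whole team.
class NoteStore {
  private notes = new Map<string, OperatorNote[]>();

  add(resourceId: string, note: OperatorNote): void {
    const existing = this.notes.get(resourceId) ?? [];
    this.notes.set(resourceId, [...existing, note]);
  }

  list(resourceId: string): OperatorNote[] {
    return this.notes.get(resourceId) ?? [];
  }
}
```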