This page traces a workflow run from trigger through completion. It covers every phase the interpreter walks through, who owns each phase, and what happens when things go wrong.

⚙️ Execution Phases

WORKFLOW EXECUTION: 8 PHASES

  1. Run Start (Convex, workflowRuns.create): Creates the workflowRuns record with pinned YAML, version, gateway, role bindings, and status "running".
  2. Parse & Validate (Convex, interpreter.ts): Parses the YAML and validates it against the Zod schema: unique step IDs, valid deps, non-empty commands.
  3. DAG Resolution (Convex, resolveExecutionLayers()): Topological sort on step deps. Produces ordered execution layers; steps in a layer run in parallel.
  4. Governance Pipeline (Convex, evaluateGovernance()): 10-gate pipeline: health, agent, concurrency, rate, budget x2, trust x2, policy, approval. Each dispatch ends in pass, block, or hold; only a pass proceeds to phase 5.
  5. Runtime Execution (Gateway via Bridge, dispatch.ts → Bridge): Convex dispatches to Bridge. Bridge delegates to the adapter. Results flow back via callback.
  6. Variable Scope & Output (Convex, coerceOutputs()): Coerces outputs, validates them against the schema, and stores them for downstream $step_id.stdout references.
  7. Failure Handling (Convex, on_failure policy): fail (stop run), continue (skip), retry (3x backoff), retry_once_then_escalate.
  8. Completion (Convex, workflowRuns status): All steps resolved (completed, skipped, or failed with continue). Run status set to completed or failed.

📋 Phase Details

1. Run Start

A workflow run is created through one of two paths:
  • Manual trigger via an operator interface (public mutation)
  • Programmatic trigger via the Platform API or another Convex function
The run record captures:
  • The pinned YAML (immutable for the lifetime of the run)
  • Workflow version, gateway ID, role bindings from the active deployment
  • Runtime args passed by the caller
  • Initial status "running" and a startedAt timestamp
The Convex durable workflow ID is persisted back to the run record so that completion events can route to the correct interpreter instance.
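
As a rough illustration of what the run record holds, here is a minimal sketch. The field names, the `ActiveDeployment` type, and both helper functions are assumptions made for this example; they are not taken from the actual workflowRuns schema.

```typescript
// Hypothetical shape of a run record; every field name here is an assumption.
type RunStatus = "running" | "completed" | "failed";

interface WorkflowRunRecord {
  pinnedYaml: string; // immutable for the lifetime of the run
  version: number;
  gatewayId: string;
  roleBindings: Record<string, string>;
  args: Record<string, unknown>; // runtime args passed by the caller
  status: RunStatus;
  startedAt: number; // epoch milliseconds
  durableWorkflowId?: string; // filled in once the durable workflow exists
}

// Hypothetical view of the active deployment the run is pinned to.
interface ActiveDeployment {
  yaml: string;
  version: number;
  gatewayId: string;
  roleBindings: Record<string, string>;
}

// Build the initial record: pinned YAML, deployment metadata, status "running".
function buildRunRecord(
  deployment: ActiveDeployment,
  args: Record<string, unknown>,
  now: number,
): WorkflowRunRecord {
  return {
    pinnedYaml: deployment.yaml,
    version: deployment.version,
    gatewayId: deployment.gatewayId,
    roleBindings: { ...deployment.roleBindings },
    args,
    status: "running",
    startedAt: now,
  };
}

// Persist the durable workflow ID back onto the record so completion
// events can be routed to the right interpreter instance.
function attachDurableWorkflowId(
  run: WorkflowRunRecord,
  durableWorkflowId: string,
): WorkflowRunRecord {
  return { ...run, durableWorkflowId };
}
```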

2. Parse & Validate

Validation covers three categories:
| Category | Checks |
| --- | --- |
| Structural | Unique step IDs, non-empty steps array, snake_case naming |
| Cross-reference | Every depends_on target exists, every stdin reference points to a valid step, every phase reference maps to a declared phase |
| Completeness | Every non-approval, non-sub-workflow, non-library step has a command |
If the workflow references library steps (library_step field), the interpreter resolves them from the stepLibrary table and merges defaults — the workflow definition always wins for fields it explicitly sets.
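
The checks and the merge rule can be sketched in plain TypeScript. This is an illustration only: the real validation uses a Zod schema, and the `StepDef` shape, the `type` values, and both function names below are assumptions. The stdin and phase cross-reference checks are left out for brevity.

```typescript
// Hypothetical step shape; only the fields needed for the sketch.
interface StepDef {
  id: string;
  type?: "command" | "approval" | "sub_workflow";
  command?: string;
  depends_on?: string[];
  library_step?: string;
}

const SNAKE_CASE = /^[a-z][a-z0-9]*(_[a-z0-9]+)*$/;

// Returns a list of human-readable problems; an empty list means valid.
function validateStepsSketch(steps: StepDef[]): string[] {
  const errors: string[] = [];
  if (steps.length === 0) errors.push("steps array is empty");

  // Structural: unique, snake_case step IDs.
  const ids = new Set<string>();
  for (const step of steps) {
    if (!SNAKE_CASE.test(step.id)) errors.push(`step id is not snake_case: ${step.id}`);
    if (ids.has(step.id)) errors.push(`duplicate step id: ${step.id}`);
    ids.add(step.id);
  }

  for (const step of steps) {
    // Cross-reference: every depends_on target must exist.
    for (const dep of step.depends_on ?? []) {
      if (!ids.has(dep)) errors.push(`${step.id} depends on unknown step: ${dep}`);
    }
    // Completeness: approval, sub-workflow, and library steps need no command.
    const needsCommand =
      step.type !== "approval" &&
      step.type !== "sub_workflow" &&
      step.library_step === undefined;
    if (needsCommand && !step.command?.trim()) errors.push(`${step.id} has no command`);
  }
  return errors;
}

// Library defaults fill the gaps; fields the workflow sets explicitly win.
function mergeLibraryDefaults(defaults: Partial<StepDef>, step: StepDef): StepDef {
  const merged: { [key: string]: unknown } = { ...defaults };
  const explicit = step as unknown as { [key: string]: unknown };
  for (const key of Object.keys(explicit)) {
    const value = explicit[key];
    if (value !== undefined) merged[key] = value;
  }
  return merged as unknown as StepDef;
}
```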

3. DAG Resolution

The resolveExecutionLayers() function runs a topological sort variant:
  1. Infer dependencies from both explicit depends_on arrays and implicit stdin references (pattern: $step_id.stdout)
  2. Iterate until all steps are placed in layers. Each pass collects steps whose dependencies are all resolved into a new layer.
  3. Detect cycles — if a pass produces zero new resolvable steps while steps remain, the engine logs a circular-dependency error and forces remaining steps into a final layer.
Steps within a layer can execute concurrently. The number of layers determines the minimum sequential depth of the run.
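
The layering passes described above can be sketched as follows. This is a minimal illustration, not the actual resolveExecutionLayers() implementation: the `DagStep` shape and both function names are assumptions, and the real function may differ in details such as how it reports the cycle.

```typescript
// Hypothetical step shape; only the fields that affect ordering.
interface DagStep {
  id: string;
  depends_on?: string[];
  stdin?: string;
}

// Collect explicit depends_on entries plus implicit $step_id.stdout references.
function inferDependencies(step: DagStep): string[] {
  const deps = new Set<string>(step.depends_on ?? []);
  const pattern = /\$([a-z][a-z0-9_]*)\.stdout/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(step.stdin ?? "")) !== null) {
    deps.add(match[1]);
  }
  return Array.from(deps);
}

// Each pass gathers every step whose dependencies are already placed.
function resolveLayersSketch(steps: DagStep[]): string[][] {
  const depsById = new Map<string, string[]>();
  for (const step of steps) depsById.set(step.id, inferDependencies(step));

  const placed = new Set<string>();
  const layers: string[][] = [];
  let remaining = steps.map((step) => step.id);

  while (remaining.length > 0) {
    const layer = remaining.filter((id) =>
      (depsById.get(id) ?? []).every((dep) => placed.has(dep)),
    );
    if (layer.length === 0) {
      // No progress while steps remain: log and force them into a final layer.
      console.error(`circular dependency among: ${remaining.join(", ")}`);
      layers.push(remaining);
      break;
    }
    for (const id of layer) placed.add(id);
    remaining = remaining.filter((id) => !placed.has(id));
    layers.push(layer);
  }
  return layers;
}
```

For a workflow where `lint` depends on `fetch`, `unit_test` reads `$fetch.stdout`, and `report` depends on both, the sketch yields three layers, so the minimum sequential depth is three.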

4. Governance Pipeline

Every step dispatch runs through the full governance pipeline. See Governance Pipeline for the complete gate-by-gate breakdown. The three possible outcomes:
| Disposition | Meaning | Effect |
| --- | --- | --- |
| pass | All gates passed | Step is dispatched to Bridge |
| block | A gate rejected the dispatch | Step is marked failed with the gate’s error code |
| hold | A policy requires approval | An approval record is created; the interpreter sleeps until resolved |
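
The mapping from disposition to effect can be sketched like this. The three gates, the `DispatchContext` fields, the `gateway_unhealthy` and `unknown_gate_error` codes, and the rule that the first non-pass gate decides the outcome are all assumptions for illustration; only the three dispositions and the agent_busy code come from this page.

```typescript
type Disposition = "pass" | "block" | "hold";

interface GateResult {
  disposition: Disposition;
  errorCode?: string; // set when a gate blocks
}

// Hypothetical context; the real gates read far more state than this.
interface DispatchContext {
  gatewayHealthy: boolean;
  agentBusy: boolean;
  needsApproval: boolean;
}

type Gate = (ctx: DispatchContext) => GateResult;

const PASS: GateResult = { disposition: "pass" };

// Three stand-in gates out of the ten named in the overview.
const sketchGates: Gate[] = [
  (ctx) => (ctx.gatewayHealthy ? PASS : { disposition: "block", errorCode: "gateway_unhealthy" }),
  (ctx) => (ctx.agentBusy ? { disposition: "block", errorCode: "agent_busy" } : PASS),
  (ctx) => (ctx.needsApproval ? { disposition: "hold" } : PASS),
];

// Assumption: gates run in order and the first non-pass result wins.
function evaluateGatesSketch(gates: Gate[], ctx: DispatchContext): GateResult {
  for (const gate of gates) {
    const result = gate(ctx);
    if (result.disposition !== "pass") return result;
  }
  return PASS;
}

type StepAction =
  | { kind: "dispatch" }
  | { kind: "fail"; errorCode: string }
  | { kind: "await_approval" };

// Translate a disposition into the effect listed in the table.
function toStepAction(result: GateResult): StepAction {
  switch (result.disposition) {
    case "pass":
      return { kind: "dispatch" };
    case "block":
      return { kind: "fail", errorCode: result.errorCode ?? "unknown_gate_error" };
    case "hold":
      return { kind: "await_approval" };
  }
}
```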

5. Runtime Execution

Convex dispatches via dispatch.ts:
  1. Resolve the adapter for the target gateway
  2. Build the dispatch payload with command, stdin, context, timeout, and callback URL
  3. POST to Bridge /api/v2/dispatch
  4. Bridge delegates to adapter.execute() with the normalized ExecutionRequest
  5. The adapter runs the task on the gateway runtime
  6. Bridge delivers the ExecutionResult to the Convex callback URL
Dispatch is fire-and-forget from Convex’s perspective. The interpreter moves on to dispatch other ready steps and waits for completion events.
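
To make steps 2 and 3 concrete, here is a sketch of assembling the dispatch request. The payload field names, the `BridgeRequest` type, and the `buildDispatchRequest` helper are assumptions; only the five payload parts and the /api/v2/dispatch path come from the list above.

```typescript
// Hypothetical payload; the field names are assumptions.
interface DispatchPayload {
  command: string;
  stdin?: string;
  context: Record<string, unknown>;
  timeoutMs: number;
  callbackUrl: string; // where Bridge delivers the ExecutionResult
}

interface BridgeRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build the POST request without sending it, so the shape is easy to inspect.
function buildDispatchRequest(bridgeBaseUrl: string, payload: DispatchPayload): BridgeRequest {
  const base = bridgeBaseUrl.replace(/\/+$/, ""); // tolerate a trailing slash
  return {
    url: `${base}/api/v2/dispatch`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  };
}
```

Because the result arrives later at the callback URL, the caller only needs the POST to be accepted; it does not wait on the request for the step's output.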

6. Variable Scope

Step outputs are accessible to downstream steps via the $step_id.stdout reference pattern. The interpreter:
  1. Coerces outputs to expected types using coerceOutputs()
  2. Validates outputs against the step’s declared output schema using validateOutputs()
  3. Evaluates success criteria if the step defines them using evaluateSuccessCriteria()
  4. Stores outputs for downstream reference
Condition expressions on steps are evaluated against the accumulated output scope.
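
The $step_id.stdout reference pattern can be illustrated with a small substitution helper. The `OutputScope` type and the `interpolateStdout` function are hypothetical names for this sketch, and throwing on an unknown step is an assumption; the page does not say how the interpreter handles a reference with no stored output.

```typescript
// Hypothetical scope: stored stdout keyed by step ID.
type OutputScope = Map<string, { stdout: string }>;

// Replace every $step_id.stdout reference with the stored output.
function interpolateStdout(template: string, outputs: OutputScope): string {
  return template.replace(
    /\$([a-z][a-z0-9_]*)\.stdout/g,
    (_match: string, stepId: string) => {
      const entry = outputs.get(stepId);
      if (entry === undefined) {
        // Assumption: a reference to a step with no stored output is an error.
        throw new Error(`no stored output for step "${stepId}"`);
      }
      return entry.stdout;
    },
  );
}

const exampleOutputs: OutputScope = new Map([
  ["fetch_data", { stdout: "42 rows" }],
]);
const resolvedInput = interpolateStdout("summary: $fetch_data.stdout", exampleOutputs);
```

In this example `resolvedInput` is `"summary: 42 rows"`.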

7. Failure Handling

Each step declares an on_failure policy:
| Policy | Behavior |
| --- | --- |
| fail | Stop the entire run immediately |
| continue | Mark the step as skipped, advance the DAG |
| retry | Retry up to 3 times with exponential backoff (1s, 2s, 4s) |
| retry_once_then_escalate | Retry once, then create an escalation approval |
For retryable governance gate failures (e.g., agent_busy), the interpreter also retries with backoff before applying the step’s failure policy.
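
The policy table can be expressed as a small decision function. This is a sketch under stated assumptions: the `FailureAction` type and `nextFailureAction` name are hypothetical, the 1s delay before the single retry of retry_once_then_escalate is a guess, and stopping the run once the three retries are exhausted is an assumption the page does not confirm.

```typescript
type OnFailure = "fail" | "continue" | "retry" | "retry_once_then_escalate";

type FailureAction =
  | { kind: "stop_run" }
  | { kind: "skip_step" }
  | { kind: "retry"; delayMs: number }
  | { kind: "escalate" };

const MAX_RETRIES = 3;
const BASE_DELAY_MS = 1000;

// retriesSoFar counts the retries already attempted for this step.
function nextFailureAction(policy: OnFailure, retriesSoFar: number): FailureAction {
  switch (policy) {
    case "fail":
      return { kind: "stop_run" };
    case "continue":
      return { kind: "skip_step" };
    case "retry":
      // Exponential backoff: 1s, 2s, 4s for retries 1, 2, 3.
      return retriesSoFar < MAX_RETRIES
        ? { kind: "retry", delayMs: BASE_DELAY_MS * 2 ** retriesSoFar }
        : { kind: "stop_run" }; // assumption: exhausted retries stop the run
    case "retry_once_then_escalate":
      return retriesSoFar < 1
        ? { kind: "retry", delayMs: BASE_DELAY_MS } // assumption: 1s delay
        : { kind: "escalate" };
  }
}
```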

8. Completion

The interpreter finalizes the run once every step is resolved. A step is “resolved” when it is in one of these terminal states: completed, skipped, failed (with continue policy), or cancelled. The run status is then set to completed or failed; if any step failed, the status reflects the worst outcome.
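
A minimal sketch of the resolved check and the final status follows. The `StepState` shape and both function names are hypothetical, and reading "worst outcome" as "failed if any step failed" is an interpretation of the sentence above.

```typescript
type StepStatus = "pending" | "running" | "completed" | "skipped" | "failed" | "cancelled";

// Hypothetical per-step state tracked by the interpreter.
interface StepState {
  status: StepStatus;
  onFailure?: "fail" | "continue" | "retry" | "retry_once_then_escalate";
}

// Terminal states: completed, skipped, cancelled, or failed with continue.
function isResolved(step: StepState): boolean {
  switch (step.status) {
    case "completed":
    case "skipped":
    case "cancelled":
      return true;
    case "failed":
      return step.onFailure === "continue";
    default:
      return false;
  }
}

// Returns null while work remains; otherwise the worst outcome across steps.
function finalRunStatus(steps: StepState[]): "completed" | "failed" | null {
  if (!steps.every(isResolved)) return null;
  return steps.some((step) => step.status === "failed") ? "failed" : "completed";
}
```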

🔐 Ownership Matrix

| Phase | Owner | Key file |
| --- | --- | --- |
| Run start | Convex | convex/workflowRuns.ts |
| Parse & validate | Convex | convex/interpreter.ts, packages/shared/src/lobsterx-schema.ts |
| DAG resolution | Convex | convex/interpreter.ts |
| Governance | Convex | convex/lib/governanceEngine.ts, convex/lib/gates.ts |
| Runtime execution | Gateway (via Bridge) | convex/dispatch.ts, packages/bridge/src/server.ts |
| Variable scope | Convex | convex/validation.ts |
| Failure handling | Convex | convex/interpreter.ts |
| Completion | Convex | convex/interpreter.ts, convex/workflowRuns.ts |

🛡️ Execution Hardening

Two cron jobs protect against stalled executions:
| Cron | Frequency | Purpose |
| --- | --- | --- |
| Workflow watchdog | Every 2 minutes | Detects runs where all steps resolved but the interpreter stalled. Resumes the workflow. |
| Step timeout enforcer | Every 60 seconds | Checks running steps against their timeout. Marks timed-out steps as failed and applies their on_failure policy. |
Step timeouts are enforced by Convex, not by the gateway. Even if a gateway hangs indefinitely, Convex will detect the timeout and fail the step.
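
The timeout check itself is simple to sketch. The `RunningStep` shape and the `findTimedOutSteps` name are assumptions for illustration; the actual enforcer runs as a Convex cron job and also marks the steps failed and applies their on_failure policy.

```typescript
// Hypothetical view of a step that is currently running.
interface RunningStep {
  id: string;
  startedAt: number; // epoch milliseconds
  timeoutMs: number;
}

// Return the IDs of steps that have been running longer than their timeout.
function findTimedOutSteps(running: RunningStep[], now: number): string[] {
  return running
    .filter((step) => now - step.startedAt > step.timeoutMs)
    .map((step) => step.id);
}
```

Because the comparison uses only timestamps stored on the Convex side, it works even when the gateway never responds.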
See also: Governance Pipeline | Approval Lifecycle | Convex Engine