Agent Execution: run / attempt / subscribe / fallback

This chapter covers the execution “engine”: how one request becomes a traceable runId with well-defined retry and fallback behavior.

Key boundaries

  • run: schedules candidate provider/model attempts and aggregates the outcome
  • attempt: performs one real execution (prompt/session/tools/output)
  • subscribe: translates streaming events into deliverable output
  • fallback: decides what is retryable and how to record failures

A realistic end-to-end mental model

Inbound/request → session lane → global lane → attempt → subscribe (streams) → (optional) compaction retry → finalize → (if needed) auth rotation / recovery / model fallback.

Observable baseline (run these to watch a request move through the pipeline):

  • openclaw gateway
  • openclaw logs --follow
  • openclaw dashboard

1) run: orchestration, not execution details

Keep run focused on decisions:

  • two-level queueing (session ordering + global limits)
  • context window guard (block early if the context window is too small)
  • resilience sequencing (auth rotation, context recovery, fallback)
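
The two-level queueing above can be sketched as a per-session lane nested inside a global lane. This is a minimal illustration, not the code in run.ts: the names Lane and enqueueRun and the global limit of 2 are assumptions made for the example.

```typescript
// Hypothetical sketch: Lane, enqueueRun, and the limits are illustrative only.
type Task<T> = () => Promise<T>;

class Lane {
  private active = 0;
  private readonly waiting: Array<() => void> = [];

  constructor(private readonly limit: number) {}

  async run<T>(task: Task<T>): Promise<T> {
    if (this.active < this.limit) {
      this.active++;
    } else {
      // Wait until a finishing task hands its slot over directly (FIFO).
      await new Promise<void>((resolve) => {
        this.waiting.push(() => resolve());
      });
    }
    try {
      return await task();
    } finally {
      const next = this.waiting.shift();
      if (next) {
        next(); // slot is transferred, so the active count stays the same
      } else {
        this.active--;
      }
    }
  }
}

// Assumed global limit of 2 concurrent runs across all sessions.
const globalLane = new Lane(2);
const sessionLanes = new Map<string, Lane>();

function enqueueRun<T>(sessionId: string, task: Task<T>): Promise<T> {
  let lane = sessionLanes.get(sessionId);
  if (!lane) {
    lane = new Lane(1); // one run at a time per session keeps ordering
    sessionLanes.set(sessionId, lane);
  }
  // Session lane first, then global lane, matching the pipeline order.
  return lane.run(() => globalLane.run(task));
}
```

Handing the slot directly to the next waiter, rather than decrementing and re-checking, prevents a newly arriving task from slipping in ahead of a queued one and exceeding the limit.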

2) attempt: one real “transaction” (must be cleanup-safe)

Core behaviors:

  • sanitize/validate/limit history in a fixed order
  • subscribe before prompting (don’t drop events)
  • allow before_agent_start hooks to prepend context
  • wait for compaction retry to settle before returning “done”
  • always clean up in finally (unsubscribe, clear active run, dispose session)
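
The ordering constraints above (subscribe first, await the compaction retry, clean up in finally) might look like the following sketch. The Session and AttemptDeps interfaces are hypothetical stand-ins; the real session API used by attempt.ts may differ.

```typescript
// Hypothetical shapes for illustration; not the real session API.
interface Session {
  subscribe(onEvent: (event: string) => void): () => void; // returns unsubscribe
  prompt(text: string): Promise<void>;
  dispose(): void;
}

interface AttemptDeps {
  session: Session;
  waitForCompactionRetry: () => Promise<void>;
  clearActiveRun: () => void;
}

async function runAttempt(deps: AttemptDeps, text: string): Promise<string[]> {
  const events: string[] = [];
  // Subscribe before prompting so events emitted during prompt() are not dropped.
  const unsubscribe = deps.session.subscribe((event) => {
    events.push(event);
  });
  try {
    await deps.session.prompt(text);
    // Do not report "done" while a compaction retry is still in flight.
    await deps.waitForCompactionRetry();
    return events;
  } finally {
    // Runs on success, failure, and abort alike.
    unsubscribe();
    deps.clearActiveRun();
    deps.session.dispose();
  }
}
```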

3) subscribe: turn low-level events into stable signals

At minimum, keep separate streams for:

  • assistant text deltas
  • tool execution events + summaries
  • compaction lifecycle (to implement waitForCompactionRetry)
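
A minimal sketch of keeping those three streams separate follows. The event names and the createSubscriber factory are assumptions for the example; the events handled in pi-embedded-subscribe.ts may be named and shaped differently. Here waitForCompactionRetry simply resolves once no compaction is in flight.

```typescript
// Hypothetical event shapes; real event names may differ.
type AgentEvent =
  | { type: "text_delta"; delta: string }
  | { type: "tool_end"; toolName: string; summary: string }
  | { type: "compaction_start" }
  | { type: "compaction_end" };

function createSubscriber() {
  let text = "";
  const toolSummaries: string[] = [];
  let compactionInFlight = false;
  let waiters: Array<() => void> = [];

  function handle(event: AgentEvent): void {
    switch (event.type) {
      case "text_delta":
        text += event.delta; // assistant text stream
        break;
      case "tool_end":
        toolSummaries.push(`${event.toolName}: ${event.summary}`); // tool stream
        break;
      case "compaction_start":
        compactionInFlight = true; // compaction lifecycle stream
        break;
      case "compaction_end":
        compactionInFlight = false;
        for (const wake of waiters) wake();
        waiters = [];
        break;
    }
  }

  // Resolves immediately when idle, otherwise when the compaction settles.
  function waitForCompactionRetry(): Promise<void> {
    if (!compactionInFlight) return Promise.resolve();
    return new Promise<void>((resolve) => {
      waiters.push(() => resolve());
    });
  }

  return {
    handle,
    waitForCompactionRetry,
    getText: () => text,
    getToolSummaries: () => [...toolSummaries],
  };
}
```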

4) runs: an active-run registry for steer/abort (with race safety)

  • support steer/abort via an active handle
  • clear only when the stored handle matches, so a finishing run cannot delete a newer run’s entry
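
Handle matching can be illustrated with a small registry keyed by sessionId. The RunHandle shape and the function names below are assumptions for this sketch, not the registry's actual exports.

```typescript
// Hypothetical handle shape; the real handle may carry more fields.
interface RunHandle {
  steer(text: string): void;
  abort(): void;
}

const activeRuns = new Map<string, RunHandle>();

function setActiveRun(sessionId: string, handle: RunHandle): void {
  activeRuns.set(sessionId, handle);
}

function clearActiveRun(sessionId: string, handle: RunHandle): boolean {
  // Delete only if the registry still points at this run's own handle.
  // A late cleanup from an older run must not remove a newer run's entry.
  if (activeRuns.get(sessionId) !== handle) return false;
  activeRuns.delete(sessionId);
  return true;
}

function abortRun(sessionId: string): boolean {
  const handle = activeRuns.get(sessionId);
  if (!handle) return false;
  handle.abort();
  return true;
}
```

Comparing by handle identity, not just by sessionId, is what makes the cleanup race-safe: an older run's finally block becomes a no-op once a newer run has registered.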

5) fallback: structured attempts, not just “try another model”

  • normalize retryable failures into a stable reason/status/code
  • preserve abort semantics (don’t fall back on user abort)
  • emit structured attempts for observability
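
The three rules above could be combined roughly as follows. Everything here is a hedged sketch: the Candidate and FailedAttempt types, the status-code mapping in classify, and the use of the error name "AbortError" to detect a user abort are assumptions for illustration, not the behavior of model-fallback.ts.

```typescript
// Hypothetical types and classification rules, for illustration only.
interface Candidate {
  provider: string;
  model: string;
}

interface FailedAttempt extends Candidate {
  reason: string;
  status?: number;
}

function isAbort(err: unknown): boolean {
  return err instanceof Error && err.name === "AbortError";
}

// Normalize a raw error into a stable reason/status, or null if not retryable.
function classify(err: unknown): { reason: string; status?: number } | null {
  const status = (err as { status?: number } | null)?.status;
  if (status === 429) return { reason: "rate_limit", status };
  if (status === 401 || status === 403) return { reason: "auth", status };
  if (typeof status === "number" && status >= 500) {
    return { reason: "server_error", status };
  }
  return null;
}

async function runWithFallback<T>(
  candidates: Candidate[],
  run: (candidate: Candidate) => Promise<T>,
): Promise<{ result: T; used: Candidate; attempts: FailedAttempt[] }> {
  const attempts: FailedAttempt[] = [];
  for (const candidate of candidates) {
    try {
      const result = await run(candidate);
      return { result, used: candidate, attempts };
    } catch (err) {
      if (isAbort(err)) throw err; // never fall back on user abort
      const failure = classify(err);
      if (!failure) throw err; // non-retryable: surface immediately
      attempts.push({ ...candidate, ...failure }); // structured record
    }
  }
  throw new Error(`All candidates failed: ${JSON.stringify(attempts)}`);
}
```

Returning the attempts list alongside the result means a successful fallback still leaves a record of which candidates failed and why.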

Code entry points (optional)

  • src/auto-reply/reply/agent-runner.ts
  • src/agents/pi-embedded-runner/run.ts
  • src/agents/pi-embedded-runner/run/attempt.ts
  • src/agents/pi-embedded-subscribe.ts
  • src/agents/model-fallback.ts

Acceptance checklist

  1. One active run per sessionId; cleanup uses handle matching.
  2. Compaction retries can be awaited; no early “done”.
  3. Abort leaves no active run or leaked subscriptions.
  4. Fallback emits structured attempts and preserves abort semantics.