Agent Execution (Implementation): run / attempt / subscribe / active runs

This page is an implementation guide for turning the execution framework into a state machine and observable signals.

Entry points (concept):

Code entry points (optional)

  • src/auto-reply/reply/agent-runner.ts
  • src/agents/pi-embedded-runner/run.ts
  • src/agents/pi-embedded-runner/run/attempt.ts
  • src/agents/pi-embedded-subscribe.ts
  • src/agents/pi-embedded-runner/runs.ts

State diagram (keep these boundaries)

idle → queued(session lane) → queued(global lane) → attempting → streaming → compacting(optional) → completed/failed → idle

Control substates:

  • aborting (user abort / timeout)
  • waiting_compaction_retry (subscription waits for stability)
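The diagram can be enforced with a small transition table. The following is a minimal sketch, not the code in the entry points listed above: the type name RunState, the snake_case state names, and the helpers canTransition / transition are all hypothetical. It also assumes that an attempt may fail before it starts streaming, and it leaves out the control substates (aborting, waiting_compaction_retry), which would likely live in a separate field.

```typescript
// Hypothetical state names mirroring the diagram above.
type RunState =
  | "idle"
  | "queued_session"
  | "queued_global"
  | "attempting"
  | "streaming"
  | "compacting"
  | "completed"
  | "failed";

// Allowed edges. "attempting -> failed" is an assumption; the diagram
// only shows failure at the end of the chain.
const TRANSITIONS: Record<RunState, readonly RunState[]> = {
  idle: ["queued_session"],
  queued_session: ["queued_global"],
  queued_global: ["attempting"],
  attempting: ["streaming", "failed"],
  streaming: ["compacting", "completed", "failed"], // compacting is optional
  compacting: ["completed", "failed"],
  completed: ["idle"],
  failed: ["idle"],
};

function canTransition(from: RunState, to: RunState): boolean {
  return TRANSITIONS[from].indexOf(to) !== -1;
}

// Returns the new state, or throws so that illegal jumps surface early.
function transition(from: RunState, to: RunState): RunState {
  if (!canTransition(from, to)) {
    throw new Error("illegal transition: " + from + " -> " + to);
  }
  return to;
}
```

Keeping the table as data makes the boundaries easy to review: adding an edge is a one-line change that shows up clearly in a diff.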

Three mechanics worth copying

1) Two-level queuing: session ordering + global limiting

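If you only have a global queue, runs from the same session can overlap or reorder, and you’ll eventually see ordering and cross-talk issues. Queue per session first to serialize each session’s runs, then pass through a global lane that caps total concurrency.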
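One way to picture the two lanes is a scheduler where a session can have at most one task in the global lane or running, and the global lane admits at most N tasks at once. This is a sketch under assumptions: TwoLevelQueue, Task, and maxConcurrent are hypothetical names, and tasks signal completion through a done callback rather than a promise to keep the example small.

```typescript
// Hypothetical task shape: start() is called when the task may run,
// and the task calls done() when it has finished.
type Task = { sessionKey: string; start: (done: () => void) => void };

class TwoLevelQueue {
  private sessionQueues = new Map<string, Task[]>(); // session lane (waiting)
  private busySessions = new Set<string>(); // sessions with a task in the global lane or running
  private globalQueue: Task[] = []; // global lane (waiting for a slot)
  private running = 0;
  private readonly maxConcurrent: number;

  constructor(maxConcurrent: number) {
    this.maxConcurrent = maxConcurrent;
  }

  enqueue(task: Task): void {
    if (this.busySessions.has(task.sessionKey)) {
      // Session lane: wait behind the earlier task of the same session.
      const q = this.sessionQueues.get(task.sessionKey) ?? [];
      q.push(task);
      this.sessionQueues.set(task.sessionKey, q);
      return;
    }
    this.busySessions.add(task.sessionKey);
    this.globalQueue.push(task);
    this.pump();
  }

  // Start tasks from the global lane while slots are free.
  private pump(): void {
    while (this.running < this.maxConcurrent && this.globalQueue.length > 0) {
      const task = this.globalQueue.shift()!;
      this.running++;
      let finished = false;
      task.start(() => {
        if (finished) return; // ignore duplicate done() calls
        finished = true;
        this.running--;
        this.release(task.sessionKey);
        this.pump();
      });
    }
  }

  // Promote the session's next task to the global lane, or mark the session free.
  private release(sessionKey: string): void {
    const next = this.sessionQueues.get(sessionKey)?.shift();
    if (next) {
      this.globalQueue.push(next);
    } else {
      this.busySessions.delete(sessionKey);
      this.sessionQueues.delete(sessionKey);
    }
  }
}
```

With this shape, a second task for session A never starts before the first one calls done(), regardless of how many global slots are free.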

2) Active run registry: a minimal control plane for steer/abort

Make races predictable:

  • clear must match on the run handle, so that a stale finally block from an old run cannot delete a newer run’s entry
  • don’t accept steer while compacting (avoid transcript inconsistency)
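Both rules fit in a few lines. The sketch below is illustrative only: RunHandle, setActiveRun, clearActiveRun, steerActiveRun, and abortActiveRun are hypothetical names, and the real registry in runs.ts may be shaped differently.

```typescript
// Hypothetical handle exposed by a running attempt.
interface RunHandle {
  readonly runId: string;
  isStreaming(): boolean;
  isCompacting(): boolean;
  queueMessage(text: string): void;
  abort(): void;
}

// One entry per sessionKey.
const activeRuns = new Map<string, RunHandle>();

function setActiveRun(sessionKey: string, handle: RunHandle): void {
  activeRuns.set(sessionKey, handle);
}

// Handle-matched clear: only the run that owns the entry may remove it.
// A late finally block from an older run gets false and changes nothing.
function clearActiveRun(sessionKey: string, handle: RunHandle): boolean {
  if (activeRuns.get(sessionKey) !== handle) return false;
  activeRuns.delete(sessionKey);
  return true;
}

// Steer returns false instead of throwing when the run cannot accept input.
function steerActiveRun(sessionKey: string, text: string): boolean {
  const handle = activeRuns.get(sessionKey);
  if (!handle) return false; // no active run
  if (!handle.isStreaming()) return false; // not accepting input yet or anymore
  if (handle.isCompacting()) return false; // transcript is being rewritten
  handle.queueMessage(text);
  return true;
}

function abortActiveRun(sessionKey: string): boolean {
  const handle = activeRuns.get(sessionKey);
  if (!handle) return false;
  handle.abort();
  return true;
}
```

Returning a boolean from every control call keeps failures explainable: the caller can log which precondition was not met rather than catching an exception from deep inside the run.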

3) Subscription fan-out: text stream / tool stream / compaction stream

Keep tool events separate from assistant deltas to preserve auditability and UI clarity.
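A tagged event type is one way to keep the three streams apart while sharing a single subscription. This is a sketch with assumed names and fields (RunEvent, RunSubscriber, RunEventFanOut, delta, toolName, willRetry); the actual event shapes in pi-embedded-subscribe.ts are not shown on this page.

```typescript
// Hypothetical event shapes, one per stream.
type TextEvent = { stream: "text"; runId: string; delta: string };
type ToolEvent = { stream: "tool"; runId: string; toolName: string; phase: "start" | "end" };
type CompactionEvent = { stream: "compaction"; runId: string; phase: "start" | "end"; willRetry: boolean };
type RunEvent = TextEvent | ToolEvent | CompactionEvent;

// A subscriber opts in to the streams it cares about.
interface RunSubscriber {
  onText?(event: TextEvent): void;
  onTool?(event: ToolEvent): void;
  onCompaction?(event: CompactionEvent): void;
}

class RunEventFanOut {
  private subscribers = new Set<RunSubscriber>();

  // Returns an unsubscribe function so cleanup on abort is a single call.
  subscribe(subscriber: RunSubscriber): () => void {
    this.subscribers.add(subscriber);
    return () => {
      this.subscribers.delete(subscriber);
    };
  }

  // Route each event to the matching callback only; tool events never
  // reach onText, so assistant text and tool activity stay separate.
  emit(event: RunEvent): void {
    this.subscribers.forEach((s) => {
      switch (event.stream) {
        case "text":
          s.onText?.(event);
          break;
        case "tool":
          s.onTool?.(event);
          break;
        case "compaction":
          s.onCompaction?.(event);
          break;
      }
    });
  }

  get subscriberCount(): number {
    return this.subscribers.size;
  }
}
```

A UI can then render onText deltas as the assistant message and onTool events in a separate activity log, while an audit sink subscribes to all three.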

Failure modes and troubleshooting

  • ACK but no output: verify the event stream (see Gateway protocol) and trace the run by runId (see Logging).
  • Steer intermittently fails: confirm the run is streaming and not compacting; validate handle-matched cleanup.

Acceptance checks

  1. One active run per sessionKey at any time.
  2. Abort leaves no active run or dangling subscriptions.
  3. “Completed” is not emitted while compaction retries are pending.
  4. Steer failures are explainable (return false), not random exceptions.
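Check 3 can be enforced with a small gate that holds back the completion signal until no compaction retry is outstanding. The sketch below is one possible shape, not the existing implementation: CompletionGate and its method names are hypothetical, and it assumes retries are counted explicitly by the subscription layer.

```typescript
// Hypothetical gate: emits "completed" once, and only when no
// compaction retry is pending.
class CompletionGate {
  private pendingRetries = 0;
  private finishRequested = false;
  private emitted = false;
  private readonly onCompleted: () => void;

  constructor(onCompleted: () => void) {
    this.onCompleted = onCompleted;
  }

  // Call when compaction schedules a retry.
  compactionRetryStarted(): void {
    this.pendingRetries++;
  }

  // Call when a scheduled retry has settled (finished or was cancelled).
  compactionRetrySettled(): void {
    if (this.pendingRetries > 0) this.pendingRetries--;
    this.maybeEmit();
  }

  // Call when the run believes it is done.
  requestCompleted(): void {
    this.finishRequested = true;
    this.maybeEmit();
  }

  private maybeEmit(): void {
    if (this.emitted || !this.finishRequested || this.pendingRetries > 0) return;
    this.emitted = true;
    this.onCompleted();
  }
}
```

A test for check 3 then reduces to: start a retry, request completion, assert nothing was emitted, settle the retry, assert exactly one emission.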