System architecture

This page gives you a “global map” of OpenClaw so you can locate issues quickly when reading logs, events, and configuration.

Overview (ingress → control plane → execution)

OpenClaw is best understood as a “multi-ingress, single-kernel” runtime:

  • Ingress: Channels (WhatsApp/Telegram/Discord…) and automation entry points (Webhooks/Cron…)
  • Control plane: Gateway (WebSocket/HTTP APIs, auth, routing, state/event broadcast)
  • Execution plane: Agent (run/attempt lifecycle, lane/queue concurrency, streaming)
  • Capability layer: Tools (Browser/Exec/Web…) and Providers (models + failover)
  • Data layer: Sessions, Media, Config, logging/auditing

Key components

Gateway (control plane)

Gateway is the long-running core process. Typical responsibilities:

  • Manage channel/provider connections and their lifecycle
  • Expose WebSocket/HTTP APIs and broadcast events
  • Enforce auth boundaries (token / pairing / remote access)
  • Coordinate sessions & concurrency (sessionKey / lane / queue)

Channels (message ingress)

Channels are where messages enter the system. Platforms differ in how they connect and how they format messages, but every channel should converge on the same pipeline: ingress → routing → session.
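
To make the "converge to the same pipeline" idea concrete, here is a minimal sketch of normalizing two platform-shaped payloads into one common envelope. Every name in it (InboundMessage, RawTelegramLike, RawDiscordLike, and the two adapter functions) is hypothetical and chosen for illustration; none of it is OpenClaw's actual schema or API.

```typescript
// Hypothetical common envelope; not OpenClaw's real message type.
interface InboundMessage {
  channel: string;
  accountId: string;
  senderId: string;
  groupId: string | null; // null for direct messages
  text: string;
}

// Made-up raw payloads, loosely shaped like what two platforms might deliver.
interface RawTelegramLike {
  chat: { id: number; type: "private" | "group" };
  from: { id: number };
  text: string;
}
interface RawDiscordLike {
  guildId: string | null;
  channelId: string;
  authorId: string;
  content: string;
}

// Each adapter hides platform differences behind the same output shape.
function fromTelegramLike(accountId: string, raw: RawTelegramLike): InboundMessage {
  return {
    channel: "telegram",
    accountId,
    senderId: String(raw.from.id),
    groupId: raw.chat.type === "group" ? String(raw.chat.id) : null,
    text: raw.text,
  };
}

function fromDiscordLike(accountId: string, raw: RawDiscordLike): InboundMessage {
  return {
    channel: "discord",
    accountId,
    senderId: raw.authorId,
    groupId: raw.guildId === null ? null : raw.channelId,
    text: raw.content,
  };
}
```

Once both adapters return the same shape, routing and session handling only need to understand one message type.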

Routing + sessionKey (how messages get “bucketed”)

The system needs a stable mapping from incoming messages to sessions. The sessionKey typically determines:

  • What runs serially (same sessionKey)
  • What can run in parallel (different sessionKeys)
  • How you avoid “cross-talk” (groups, multi-account, multi-channel)
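
As an illustration of the bullets above, the sketch below derives a key from routing context. The key format, the RoutingContext fields, and the buildSessionKey function are assumptions made for this example; this page does not specify how OpenClaw actually composes a sessionKey.

```typescript
// Hypothetical routing context; field names are illustrative only.
interface RoutingContext {
  channel: string;
  accountId: string;
  peerId: string;          // the other party in a direct message
  groupId: string | null;  // set for group conversations
  threadId: string | null; // set when the platform has threads
}

// Assumed key layout: channel/account/scope[/thread].
// Including channel and account keeps multi-channel and multi-account
// traffic apart; the scope keeps groups apart from direct messages.
function buildSessionKey(ctx: RoutingContext): string {
  const enc = encodeURIComponent; // keeps "/" inside ids from breaking the layout
  const scope =
    ctx.groupId !== null ? `group:${enc(ctx.groupId)}` : `dm:${enc(ctx.peerId)}`;
  const parts = [enc(ctx.channel), enc(ctx.accountId), scope];
  if (ctx.threadId !== null) {
    parts.push(`thread:${enc(ctx.threadId)}`);
  }
  return parts.join("/");
}
```

With this layout, two people posting in the same group share one key (and are therefore serialized), while the same person writing through two different accounts gets two keys.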

Agent (execution plane: run/attempt)

A typical execution looks like:

  1. An incoming message is routed to a sessionKey
  2. It enters the lane/queue (serial per session, parallel across sessions)
  3. A run starts, potentially with multiple attempts (retry/failover)
  4. Tool calls and streaming events happen during the run
  5. The final reply is delivered via the channel
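
Step 2 (serial per session, parallel across sessions) can be sketched with one promise chain per sessionKey. This is a common generic technique, shown here under the assumption that work items are async functions; the SessionLanes class is hypothetical and is not how OpenClaw's lane/queue is necessarily implemented.

```typescript
// Hypothetical per-key serial queue: tasks sharing a sessionKey run one
// after another; tasks with different keys are free to overlap.
class SessionLanes {
  // Tail of the promise chain for each active sessionKey.
  private readonly tails = new Map<string, Promise<void>>();

  enqueue<T>(sessionKey: string, task: () => Promise<T>): Promise<T> {
    const prev = this.tails.get(sessionKey) ?? Promise.resolve();
    // Start this task only after the previous one on the same key settles.
    const result = prev.then(task);
    // The tail swallows errors so one failed run does not block the lane,
    // then removes the entry if nothing else was queued behind it.
    const tail: Promise<void> = result
      .then(() => undefined, () => undefined)
      .then(() => {
        if (this.tails.get(sessionKey) === tail) {
          this.tails.delete(sessionKey);
        }
      });
    this.tails.set(sessionKey, tail);
    return result; // callers still see the task's own result or error
  }
}
```

The caller receives the original result or rejection, while the lane itself keeps moving after a failure. A real queue would likely add limits and cancellation, which this sketch leaves out.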

Tools (turn “say” into “do”)

Tools perform actions with side effects (system commands, browser automation, HTTP calls). What matters is controllability, meaning you can constrain and review what a tool call does, not just capability.
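
One way to picture controllability is a default-deny policy check that runs before any tool call. The sketch below is illustrative only: ToolPolicy, Decision, decideToolCall, and the tool names are assumptions for this example and do not reflect OpenClaw's real policy format.

```typescript
// Hypothetical policy shape; not OpenClaw's actual configuration schema.
type Decision = "allow" | "needs-approval" | "deny";

interface ToolPolicy {
  allowed: ReadonlySet<string>;         // tools that may run at all
  requireApproval: ReadonlySet<string>; // allowed tools that need a human OK
}

// Default deny: a tool that is not explicitly allowed never runs.
function decideToolCall(policy: ToolPolicy, toolName: string): Decision {
  if (!policy.allowed.has(toolName)) {
    return "deny";
  }
  return policy.requireApproval.has(toolName) ? "needs-approval" : "allow";
}

// Example policy with made-up tool names.
const examplePolicy: ToolPolicy = {
  allowed: new Set(["web", "exec"]),
  requireApproval: new Set(["exec"]),
};
```

Here a read-only web tool runs freely, a command-execution tool waits for approval, and anything unlisted is rejected.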

Providers (models + failover)

Providers turn prompts/context into model output, with controlled fallbacks for failure scenarios.
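
A controlled fallback can be as simple as trying providers in a fixed order and recording each attempt. The sketch below assumes a hypothetical Provider interface with a single complete method; the names Provider, AttemptRecord, and runWithFailover are illustrative and are not OpenClaw APIs.

```typescript
// Hypothetical provider abstraction, for illustration only.
interface Provider {
  name: string;
  complete(prompt: string): Promise<string>;
}

interface AttemptRecord {
  attempt: number;
  provider: string;
  ok: boolean;
  error: string | null;
}

// Try providers in the given order; stop at the first success.
// The order of the array is the "control": nothing outside it is tried.
async function runWithFailover(
  providers: Provider[],
  prompt: string,
): Promise<{ output: string; attempts: AttemptRecord[] }> {
  const attempts: AttemptRecord[] = [];
  let attempt = 0;
  for (const provider of providers) {
    attempt += 1;
    try {
      const output = await provider.complete(prompt);
      attempts.push({ attempt, provider: provider.name, ok: true, error: null });
      return { output, attempts };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      attempts.push({ attempt, provider: provider.name, ok: false, error: message });
    }
  }
  throw new Error(`all ${providers.length} providers failed`);
}
```

Keeping the attempt records alongside the output makes it possible to see afterwards which provider answered and why earlier ones were skipped.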

A complete message journey (make it observable)

Breaking the flow into observable steps helps troubleshooting:

  1. Ingress: a channel receives a message (or Webhook/Cron triggers)
  2. Routing: build a sessionKey from account/group/thread context
  3. Concurrency: lane/queue (serial per sessionKey)
  4. Execution: agent run with tools + streaming events
  5. Egress: the reply returns via the channel

A practical acceptance check: in the Control UI (and in the logs), you should be able to follow the same runId through each of these milestones (see logging).
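
The acceptance check can be expressed as a small function over collected events: given a runId, which milestones never showed up? The event shape, the stage names, and the assumption that every stage is tagged with the runId are all hypothetical here; map them to whatever your logs actually contain.

```typescript
// Hypothetical stage names mirroring the five steps listed above.
const MILESTONES = ["ingress", "routing", "concurrency", "execution", "egress"] as const;
type Milestone = (typeof MILESTONES)[number];

// Hypothetical event record; real log fields may differ.
interface TraceEvent {
  runId: string;
  stage: Milestone;
}

// Returns the milestones that have no event for the given runId,
// in pipeline order. An empty array means the run is fully traceable.
function missingMilestones(events: TraceEvent[], runId: string): Milestone[] {
  const seen = new Set<Milestone>();
  for (const event of events) {
    if (event.runId === runId) {
      seen.add(event.stage);
    }
  }
  return MILESTONES.filter((stage) => !seen.has(stage));
}
```

The first missing milestone tells you where to start looking: for example, a run with no egress event points at delivery rather than at routing.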

Security boundaries (layer risk controls)

Avoid treating security as “prompt-only”. Separate concerns by layer:

  • Ingress boundary: remote access, trusted proxy, token, pairing (who can connect)
  • Execution boundary: sandbox / tool policy / approvals (what tool calls can do)
  • Auditability: logs + diagnosis + minimal reproducible flows

Next steps

  • Want a verified first run: start at Quickstart
  • Want internals: go to Deep dive (Track A/B)
  • Want production: start from Gateway remote access + security baseline