System architecture

This page gives you a “global map” of OpenClaw so you can locate issues quickly when reading logs, events, and configuration.

Overview (ingress → control plane → execution)

OpenClaw is best understood as a “multi-ingress, single-kernel” runtime:

Ingress: Channels (WhatsApp/Telegram/Discord…) and automation entry points (Webhooks/Cron…)
Control plane: Gateway (WebSocket/HTTP APIs, auth, routing, state/event broadcast)
Execution plane: Agent (run/attempt lifecycle, lane/queue concurrency, streaming)
Capability layer: Tools (Browser/Exec/Web…) and Providers (models + failover)
Data layer: Sessions, Media, Config, logging/auditing

If you’ve just finished onboarding, these three entry points matter most:

Observe & interact: Control UI
Diagnose: doctor
Troubleshoot: troubleshooting

Key components

Gateway (control plane)

Gateway is the long-running core process. Typical responsibilities:

Manage channel/provider connections and their lifecycle
Expose WebSocket/HTTP APIs and broadcast events
Enforce auth boundaries (token / pairing / remote access)
Coordinate sessions & concurrency (sessionKey / lane / queue)

Channels (message ingress)

Channels are where messages enter the system. Although platforms differ in how they connect and in their message formats, every channel should converge on the same pipeline: ingress → routing → session.

Routing + sessionKey (how messages get “bucketed”)

The system needs a stable mapping from incoming messages to sessions. The sessionKey typically determines:

What runs serially (same sessionKey)
What can run in parallel (different sessionKeys)
How you avoid “cross-talk” (groups, multi-account, multi-channel)
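
To make the bucketing concrete, here is a minimal Python sketch of deriving a key from routing context. The `Inbound` fields, the `session_key` helper, and the colon-separated format are assumptions for illustration only; OpenClaw's actual key format may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Inbound:
    """Hypothetical routing context for one incoming message."""
    channel: str                  # e.g. "telegram"
    account: str                  # which account/bot received the message
    peer: str                     # direct-chat sender or group id
    thread: Optional[str] = None  # thread/topic inside a group, if any


def session_key(msg: Inbound) -> str:
    """Build a stable key: the same conversation always maps to the same key."""
    parts = [msg.channel, msg.account, msg.peer]
    if msg.thread is not None:
        parts.append(msg.thread)
    return ":".join(parts)
```

Under this sketch, two messages from the same group share a key and would run serially, while a different account, channel, or thread produces a different key and avoids cross-talk.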

Agent (execution plane: run/attempt)

A typical execution looks like:

An incoming message is routed to a sessionKey
It enters a lane/queue (serial per session, parallel across sessions)
A run starts, potentially with multiple attempts (retry/failover)
Tool calls and streaming events happen during the run
The final reply is delivered via the channel
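
The "serial per session, parallel across sessions" rule can be sketched with one lock per sessionKey. This is an illustration of the concurrency idea using Python's asyncio, not OpenClaw's scheduler; `SessionLanes`, `submit`, and `demo` are hypothetical names.

```python
import asyncio
from collections import defaultdict


class SessionLanes:
    """Hypothetical scheduler: one asyncio.Lock per sessionKey."""

    def __init__(self) -> None:
        self._locks: dict[str, asyncio.Lock] = defaultdict(asyncio.Lock)

    async def submit(self, session_key: str, job):
        # Jobs sharing a key wait for each other; other keys proceed freely.
        async with self._locks[session_key]:
            return await job()


async def demo() -> list[str]:
    lanes = SessionLanes()
    trace: list[str] = []

    def make_run(name: str):
        async def run() -> None:
            trace.append(f"start {name}")
            await asyncio.sleep(0.01)  # stand-in for model/tool work
            trace.append(f"end {name}")
        return run

    # "a" and "b" share session s1; "c" belongs to session s2.
    await asyncio.gather(
        lanes.submit("s1", make_run("a")),
        lanes.submit("s1", make_run("b")),
        lanes.submit("s2", make_run("c")),
    )
    return trace
```

Running `asyncio.run(demo())` shows run "b" starting only after run "a" has ended, while run "c" overlaps with "a" because it uses a different key.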

Tools (turn “say” into “do”)

Tools execute actions with side effects (system commands, browser automation, HTTP calls). What matters is controllability (what a tool call is allowed to do), not just capability.

Providers (models + failover)

Providers turn prompts/context into model output, with controlled fallbacks for failure scenarios.
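
As a rough illustration of a controlled fallback, the sketch below tries providers in a fixed order and returns the first successful result. `ProviderError`, `complete_with_failover`, and the two fake providers are hypothetical; real failover policy (which errors are retried, cooldowns, and so on) is not specified on this page.

```python
from typing import Callable, Sequence


class ProviderError(Exception):
    """Hypothetical error for a failed provider call (timeout, rate limit, ...)."""


def complete_with_failover(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try providers in order; return (provider_name, output) from the first success."""
    errors: list[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            # Record the failure and fall through to the next provider.
            errors.append(f"{name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))


def flaky(prompt: str) -> str:
    raise ProviderError("rate limited")


def stable(prompt: str) -> str:
    return f"echo: {prompt}"


used, output = complete_with_failover("hi", [("primary", flaky), ("backup", stable)])
```

Only `ProviderError` triggers the fallback here, so unexpected exceptions still surface instead of being silently masked by the next provider.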

A complete message journey (make it observable)

Breaking the flow into observable steps helps troubleshooting:

Ingress: a channel receives a message (or Webhook/Cron triggers)
Routing: build a sessionKey from account/group/thread context
Concurrency: lane/queue (serial per sessionKey)
Execution: agent run with tools + streaming events
Egress: the reply returns via the channel

A practical acceptance check: in Control UI (and in the logs), you should be able to follow the same runId through each of these milestones (see logging).
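
One way to automate that check is to filter structured log events by runId and report which milestones never appeared. The event shape below (dicts with `runId` and `stage` fields) and the stage names are assumptions for this sketch, not OpenClaw's actual log schema.

```python
# Pipeline order taken from the steps above; the stage labels are hypothetical.
MILESTONES = ["ingress", "routing", "concurrency", "execution", "egress"]


def missing_milestones(events: list[dict], run_id: str) -> list[str]:
    """Return the milestones that never appear for run_id, in pipeline order."""
    seen = {e.get("stage") for e in events if e.get("runId") == run_id}
    return [stage for stage in MILESTONES if stage not in seen]


sample_events = [
    {"runId": "r1", "stage": "ingress"},
    {"runId": "r1", "stage": "routing"},
    {"runId": "r2", "stage": "ingress"},
    {"runId": "r1", "stage": "concurrency"},
    {"runId": "r1", "stage": "execution"},
]
gaps = missing_milestones(sample_events, "r1")
```

For the sample data, `gaps` contains only `"egress"`, which would point the investigation at reply delivery rather than at routing or execution.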

Security boundaries (layer risk controls)

Avoid treating security as “prompt-only”. Separate concerns by layer:

Ingress boundary: remote access, trusted proxy, token, pairing (who can connect)
Execution boundary: sandbox / tool policy / approvals (what tool calls can do)
Auditability: logs + diagnosis + minimal reproducible flows
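
The execution boundary can be pictured as a default-deny policy lookup that runs before any tool call. The `Decision` values, the `decide` helper, and the policy mapping are hypothetical and only illustrate the layering; they are not OpenClaw configuration.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"  # run without asking
    ASK = "ask"      # pause until a human approves
    DENY = "deny"    # never run


def decide(tool: str, policy: dict[str, Decision]) -> Decision:
    """Default-deny: a tool that the policy does not list is refused."""
    return policy.get(tool, Decision.DENY)


# Example policy: read-only web access is allowed, command execution needs approval.
example_policy = {"web": Decision.ALLOW, "exec": Decision.ASK}
```

Because the check lives outside the prompt, a model that is persuaded to call an unlisted tool is still refused at this layer.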

Next steps

Want a verified first run: start at Quickstart
Want internals: go to Deep dive (Track A/B)
Want production: start from Gateway remote access + security baseline
