Prompt Caching（提示缓存）

提示缓存（prompt caching）允许提供商复用“未变化的提示前缀”（通常是 system/developer 指令与稳定上下文），从而降低成本与延迟。使用统计里常见的体现是 cacheWrite（首次写入）与 cacheRead（后续复用）。

主要开关

在模型参数中配置：

agents:
  defaults:
    models:
      "anthropic/claude-opus-4-6":
        params:
          cacheRetention: "short" # none | short | long

按智能体覆盖：

agents:
  list:
    - id: "alerts"
      params:
        cacheRetention: "none"

旧 TTL 值可能仍被兼容并映射（例如 5m → short，1h → long）。新配置建议使用 cacheRetention。

agents:
  defaults:
    contextPruning:
      mode: "cache-ttl"
      ttl: "1h"

Heartbeat 可以帮助保持缓存窗口“温热”：

agents:
  defaults:
    heartbeat:
      every: "55m"