📉 Cost Optimization

How to Reduce Your OpenClaw API Costs

Intermediate | 1-2 hours | Updated 2025-01-20

By default, OpenClaw sends every request to Claude Opus, the most expensive model, which drives API costs up unnecessarily. This guide shows you how to implement model routing, configure token budgets, tune compaction settings, and add automation guardrails so that you can significantly reduce your API costs without sacrificing quality.

Why This Is Hard to Do Yourself

These are the common pitfalls that trip people up.

🔥

"Opus for everything"

By default, OpenClaw sends every request to Claude Opus, the most expensive model, even for simple tasks.

📊

No usage visibility

No built-in dashboard shows which skills or conversations are burning tokens.

🔄

Hidden compaction costs

Long conversations trigger automatic compaction, which spends expensive model calls just to summarize context.

🤖

Runaway automations

A single misconfigured automation loop can burn through $500+ in tokens overnight.

Step-by-Step Guide

Step 1

Audit your current token usage

Identify where tokens are going.

# Check your API provider dashboard for:
# - Total tokens per day/week/month
# - Breakdown by model (Opus vs Sonnet vs Haiku)
# - Top conversations by token count

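# OpenClaw logs token usage per request. List the 20 largest of the last
# 100 entries (assumes the token count is the third comma-separated field):
tail -n 100 ~/.openclaw/logs/tokens.log | sort -t',' -k3,3nr | head -n 20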
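
The one-liner above ranks individual requests. To see which conversations cost the most overall, you can total the log per conversation. The following is a minimal sketch, not OpenClaw's own tooling: it assumes each log line is `timestamp,conversation_id,tokens` (only the third field being a token count is implied by the command above), and `top_conversations` is a hypothetical helper name.

```python
import csv
from collections import Counter
from pathlib import Path

# Log location taken from the shell command above. The column layout
# (timestamp, conversation_id, tokens) is an assumption; adjust the
# indexes below if your log differs.
LOG_PATH = Path.home() / ".openclaw" / "logs" / "tokens.log"


def top_conversations(lines, limit=20):
    """Sum tokens per conversation and return the heaviest ones first."""
    totals = Counter()
    for row in csv.reader(lines):
        if len(row) < 3:
            continue  # skip blank or malformed lines
        try:
            totals[row[1].strip()] += int(row[2])
        except ValueError:
            continue  # skip header rows or non-numeric token fields
    return totals.most_common(limit)


if __name__ == "__main__" and LOG_PATH.exists():
    with LOG_PATH.open(newline="") as handle:
        for conversation, tokens in top_conversations(handle):
            print(f"{tokens:>12,}  {conversation}")
```

Because the function accepts any iterable of lines, you can also feed it a list of strings to try it out before pointing it at a real log.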
Step 2

Set up OpenRouter for model routing

Route simple tasks to cheaper models.

# In .env:
OPENROUTER_API_KEY=sk-or-...

# In config/models.yaml:
routing:
  default: anthropic/claude-3.5-sonnet
  complex_tasks: anthropic/claude-3-opus
  simple_tasks: anthropic/claude-3-haiku
  rules:
    - pattern: "summarize|format|translate"
      model: anthropic/claude-3-haiku
    - pattern: "analyze|reason|code-review"
      model: anthropic/claude-3-opus
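
To get a feel for how pattern rules like these behave, here is a small sketch of one plausible evaluation strategy. It assumes the first matching rule wins, that patterns are case-insensitive regular expressions searched against the request text, and that unmatched requests fall back to `default`. This guide does not spell out OpenClaw's actual matching behavior, so treat `pick_model` as a hypothetical illustration.

```python
import re

# Mirrors the routing block above. Rule order and matching semantics
# are assumptions made for this illustration.
ROUTING = {
    "default": "anthropic/claude-3.5-sonnet",
    "rules": [
        {"pattern": "summarize|format|translate", "model": "anthropic/claude-3-haiku"},
        {"pattern": "analyze|reason|code-review", "model": "anthropic/claude-3-opus"},
    ],
}


def pick_model(request_text, routing=ROUTING):
    """Return the model of the first rule whose pattern matches, else the default."""
    for rule in routing["rules"]:
        if re.search(rule["pattern"], request_text, flags=re.IGNORECASE):
            return rule["model"]
    return routing["default"]
```

Under these assumptions, unanchored patterns match substrings: "format" also matches "information", and "reason" matches "reasonable". If your routing behaves the same way, wrapping a pattern in word boundaries, for example `\b(summarize|format|translate)\b`, avoids routing requests to the wrong model by accident.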
Step 3

Configure token budgets

Set per-conversation and global limits.

# In config/budgets.yaml:
global:
  daily_limit: 500000  # tokens
  monthly_limit: 10000000
per_conversation:
  max_tokens: 50000
  warning_at: 40000
per_skill:
  web-scraper: 100000
  code-review: 200000
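
The sketch below shows how the per-conversation thresholds could be interpreted, and how a token budget translates into a rough dollar ceiling. The class, the status labels, and `budget_cost_usd` are hypothetical names for illustration, and the price you pass in is your own assumption: check your provider's current rates.

```python
from dataclasses import dataclass


@dataclass
class ConversationBudget:
    """Per-conversation limits, mirroring the values in budgets.yaml above."""

    max_tokens: int = 50_000
    warning_at: int = 40_000

    def status(self, tokens_used: int) -> str:
        """Classify usage as 'ok', 'warning', or 'blocked' (hypothetical labels)."""
        if tokens_used >= self.max_tokens:
            return "blocked"
        if tokens_used >= self.warning_at:
            return "warning"
        return "ok"


def budget_cost_usd(token_limit: int, usd_per_million_tokens: float) -> float:
    """Rough cost ceiling if the whole budget is spent at a single price."""
    return token_limit / 1_000_000 * usd_per_million_tokens
```

For example, `budget_cost_usd(500_000, 15.0)` gives 7.5, so a 500,000-token daily limit billed entirely at an assumed $15 per million tokens caps spend at about $7.50 per day. Running this arithmetic for your most expensive model is a quick way to check that your token limits and your dollar alerts agree with each other.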
Step 4

Tune compaction settings

Reduce expensive context summarization.

# In config/compaction.yaml:
compaction:
  trigger_threshold: 0.8  # Start compaction at 80% of the context window (default: 0.6)
  model: anthropic/claude-3-haiku  # Use cheap model for compaction
  max_summary_tokens: 2000
  preserve_recent_messages: 10
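
To see what raising the threshold buys you, it helps to convert it into a token count. This sketch assumes `trigger_threshold` is a fraction of the model's context window, as the comment above suggests, and uses a 200,000-token window as an example value. Both function names are hypothetical.

```python
def compaction_trigger_tokens(context_window: int, trigger_threshold: float) -> int:
    """Token count at which compaction would start, under the assumed interpretation."""
    if not 0 < trigger_threshold <= 1:
        raise ValueError("trigger_threshold must be in (0, 1]")
    return round(context_window * trigger_threshold)


def should_compact(tokens_in_context: int, context_window: int,
                   trigger_threshold: float = 0.8) -> bool:
    """True once the conversation has reached the compaction trigger point."""
    return tokens_in_context >= compaction_trigger_tokens(context_window, trigger_threshold)
```

With a 200,000-token window, the default of 0.6 starts compaction at 120,000 tokens, while 0.8 waits until 160,000. A conversation sitting at 150,000 tokens would therefore already have been compacted under the default but not under the tuned setting.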
Step 5

Add automation guardrails

Prevent runaway loops.

# In config/automations.yaml:
guardrails:
  max_iterations: 50
  max_tokens_per_run: 100000
  timeout_minutes: 30
  require_confirmation_above: 10000  # tokens

Warning: Without guardrails, a single automation error can generate thousands of API calls. Always set limits before enabling any automation.
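
If you write your own automation scripts around OpenClaw, you can apply the same three limits in code. The following is a minimal sketch of the idea, not OpenClaw's implementation: `step` is a hypothetical callable that performs one iteration and returns `(done, tokens_used)`, and the defaults mirror the values above.

```python
import time


class GuardrailExceeded(RuntimeError):
    """Raised when an automation run crosses one of its limits."""


def run_with_guardrails(step, max_iterations=50, max_tokens_per_run=100_000,
                        timeout_minutes=30, clock=time.monotonic):
    """Call step() until it reports completion or a limit is hit.

    step is a hypothetical callable returning (done, tokens_used) for one
    iteration. Returns the total tokens spent on a successful run.
    """
    deadline = clock() + timeout_minutes * 60
    total_tokens = 0
    for iteration in range(1, max_iterations + 1):
        if clock() > deadline:
            raise GuardrailExceeded(f"timed out after {timeout_minutes} minutes")
        done, tokens_used = step()
        total_tokens += tokens_used
        if total_tokens > max_tokens_per_run:
            raise GuardrailExceeded(f"token budget exceeded at iteration {iteration}")
        if done:
            return total_tokens
    raise GuardrailExceeded(f"no completion after {max_iterations} iterations")
```

Note that the token check runs after each step, so a run can overshoot the budget by at most one iteration's worth of tokens before it stops. The timeout is only checked between steps, so it will not interrupt a single call that hangs.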

Step 6

Set up cost alerts

Get notified before bills spike.

# In config/alerts.yaml:
alerts:
  - type: daily_spend
    threshold: 20.00  # USD
    channel: email
  - type: hourly_spike
    threshold: 500  # % above normal
    channel: slack_webhook
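
The `hourly_spike` threshold is easy to misread: 500% above normal means six times the usual spend, not five. The sketch below makes the arithmetic explicit. It assumes "normal" is the mean spend of the preceding hours, which this guide does not define, and `is_hourly_spike` is a hypothetical name.

```python
from statistics import mean


def is_hourly_spike(current_hour_usd, previous_hours_usd, threshold_percent=500):
    """True when the current hour exceeds the baseline by more than threshold_percent."""
    if not previous_hours_usd:
        return False  # no baseline yet, so nothing to compare against
    baseline = mean(previous_hours_usd)
    if baseline <= 0:
        return current_hour_usd > 0  # any spend is a spike against a zero baseline
    increase_percent = (current_hour_usd - baseline) / baseline * 100
    return increase_percent > threshold_percent
```

With a baseline of $0.50 per hour, an hour costing $3.50 is 600% above normal and triggers the alert, while an hour costing $2.00 is only 300% above normal and does not.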

Stop Burning Money on API Calls

Our cost optimization experts configure model routing, budgets, compaction, and monitoring to significantly reduce your API spend.

Get matched with a specialist who can help.

Sign Up for Expert Help →

Frequently Asked Questions