opencode-model-fallback

OpenCode plugin that adds automatic model fallback when your primary model hits a rate limit or quota. Instead of waiting in a retry loop, it immediately switches to the next healthy model in a configured chain — per-agent, with a health state machine that tracks recovery.

How it works

Preemptive redirect: intercepts outgoing messages via the chat.message hook. If the target model is known to be rate-limited, the message is redirected to a healthy fallback before it reaches the provider (no 429 round-trip).
Reactive fallback: if a 429 still occurs (the first hit, or when no preemptive redirect was possible), the plugin listens for session.status: retry events, aborts the retry loop, reverts the failed message, and replays it with the next healthy fallback model.
Notification: shows an inline toast and logs the event.
Health tracking: model health is tracked globally, since rate limits are account-wide, and models recover automatically after configurable cooldown periods.
Depth reset: when the TUI reverts to the original model between messages, fallbackDepth resets, so maxFallbackDepth only guards against true cascading failures within a single message.
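
The depth-reset rule can be pictured with a small counter. This is a minimal sketch, not the plugin's actual code: `SessionState`, `onOutgoingMessage`, and `tryEnterFallback` are hypothetical names, and only `fallbackDepth` and `maxFallbackDepth` come from this README.

```typescript
// Hypothetical per-session bookkeeping for the depth guard.
interface SessionState {
  originalModel: string;
  fallbackDepth: number;
}

// Assumed to run for every outgoing message.
function onOutgoingMessage(state: SessionState, targetModel: string): void {
  // The TUI reverted to the original model, so a new cascade starts at zero.
  if (targetModel === state.originalModel) {
    state.fallbackDepth = 0;
  }
}

// Assumed to run before each fallback attempt; returns false once the cap is reached.
function tryEnterFallback(state: SessionState, maxFallbackDepth: number): boolean {
  if (state.fallbackDepth >= maxFallbackDepth) return false;
  state.fallbackDepth += 1;
  return true;
}
```

With this shape, the cap only trips when several fallbacks fail in a row for the same message; a message sent to the original model starts the count over.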

Installation

Add to the plugin array in your ~/.config/opencode/opencode.jsonc:

{
  "plugin": [
    // ... existing plugins
    "@smart-coders-hq/opencode-model-fallback",
  ],
}

Or load locally during development:

{
  "plugin": ["file:///path/to/opencode-model-fallback/dist/index.js"],
}

Then create a config file (see Configuration).

Configuration

Place model-fallback.json at either:

.opencode/model-fallback.json — project-local
~/.config/opencode/model-fallback.json — global

{
  "enabled": true,
  "defaults": {
    "fallbackOn": [
      "rate_limit",
      "quota_exceeded",
      "5xx",
      "timeout",
      "overloaded"
    ],
    "cooldownMs": 300000,
    "retryOriginalAfterMs": 900000,
    "maxFallbackDepth": 3
  },
  "agents": {
    "*": {
      "fallbackModels": [
        "anthropic/claude-sonnet-4-20250514",
        "google/gemini-flash-2-5"
      ]
    }
  },
  "patterns": [
    "rate limit",
    "usage limit",
    "too many requests",
    "quota exceeded",
    "overloaded",
    "capacity exceeded",
    "credits exhausted",
    "billing limit",
    "429"
  ],
  "logging": true,
  "logLevel": "info",
  "logPath": "~/.local/share/opencode/logs/model-fallback.log"
}

All config fields

Field	Type	Default	Description
`enabled`	boolean	`true`	Enable/disable the plugin
`defaults.fallbackOn`	string[]	all categories	Error categories that trigger fallback
`defaults.cooldownMs`	number	`300000` (5 min)	How long before a rate-limited model enters cooldown. Min: 10000
`defaults.retryOriginalAfterMs`	number	`900000` (15 min)	How long before a cooldown model is considered healthy again. Min: 10000
`defaults.maxFallbackDepth`	number	`3`	Maximum number of cascading fallbacks within a single message (depth resets when the TUI reverts to the original model). Max: 10
`agents`	object	`{"*": {}}`	Per-agent fallback chains (see below)
`patterns`	string[]	see defaults	Case-insensitive substrings to match in retry messages
`logging`	boolean	`true`	Write structured logs to `logPath`
`logLevel`	string	`"info"`	Minimum log level written to file: `"info"` suppresses debug noise, `"debug"` logs every event (useful for incident investigation)
`logPath`	string	`~/.local/share/opencode/logs/model-fallback.log`	Log file path (must be within `$HOME`)
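
The defaults and bounds in the table can be written as a type plus a normalization step. This is a sketch under assumptions: the table states two minimums and one maximum but not whether out-of-range values are clamped or rejected, so the clamping below is a guess. `FallbackDefaults`, `DEFAULTS`, and `normalizeDefaults` are hypothetical names.

```typescript
// Shape of the "defaults" object as described in the table above.
interface FallbackDefaults {
  fallbackOn: string[];
  cooldownMs: number;
  retryOriginalAfterMs: number;
  maxFallbackDepth: number;
}

// Default values taken from the table.
const DEFAULTS: FallbackDefaults = {
  fallbackOn: ["rate_limit", "quota_exceeded", "5xx", "timeout", "overloaded"],
  cooldownMs: 300_000,
  retryOriginalAfterMs: 900_000,
  maxFallbackDepth: 3,
};

// Assumption: out-of-range values are clamped rather than rejected.
function normalizeDefaults(input: Partial<FallbackDefaults>): FallbackDefaults {
  const merged = { ...DEFAULTS, ...input };
  return {
    ...merged,
    cooldownMs: Math.max(10_000, merged.cooldownMs),
    retryOriginalAfterMs: Math.max(10_000, merged.retryOriginalAfterMs),
    maxFallbackDepth: Math.min(10, merged.maxFallbackDepth),
  };
}
```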

Error categories

rate_limit — 429, "rate limit", "too many requests", "usage limit"
quota_exceeded — "quota exceeded", "credits exhausted", "billing limit"
overloaded — "overloaded", "capacity exceeded"
timeout — "timeout", "timed out"
5xx — 500/502/503/504, "internal server error", "bad gateway"
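
The category list above amounts to a substring lookup. The following is a deliberately naive sketch of that idea, not the plugin's matcher: `classifyError` and `CATEGORY_SUBSTRINGS` are hypothetical names, and plain substring matching on status codes such as "500" can produce false positives that a real implementation would likely guard against.

```typescript
type ErrorCategory = "rate_limit" | "quota_exceeded" | "overloaded" | "timeout" | "5xx";

// Substrings per category, copied from the list above; checked in this order.
const CATEGORY_SUBSTRINGS: Array<[ErrorCategory, string[]]> = [
  ["rate_limit", ["429", "rate limit", "too many requests", "usage limit"]],
  ["quota_exceeded", ["quota exceeded", "credits exhausted", "billing limit"]],
  ["overloaded", ["overloaded", "capacity exceeded"]],
  ["timeout", ["timeout", "timed out"]],
  ["5xx", ["500", "502", "503", "504", "internal server error", "bad gateway"]],
];

// Returns the first matching category, or null when nothing matches.
function classifyError(message: string): ErrorCategory | null {
  const text = message.toLowerCase();
  for (const [category, needles] of CATEGORY_SUBSTRINGS) {
    if (needles.some((needle) => text.includes(needle))) return category;
  }
  return null;
}
```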

Per-agent chains

Configure different fallback chains for different agents using the agent name as the key. The "*" wildcard is used for any agent without a specific entry.

{
  "agents": {
    "build": {
      "fallbackModels": [
        "anthropic/claude-sonnet-4-20250514",
        "google/gemini-3-pro",
        "openai/gpt-4o"
      ]
    },
    "coder": {
      "fallbackModels": ["anthropic/claude-sonnet-4-20250514"]
    },
    "*": {
      "fallbackModels": [
        "anthropic/claude-sonnet-4-20250514",
        "google/gemini-flash-2-5"
      ]
    }
  }
}

Models are tried in order. Rate-limited models are skipped; cooldown models are used as a last resort.
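
The lookup and ordering rules above can be sketched as two small functions. This is an illustration under assumptions, not the plugin's source: `resolveChain`, `nextModel`, and the `healthOf` callback are hypothetical names, and the health states are the three named in this README.

```typescript
type HealthState = "healthy" | "cooldown" | "rate_limited";

interface AgentConfig {
  fallbackModels: string[];
}

// Use the agent's own entry if present, otherwise the "*" wildcard.
function resolveChain(agents: Record<string, AgentConfig>, agent: string): string[] {
  return agents[agent]?.fallbackModels ?? agents["*"]?.fallbackModels ?? [];
}

// First healthy model in order; a cooldown model only as a last resort;
// rate-limited models are never chosen.
function nextModel(
  chain: string[],
  healthOf: (model: string) => HealthState,
): string | undefined {
  return (
    chain.find((model) => healthOf(model) === "healthy") ??
    chain.find((model) => healthOf(model) === "cooldown")
  );
}
```

Returning `undefined` here corresponds to the case where every model in the chain is rate-limited.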

Migrating from opencode-rate-limit-fallback

If you have an existing rate-limit-fallback.json config, the plugin auto-migrates it on load — no manual steps needed.

Old format:

{
  "fallbackModel": "anthropic/claude-opus-4-5",
  "cooldownMs": 300000,
  "patterns": ["rate limit"],
  "logging": true
}

Automatically converted to:

{
  "agents": { "*": { "fallbackModels": ["anthropic/claude-opus-4-5"] } },
  "defaults": { "cooldownMs": 300000 },
  "patterns": ["rate limit"],
  "logging": true
}

The plugin checks both rate-limit-fallback.json and model-fallback.json — old configs are found and migrated automatically.
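
The conversion shown above is mechanical enough to sketch. The function below reproduces the old-to-new mapping from the example; it is an illustration only, and `OldConfig`, `MigratedConfig`, and `migrateOldConfig` are hypothetical names rather than the plugin's internals. How the plugin handles a missing `fallbackModel` is not documented here, so the empty chain is an assumption.

```typescript
// Fields of the old rate-limit-fallback.json format shown above.
interface OldConfig {
  fallbackModel?: string;
  cooldownMs?: number;
  patterns?: string[];
  logging?: boolean;
}

// Subset of the new format that the old fields map onto.
interface MigratedConfig {
  agents: Record<string, { fallbackModels: string[] }>;
  defaults?: { cooldownMs: number };
  patterns?: string[];
  logging?: boolean;
}

function migrateOldConfig(old: OldConfig): MigratedConfig {
  const migrated: MigratedConfig = {
    // The single fallback model becomes a one-element wildcard chain.
    // Assumption: a missing fallbackModel yields an empty chain.
    agents: { "*": { fallbackModels: old.fallbackModel ? [old.fallbackModel] : [] } },
  };
  if (old.cooldownMs !== undefined) migrated.defaults = { cooldownMs: old.cooldownMs };
  if (old.patterns !== undefined) migrated.patterns = old.patterns;
  if (old.logging !== undefined) migrated.logging = old.logging;
  return migrated;
}
```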

`/fallback-status` command

Run /fallback-status in any OpenCode session to see:

Current session's fallback depth and history
Health state of all tracked models (healthy / cooldown / rate_limited) with time remaining
Which agent is active

With the verbose flag:

/fallback-status verbose:true

Verbose output adds a token and cost breakdown for each model period.

Health state machine

healthy ──[rate limit detected]──→ rate_limited
rate_limited ──[cooldownMs elapsed]──→ cooldown
cooldown ──[retryOriginalAfterMs elapsed]──→ healthy

healthy — model is usable; preferred for fallback selection
rate_limited — recently hit a limit; skipped when walking fallback chain
cooldown — cooling off; used as last resort if no healthy model is available
State transitions are checked every 30 seconds via a background timer
When the original model recovers to healthy, a toast appears on the next session.idle
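
The transitions above can be sketched as a pure function that a periodic timer would call. This is a sketch under stated assumptions, not the plugin's implementation: `ModelHealth`, `markRateLimited`, and `tick` are hypothetical names, and it assumes `retryOriginalAfterMs` is measured from the moment the model entered cooldown, which this README does not spell out.

```typescript
type HealthState = "healthy" | "cooldown" | "rate_limited";

interface ModelHealth {
  state: HealthState;
  since: number; // epoch milliseconds when the current state was entered
}

// healthy -> rate_limited, triggered when a rate limit is detected.
function markRateLimited(now: number): ModelHealth {
  return { state: "rate_limited", since: now };
}

// Time-based transitions, intended to be evaluated on each timer tick.
function tick(
  health: ModelHealth,
  now: number,
  cooldownMs: number,
  retryOriginalAfterMs: number,
): ModelHealth {
  const elapsed = now - health.since;
  if (health.state === "rate_limited" && elapsed >= cooldownMs) {
    return { state: "cooldown", since: now };
  }
  if (health.state === "cooldown" && elapsed >= retryOriginalAfterMs) {
    return { state: "healthy", since: now };
  }
  return health; // no transition due yet
}
```

Because transitions are only evaluated on a tick, a model can stay in a state slightly longer than the configured duration, by up to one timer interval.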

Troubleshooting

Toast doesn't appear: The TUI notification requires an active OpenCode TUI session. Headless/API usage won't show toasts, but logs are always written.

"no fallback chain configured": Your model-fallback.json has no agents["*"].fallbackModels (or no entry for the active agent). Add at least a wildcard entry with one model.

"all fallback models exhausted": All configured fallback models are currently rate-limited. Wait for cooldownMs to elapse or add more models to the chain.

"max fallback depth reached": The session has hit maxFallbackDepth cascading fallbacks within a single message (all models failing in sequence). Depth resets automatically when the TUI reverts to the original model between messages, so this typically indicates that all configured models are rate-limited simultaneously. Start a new session or increase maxFallbackDepth in config.

Check the logs:

tail -f ~/.local/share/opencode/logs/model-fallback.log | jq .

Key log events: plugin.init, retry.detected, fallback.success, fallback.exhausted, health.transition, recovery.available

To see the full event stream (including event.received and retry.nomatch), set "logLevel": "debug" in your config and restart OpenCode.

Release automation

Uses Conventional Commits + semantic-release for automated versioning/changelog/release notes
CI runs lint, tests, type check, and build on every push/PR via .github/workflows/ci.yml
Release workflow runs on main after successful CI via .github/workflows/release.yml
Published as @smart-coders-hq/opencode-model-fallback
To publish to npm, set repository secret NPM_TOKEN

Development

bun install
bun run lint          # lint checks
bun test              # 145 tests across 11 files
bunx tsc --noEmit     # type check
bun run build         # build to dist/

Load locally in OpenCode:

{ "plugin": ["file:///absolute/path/to/dist/index.js"] }

Config for testing: place model-fallback.json in .opencode/ in your project directory.
