Provider fallback
When a provider fails, HiWay retries against the cheapest same-tier model. Max 2 retries.
Providers fail. Sometimes it's a 503 from the upstream, sometimes a timeout, sometimes a content-policy rejection. HiWay does not let a single transient provider error break your app: on failure, it auto-retries against the next cheapest model in the same tier, up to 2 retries. The final response carries _hiway.fallback = true so you know the retry chain fired.
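A minimal sketch of how a client might check whether the retry chain fired. It assumes the response body is JSON and that _hiway is a top-level object inside it; the text above confirms only the _hiway.fallback field itself. The helper name fallback_fired is hypothetical.

```python
import json


def fallback_fired(response_body: str) -> bool:
    """Return True when the response reports that fallback ran.

    Assumption: `_hiway` is a top-level JSON object in the response body.
    """
    payload = json.loads(response_body)
    meta = payload.get("_hiway") or {}
    return meta.get("fallback") is True


# Hypothetical response bodies, for illustration only.
print(fallback_fired('{"_hiway": {"fallback": true}}'))   # True
print(fallback_fired('{"choices": []}'))                  # False
```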
Retry logic
- Primary routing decision (tier T, model M1).
- Call the provider for M1. If it returns a 5xx, times out, or returns a provider-specific retryable error, fall through to retry 1.
- Retry 1: next cheapest model in tier T across your enabled providers (M2). The same failure check applies.
- Retry 2: next cheapest after M2 (M3).
- If all three fail, return 502 with the last upstream error body and _hiway.fallback_chain listing the three attempts.
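The steps above can be sketched as a small loop. This is an illustration under stated assumptions, not HiWay's actual implementation: the names Attempt, Result, and route_with_fallback are hypothetical, the candidate list is assumed to arrive already ordered (M1 first, then the next cheapest same-tier models), a timeout is represented by status 0, and provider-specific retryable errors are not modeled.

```python
from dataclasses import dataclass, field
from typing import Callable

MAX_RETRIES = 2  # from the text: one primary attempt plus up to 2 retries


@dataclass
class Attempt:
    model: str
    status: int      # HTTP status; 0 stands in for a timeout in this sketch
    body: str = ""


@dataclass
class Result:
    status: int
    body: str
    fallback: bool                      # mirrors _hiway.fallback
    fallback_chain: list[Attempt] = field(default_factory=list)


def is_retryable(status: int) -> bool:
    # 5xx or timeout; provider-specific retryable errors are omitted here.
    return status == 0 or 500 <= status <= 599


def route_with_fallback(
    candidates: list[str], call: Callable[[str], Attempt]
) -> Result:
    """Try M1, then up to MAX_RETRIES cheaper-next models from `candidates`."""
    if not candidates:
        raise ValueError("at least one candidate model is required")
    chain: list[Attempt] = []
    for model in candidates[: MAX_RETRIES + 1]:
        attempt = call(model)
        chain.append(attempt)
        if not is_retryable(attempt.status):
            # Success, or an error that must be surfaced as-is (e.g. 4xx).
            return Result(attempt.status, attempt.body, len(chain) > 1, chain)
    # Every attempt failed: 502 with the last upstream error body.
    return Result(502, chain[-1].body, len(chain) > 1, chain)
```

With a hypothetical call function that returns 503 for the first two models and 200 for the third, the result has status 200, fallback set to True, and a three-entry chain.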
What doesn't trigger fallback
- 4xx errors from the upstream (bad request, auth) — those are your problem, not a provider failure.
- Content-policy rejections that are deterministic (same prompt would fail on any model).
- Requests pinned to a specific model (e.g. model="openai/gpt-4o"): we respect the pin and surface the error.
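These exclusions can be read as a decision function. The sketch below is a hedged illustration: should_fallback and its parameters are hypothetical names, and how HiWay actually detects a deterministic content-policy rejection is not described in this page, so that check is passed in as a flag.

```python
from typing import Optional


def should_fallback(
    status: Optional[int],
    *,
    timed_out: bool = False,
    pinned_model: Optional[str] = None,
    deterministic_policy_rejection: bool = False,
) -> bool:
    """Decide whether a failed attempt should move to the next model."""
    if pinned_model is not None:
        return False  # the pin is respected and the error is surfaced
    if deterministic_policy_rejection:
        return False  # the same prompt would fail on any model
    if timed_out:
        return True
    if status is None:
        return False
    if 400 <= status < 500:
        return False  # bad request, auth: not a provider failure
    return 500 <= status < 600


print(should_fallback(503))                                      # True
print(should_fallback(401))                                      # False
print(should_fallback(503, pinned_model="openai/gpt-4o"))        # False
```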
Fallback counts as one request
Even if fallback retries against two additional models, it still counts as a single request against your monthly quota. You only pay the upstream BYOK cost of the attempts that actually ran.
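As a rough illustration of the accounting rule, the sketch below treats one logical request as one quota unit and sums whatever each attempt cost upstream. The function name, field names, and dollar figures are hypothetical; real per-attempt costs depend on your provider, and a failed attempt may cost nothing.

```python
from decimal import Decimal


def usage_for_request(attempt_costs_usd: list[Decimal]) -> dict:
    """One quota unit per request, regardless of how many attempts ran."""
    return {
        "quota_requests": 1,
        "upstream_cost_usd": sum(attempt_costs_usd, Decimal("0")),
    }


# Hypothetical: two failed attempts billed at zero, one success at $0.0021.
usage = usage_for_request([Decimal("0"), Decimal("0"), Decimal("0.0021")])
print(usage["quota_requests"])      # 1
print(usage["upstream_cost_usd"])   # 0.0021
```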