Provider fallback

When a provider fails, HiWay retries against the next cheapest model in the same tier, up to 2 retries.

Providers fail. Sometimes it's a 503 from the upstream, sometimes a timeout, sometimes a content-policy rejection that another model would not raise. HiWay does not let a single transient provider error break your app: on failure, it automatically retries against the next cheapest model in the same tier, up to 2 retries. When a retry has run, the final response carries _hiway.fallback = true so you know the retry chain fired.
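A minimal client-side sketch of checking that flag. It assumes `_hiway` is a top-level object in the JSON response body, which this page does not spell out; the helper name `fallback_fired` and the sample bodies are hypothetical.

```python
from typing import Any


def fallback_fired(body: dict[str, Any]) -> bool:
    """Return True when the response reports that the retry chain ran."""
    # Assumption: `_hiway` is a top-level object in the parsed JSON body.
    meta = body.get("_hiway") or {}
    return meta.get("fallback") is True


# Hypothetical response bodies, for illustration only.
print(fallback_fired({"_hiway": {"fallback": True}}))  # True
print(fallback_fired({"choices": []}))                 # False
```

Logging this flag per request is a cheap way to see how often your primary model is failing over.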

Retry logic

  1. Primary routing decision (tier T, model M1).
  2. Call the provider for M1. If the call returns a 5xx, times out, or fails with a provider-specific retryable error, fall through to retry 1.
  3. Retry 1: next cheapest model in tier T across your enabled providers (M2). The same failure conditions apply.
  4. Retry 2: next cheapest after M2 (M3).
  5. If all three fail, return 502 with the last upstream error body and _hiway.fallback_chain listing the three attempts.
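The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not HiWay's actual implementation: `Attempt`, `is_retryable`, and `run_with_fallback` are hypothetical names, the candidate list is assumed to be pre-ordered (primary model first, then next cheapest in the same tier), a timeout is represented as a 504, and provider-specific retryable errors are not modeled.

```python
from dataclasses import dataclass
from typing import Callable

MAX_ATTEMPTS = 3  # primary call plus up to 2 retries


@dataclass
class Attempt:
    model: str
    status: int  # HTTP-style status; a timeout is represented as 504 here
    body: str


def is_retryable(status: int) -> bool:
    # Simplification: every 5xx is retryable. Provider-specific retryable
    # errors would need their own mapping.
    return 500 <= status <= 599


def run_with_fallback(candidates: list[str], call: Callable[[str], Attempt]) -> dict:
    """Try candidates in order and stop at the first non-retryable result."""
    if not candidates:
        raise ValueError("at least one candidate model is required")
    chain: list[Attempt] = []
    for model in candidates[:MAX_ATTEMPTS]:
        attempt = call(model)
        chain.append(attempt)
        if not is_retryable(attempt.status):
            # Success or a non-retryable error (e.g. 4xx): surface it as-is.
            return {
                "status": attempt.status,
                "body": attempt.body,
                "_hiway": {"fallback": len(chain) > 1},
            }
    # Every attempt failed: 502 with the last upstream body and the chain.
    return {
        "status": 502,
        "body": chain[-1].body,
        "_hiway": {
            "fallback": len(chain) > 1,
            "fallback_chain": [a.model for a in chain],
        },
    }


def fake_call(model: str) -> Attempt:
    # Hypothetical upstream: the primary model is down, the second one works.
    statuses = {"m1": 503, "m2": 200, "m3": 200}
    return Attempt(model, statuses[model], f"response from {model}")


result = run_with_fallback(["m1", "m2", "m3"], fake_call)
print(result["status"], result["_hiway"])  # 200 {'fallback': True}
```

Capping the loop with `candidates[:MAX_ATTEMPTS]` keeps the worst case at three upstream calls regardless of how many models the tier contains.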

What doesn't trigger fallback

  • 4xx errors from the upstream (bad request, auth) — those are your problem, not a provider failure.
  • Content-policy rejections that are deterministic (same prompt would fail on any model).
  • Requests pinned to a specific model (e.g. model="openai/gpt-4o") — we respect the pin and surface the error.
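Taken together with the retry conditions, these exclusions amount to a small decision function. The sketch below is one way to write it; `should_fallback` and its parameters are hypothetical, and how HiWay decides that a content-policy rejection is deterministic is not described here, so that is passed in as a flag.

```python
from typing import Optional


def should_fallback(
    status: Optional[int],
    timed_out: bool = False,
    pinned_model: Optional[str] = None,
    deterministic_policy_rejection: bool = False,
) -> bool:
    """Decide whether a failed upstream call should move to the next model."""
    if pinned_model is not None:
        return False  # the pin is respected and the error is surfaced
    if deterministic_policy_rejection:
        return False  # the same prompt would fail on any model
    if timed_out:
        return True
    if status is None:
        return False
    # 4xx (bad request, auth) falls through to False; only 5xx retries.
    return 500 <= status <= 599


print(should_fallback(503))                                # True
print(should_fallback(401))                                # False
print(should_fallback(503, pinned_model="openai/gpt-4o"))  # False
print(should_fallback(None, timed_out=True))               # True
```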

Fallback counts as one request

Even if fallback retries against two additional models, the whole chain counts as a single request against your monthly quota. On the upstream side, you pay only the BYOK cost of the attempts that actually ran.
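As a rough illustration of that accounting, the sketch below treats one request with several attempts as one quota unit plus the sum of the per-attempt upstream costs. The function name, field names, and cost figures are made up for the example; whether a failed attempt is billed at all depends on the upstream provider.

```python
def account_for_request(attempt_costs_microusd: list[int]) -> dict:
    """Summarize one HiWay request, however many upstream attempts it took."""
    return {
        "quota_requests": 1,  # the whole fallback chain is one request
        "upstream_cost_microusd": sum(attempt_costs_microusd),
    }


# Hypothetical chain: two failed attempts billed at 0, one success at 2100.
usage = account_for_request([0, 0, 2100])
print(usage)  # {'quota_requests': 1, 'upstream_cost_microusd': 2100}
```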