Passthrough mode & grace cap
Wallet hits 0? Service keeps running BYOK-direct for 72h / 100k tokens, then a soft stop until you top up.
When your HiWay wallet balance reaches zero, we don't cut your traffic. Routing automatically switches to passthrough mode: requests go straight to your BYOK provider (no markup, no smart routing, no semantic cache). You keep shipping; your wallet simply stays at zero until your next top-up.
Why we built this
Stripe top-ups take a few minutes to settle. Aggregator outages or expired cards shouldn't cut your production traffic mid-day. Passthrough is a graceful degradation layer: it kicks in automatically when the wallet hits zero and reverts automatically on top-up.
What changes in passthrough
- Requests are billed by your BYOK provider directly (as always).
- HiWay markup = 0% during passthrough.
- Smart routing is suspended: your request uses the exact model you ask for, with no auto-downgrade.
- Semantic cache is suspended (writes paused, reads still served).
- Observability, audit logs, and Security Shield keep running.
The grace cap (72h or 100k tokens)
Passthrough isn't unlimited. Each passthrough cycle is capped at 72 hours OR 100,000 tokens consumed, whichever comes first. Beyond that, HiWay returns HTTP 402 on new requests until you top up. The cap resets the moment your wallet is credited again.
| Trigger | Email sent | Behavior |
|---|---|---|
| Wallet hits 0 | Passthrough activated | Service continues, markup = 0% |
| 50% of cap (36h OR 50k tokens) | Warning email | Same, heads-up only |
| 100% of cap (72h OR 100k tokens) | Hard-cut email | New requests return HTTP 402 |
| Wallet topped up | Passthrough deactivated | Back to normal mode, cap cleared |
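
The thresholds in the table can be modelled client-side with a few lines of Python. This is a minimal sketch for reasoning about where a cycle stands, not HiWay's implementation: the function names are hypothetical, and treating a value exactly at a threshold as having reached it is an assumption.

```python
CAP_HOURS = 72.0        # time limit per passthrough cycle (from the table above)
CAP_TOKENS = 100_000    # token limit per passthrough cycle (from the table above)
WARN_FRACTION = 0.5     # the warning email goes out at 50% of either limit


def cap_usage(elapsed_hours: float, tokens_consumed: int) -> float:
    """Fraction of the cap used: the larger of the time and token fractions."""
    return max(elapsed_hours / CAP_HOURS, tokens_consumed / CAP_TOKENS)


def grace_state(elapsed_hours: float, tokens_consumed: int) -> str:
    """Hypothetical helper: map usage to 'passthrough', 'warning', or 'hard_cut'."""
    usage = cap_usage(elapsed_hours, tokens_consumed)
    if usage >= 1.0:          # assumption: reaching the limit counts as exceeding it
        return "hard_cut"
    if usage >= WARN_FRACTION:
        return "warning"
    return "passthrough"
```

Because the cap is "whichever comes first", taking the maximum of the two fractions is enough: a cycle at 10 hours but 60,000 tokens is already in the warning band.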
When you hit the cap
The error body distinguishes between the three refusal reasons so your client can react appropriately:
    {
      "error": {
        "type": "payment_required",
        "code": "grace_period_exceeded_time",
        "message": "Grace period exceeded. Please top up your wallet.",
        "passthrough_since": "2026-05-26T08:00:00Z",
        "elapsed_hours": 74.2,
        "tokens_consumed": 65000,
        "topup_url": "https://app.hiway2llm.com/wallet"
      }
    }

code is one of grace_period_exceeded_time, grace_period_exceeded_tokens, or grace_period_hard_cut (the cap was already exceeded on a previous request).
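
A client can branch on the code field instead of parsing the message. The sketch below assumes only the field names shown in the example body; the function name and the human-readable reason strings are hypothetical.

```python
import json

# The three refusal codes documented above, mapped to illustrative descriptions.
GRACE_CODES = {
    "grace_period_exceeded_time": "time limit reached",
    "grace_period_exceeded_tokens": "token limit reached",
    "grace_period_hard_cut": "cap was already exceeded on an earlier request",
}


def parse_grace_refusal(status: int, body: str):
    """Return (reason, topup_url) for a grace-cap 402, or None for anything else."""
    if status != 402:
        return None
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return None
    code = error.get("code")
    if not isinstance(code, str) or code not in GRACE_CODES:
        return None  # a 402 with a code this sketch does not recognise
    return GRACE_CODES[code], error.get("topup_url")
```

Since the cap only resets when the wallet is credited, retrying the request after one of these codes is unlikely to help; surfacing topup_url to whoever manages billing is the more useful reaction.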
Admin override available
Enterprise customers can request a temporary cap extension from support, useful for known payment delays or month-end batch jobs. Overrides are audit-logged and time-boxed.
How to monitor
Dashboard → Wallet shows your current state. If you're in passthrough, it displays how much of the cap you've consumed (hours + tokens) and the projected hard-cut time. You can wire your own webhook on the account.low_balance event to be notified outside the dashboard.
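
A webhook receiver for this event could be as small as the sketch below. The payload shape is not specified here, so the "type" field used to identify the event is an assumption, as are the function and parameter names; check the actual webhook payload before relying on it.

```python
import json


def handle_webhook(raw_body: bytes, notify) -> bool:
    """Call notify(event) for account.low_balance events; return True if handled.

    Assumption: the event name is carried in a top-level "type" field.
    """
    try:
        event = json.loads(raw_body)
    except json.JSONDecodeError:
        return False
    if not isinstance(event, dict) or event.get("type") != "account.low_balance":
        return False
    notify(event)  # e.g. page the on-call engineer or post to a chat channel
    return True
```

Keeping the handler a pure function of the request body makes it easy to mount behind any HTTP framework and to test without a running server.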