Strict spend protection
Atomic pre-flight reservation. You cannot overdraft. Period.
The hard guarantee
Under no circumstance — single request, parallel burst, model misprice, anything — can a HiWay user spend more credits than they hold. Server-side, atomic, tested.
Why naive proxies overdraft
The obvious flow — check balance > 0, forward the request, deduct the actual cost on completion — is unsafe in two ways:
- Single request can overspend. A user with $0.001 of credits can send a Claude Opus request that costs $5. The check passes, the request fires, the deduction lands the balance at -$4.999.
- Parallel requests overdraft. Ten concurrent requests at balance = $1 all pass the balance > 0 check, all forward, all deduct. Final balance: a deeply negative number.
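Both failure modes can be reproduced with a minimal sketch of the naive flow. Everything here is hypothetical and for illustration only: the `NaiveLedger` class, the use of thousandths of a dollar as the unit, and the $0.50 cost per parallel request (the text above does not state one). A barrier forces every thread past the check before any deduction runs, which makes the race deterministic.

```python
import threading
from typing import Optional


class NaiveLedger:
    """Hypothetical in-memory ledger using the unsafe check-then-deduct flow."""

    def __init__(self, balance_milli: int) -> None:
        self.balance_milli = balance_milli  # thousandths of a dollar
        self._lock = threading.Lock()

    def naive_request(
        self, actual_cost_milli: int, barrier: Optional[threading.Barrier] = None
    ) -> bool:
        # Step 1: the only guard is "balance > 0".
        if self.balance_milli <= 0:
            return False
        # Step 2: the "provider call" happens here. The optional barrier holds
        # every thread at this point so that all checks finish before any deduction.
        if barrier is not None:
            barrier.wait()
        # Step 3: deduct the actual cost after the fact. The lock only protects
        # the subtraction itself; it does nothing about the gap after the check.
        with self._lock:
            self.balance_milli -= actual_cost_milli
        return True


def single_request_overspend() -> int:
    ledger = NaiveLedger(1)       # $0.001 of credits
    ledger.naive_request(5000)    # a $5 request passes the check
    return ledger.balance_milli


def parallel_overdraft(n: int = 10, cost_milli: int = 500) -> int:
    ledger = NaiveLedger(1000)    # $1 of credits
    barrier = threading.Barrier(n)
    threads = [
        threading.Thread(target=ledger.naive_request, args=(cost_milli, barrier))
        for _ in range(n)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ledger.balance_milli


if __name__ == "__main__":
    print(single_request_overspend())  # -4999, i.e. -$4.999
    print(parallel_overdraft())        # -4000, i.e. -$4.00 under the assumed cost
```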
How HiWay solves it
Every request goes through this server-side pipeline. The estimate, reservation, and balance check all complete before any provider call:
- Estimate the worst case. (input_chars / 3 + 50) × input_price/M + max_tokens × output_price/M, with max_tokens defaulted to 4096 if unset.
- Atomic reserve. A Lua script runs server-side in Redis: it reads the balance and decrements by the estimate in a single atomic operation. No race window.
- Reject on insufficient balance. If the reservation would land the balance below zero, the request is rejected with HTTP 402 — the upstream provider is never called.
- Forward. Now that the credits are locked, the request is sent to LiteLLM and on to the provider.
- Refund the difference. Once the actual cost is known from the response, HiWay credits back the unused portion (reservation − actual). The final balance reflects the real spend, to the cent.
- Refund on failure. If anything raises before the response — backend down, timeout, malformed payload — the full reservation is refunded.
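The pipeline above can be sketched in a few lines. This is an illustration under stated assumptions, not HiWay's implementation: `CreditLedger`, `handle_request`, and `InsufficientCredits` are hypothetical names, a `threading.Lock` stands in for the atomic Lua script in Redis, amounts are held as integer micro-dollars, and prices are assumed to be quoted per million tokens.

```python
import threading
from typing import Callable, Optional

DEFAULT_MAX_TOKENS = 4096  # default from the text when max_tokens is unset


def estimate_worst_case(
    input_chars: int,
    input_price_per_m: float,
    output_price_per_m: float,
    max_tokens: Optional[int] = None,
) -> float:
    """Worst-case cost in dollars: (chars / 3 + 50) input tokens plus max_tokens output tokens."""
    if max_tokens is None:
        max_tokens = DEFAULT_MAX_TOKENS
    input_tokens = input_chars / 3 + 50
    return (input_tokens * input_price_per_m + max_tokens * output_price_per_m) / 1_000_000


class InsufficientCredits(Exception):
    """Hypothetical error that the HTTP layer would map to a 402 response."""


class CreditLedger:
    """In-memory stand-in for the Redis balance; the lock plays the role of the Lua script."""

    def __init__(self, balance_micro: int) -> None:
        self._balance = balance_micro  # integer micro-dollars
        self._lock = threading.Lock()

    @property
    def balance(self) -> int:
        return self._balance

    def reserve(self, amount_micro: int) -> bool:
        # Read and decrement happen under one lock, so there is no race window.
        with self._lock:
            if self._balance - amount_micro < 0:
                return False
            self._balance -= amount_micro
            return True

    def refund(self, amount_micro: int) -> None:
        with self._lock:
            self._balance += amount_micro


def handle_request(
    ledger: CreditLedger, estimate_micro: int, call_provider: Callable[[], int]
) -> int:
    """Reserve, forward, then refund the unused part. call_provider returns the actual cost."""
    if not ledger.reserve(estimate_micro):
        raise InsufficientCredits("reservation would overdraw the balance")
    try:
        actual_micro = call_provider()
    except Exception:
        ledger.refund(estimate_micro)  # failure before a response: full refund
        raise
    # Assumption: the estimate is an upper bound, so the refund is clamped at zero.
    ledger.refund(max(estimate_micro - actual_micro, 0))
    return actual_micro
```

With illustrative prices of $15 and $75 per million input and output tokens, a 3,000-character prompt with `max_tokens` unset gives a reservation of about $0.323. The request is rejected outright if the balance is below that, even though the real cost is likely to be much lower.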
Tested under fire
The concurrency test fans out 50 parallel reservations against a $10 balance. At most 33 succeed (at about $0.30 per reservation, 33 × $0.30 = $9.90 ≤ $10), the rest are rejected, and the final balance is never negative. Try it yourself in tests/test_strict_spend.py.
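A test of this shape might look like the sketch below. It does not reproduce the contents of tests/test_strict_spend.py: the `Ledger` class is a hypothetical in-memory stand-in for the Redis-backed balance, and the $0.30 reservation size is an assumption chosen so that exactly 33 reservations fit into $10.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Tuple


class Ledger:
    """Hypothetical in-memory balance with an atomic check-and-decrement."""

    def __init__(self, balance_cents: int) -> None:
        self.balance_cents = balance_cents
        self._lock = threading.Lock()

    def reserve(self, amount_cents: int) -> bool:
        # The check and the decrement share one critical section.
        with self._lock:
            if self.balance_cents < amount_cents:
                return False
            self.balance_cents -= amount_cents
            return True


def run_burst(
    balance_cents: int = 1000, reservation_cents: int = 30, n: int = 50
) -> Tuple[int, int]:
    """Fire n reservations in parallel; return (number accepted, final balance)."""
    ledger = Ledger(balance_cents)
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = list(pool.map(lambda _: ledger.reserve(reservation_cents), range(n)))
    return sum(results), ledger.balance_cents


def test_parallel_reservations_never_overdraft() -> None:
    accepted, final_balance = run_burst()
    assert accepted == 33       # 33 * 30 = 990 cents fit; a 34th would not
    assert final_balance == 10  # 1000 - 990 cents left over
    assert final_balance >= 0
```

Integer cents are used so that the accepted count does not depend on floating-point rounding.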