Strict spend protection

Atomic pre-flight reservation. You cannot overdraft. Period.

The hard guarantee

Under no circumstances (single request, parallel burst, model misprice, anything) can a HiWay user spend more credits than they hold. The check is server-side, atomic, and tested.

Why naive proxies overdraft

The obvious flow — check balance > 0, forward the request, deduct the actual cost on completion — is unsafe in two ways:

  • A single request can overspend. A user with $0.001 of credits can send a Claude Opus request that costs $5. The check passes, the request fires, and the deduction lands the balance at -$4.999.
  • Parallel requests overdraft. Ten concurrent requests at balance = $1 all pass the > 0 check, all forward, all deduct. Final balance: $1 minus the sum of all ten costs, which is negative as soon as the requests total more than $1.
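
To make the two failure modes concrete, here is a minimal sketch that replays the naive flow in its worst-case interleaving, where every balance check runs before any deduction lands. The function name naive_flow, the integer "mills" unit (1 mill = $0.001), and the $0.50 per-request cost in the burst are assumptions for illustration, not part of HiWay.

```python
# Deterministic replay of the naive check-then-deduct flow.
# Amounts are integer mills (1 mill = $0.001) to avoid float rounding.
# naive_flow and the $0.50 burst cost are illustrative assumptions.

def naive_flow(balance_mills: int, costs_mills: list[int]) -> int:
    """Worst-case interleaving: all checks happen before any deduction."""
    # Phase 1: every in-flight request sees the same starting balance.
    admitted = [cost for cost in costs_mills if balance_mills > 0]
    # Phase 2: each admitted request deducts its actual cost on completion.
    for cost in admitted:
        balance_mills -= cost
    return balance_mills


# One $5 request against $0.001 of credits.
single = naive_flow(balance_mills=1, costs_mills=[5000])

# Ten concurrent $0.50 requests against a $1 balance.
burst = naive_flow(balance_mills=1000, costs_mills=[500] * 10)

print(single, burst)
```

The first result corresponds to the -$4.999 case above. The second ends at -$4.00 under the assumed $0.50 cost per request.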

How HiWay solves it

Every request goes through this server-side pipeline before any provider call:

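  1. Estimate the worst case. The reservation is (input_chars / 3 + 50) × input_price/M + max_tokens × output_price/M, where input_chars / 3 + 50 is the estimated input token count, input_price and output_price are prices per million tokens (so /M divides by 1,000,000), and max_tokens defaults to 4096 if unset.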
  2. Atomic reserve. A Lua script runs server-side in Redis: it reads the balance and decrements it by the estimate in a single atomic operation, so no other request can change the balance between the read and the write. No race window.
  3. Reject on insufficient balance. If the reservation would land the balance below zero, the request is rejected with HTTP 402 — the upstream provider is never called.
  4. Forward. Now that the credits are locked, the request is sent to LiteLLM and on to the provider.
  5. Refund the difference. Once the actual cost is known from the response, HiWay credits back the unused portion (reservation − actual). The final balance reflects the real spend, to the cent.
  6. Refund on failure. If anything raises before the response — backend down, timeout, malformed payload — the full reservation is refunded.
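
The six steps above can be sketched in a few lines. This is an in-memory illustration only, not HiWay's implementation: a threading.Lock stands in for the atomic Redis Lua script, and the names estimate_worst_case, Ledger, and handle_request, the micro-USD integer unit, the rounding up of input_chars / 3, and the $15 / $75 per-million prices are all assumptions made for the example.

```python
import threading

DEFAULT_MAX_TOKENS = 4096  # default from step 1


def estimate_worst_case(input_chars, input_price_per_m, output_price_per_m,
                        max_tokens=None):
    """Worst-case cost in micro-USD (tokens x USD-per-million = micro-USD)."""
    if max_tokens is None:
        max_tokens = DEFAULT_MAX_TOKENS
    # Rounding input_chars / 3 up is an assumption; the formula does not say.
    input_tokens = (input_chars + 2) // 3 + 50
    return input_tokens * input_price_per_m + max_tokens * output_price_per_m


class Ledger:
    """In-memory stand-in for the Redis balance; the lock mimics atomicity."""

    def __init__(self, balance):
        self._balance = balance
        self._lock = threading.Lock()

    def reserve(self, amount):
        # Check and decrement happen under one lock, so nothing interleaves.
        with self._lock:
            if self._balance < amount:
                return False
            self._balance -= amount
            return True

    def refund(self, amount):
        with self._lock:
            self._balance += amount

    @property
    def balance(self):
        with self._lock:
            return self._balance


def handle_request(ledger, estimate, call_provider):
    """Reserve, forward, then refund the unused part (or all of it on error)."""
    if not ledger.reserve(estimate):
        return 402, None  # rejected before any provider call
    try:
        response, actual_cost = call_provider()
    except Exception:
        ledger.refund(estimate)  # step 6: full refund on failure
        raise
    # Step 5: assumes actual_cost <= estimate, since the estimate is worst case.
    ledger.refund(estimate - actual_cost)
    return 200, response


# Hypothetical prices: $15 input and $75 output per million tokens.
estimate = estimate_worst_case(3000, 15, 75)

ledger = Ledger(1_000_000)  # $1.00 in micro-USD
status_ok, _ = handle_request(ledger, estimate, lambda: ("ok", 120_000))
balance_after_success = ledger.balance


def failing_provider():
    raise TimeoutError("backend down")


try:
    handle_request(ledger, estimate, failing_provider)
except TimeoutError:
    pass
balance_after_failure = ledger.balance

poor = Ledger(1_000)  # $0.001 in micro-USD
status_rejected, _ = handle_request(poor, estimate, lambda: ("ok", 120_000))
```

With these assumed numbers the reservation is 322,950 micro-USD (about $0.32). The successful call ends up charged only its actual 120,000, the failed call leaves the balance unchanged, and the $0.001 account is rejected with 402 without its provider callable ever running.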

Tested under fire

The concurrency test fans out 50 parallel reservations against a $10 balance. At most 33 succeed (assuming reservations of about $0.30 each, 33 × $0.30 = $9.90 ≤ $10, while a 34th would need $10.20), the rest are rejected, and the final balance is never negative. Try it yourself in tests/test_strict_spend.py.
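
For readers who want to see the shape of such a test without the repository at hand, the following is a rough sketch. It is not the contents of tests/test_strict_spend.py: the Ledger class is an in-memory stand-in that uses a lock instead of Redis, run_burst is a hypothetical helper, and the $0.30 reservation size is an assumed value chosen so that 33 reservations fit in $10.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Ledger:
    """In-memory stand-in for the atomic balance (a lock instead of Redis)."""

    def __init__(self, balance_micro_usd):
        self._balance = balance_micro_usd
        self._lock = threading.Lock()

    def reserve(self, amount):
        # Check and decrement under one lock: no window between them.
        with self._lock:
            if self._balance < amount:
                return False
            self._balance -= amount
            return True

    @property
    def balance(self):
        with self._lock:
            return self._balance


def run_burst(balance, reservation, n_requests):
    """Fire n_requests reservations in parallel; return (successes, balance)."""
    ledger = Ledger(balance)
    with ThreadPoolExecutor(max_workers=n_requests) as pool:
        results = list(pool.map(lambda _: ledger.reserve(reservation),
                                range(n_requests)))
    return sum(results), ledger.balance


# $10 balance, assumed $0.30 per reservation, 50 parallel requests.
succeeded, remaining = run_burst(10_000_000, 300_000, 50)
print(succeeded, remaining)
```

Because every check-and-decrement runs under the lock, the count of successful reservations is the same on every run regardless of thread scheduling, and the remaining balance cannot go below zero.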