Learn how smart routing works, integrate any chat-completions client in 2 minutes, and master Budget Controls, auto-recharge and the Guardian system.
Start with the quickstartFrom signup to your first routed request in under 2 minutes.
Python, TypeScript and a one-liner CLI. MIT-licensed. Use HiWay without touching the dashboard.
OpenAI, Anthropic, LangChain, Vercel AI SDK, n8n - change one line.
How to sign requests with your hw_live_ key.
Seven analyzers, sub-millisecond decision, no LLM call.
BYOK degressive markup. Your providers bill you directly. 9-12.5% on Scale.
Per-workspace rules to block runaway agents, duplicate traffic and cost spikes.
Cap your monthly upstream BYOK spend. Verdict: BLOCK, DOWNGRADE or LIGHT_ONLY.
When a provider fails, HiWay retries against the cheapest same-tier model. Max 2 retries.
Skip identical and near-identical requests entirely - zero tokens, instant replies.
We add `cache_control` breakpoints to your Anthropic requests automatically. ~10x cheaper input on cache hits, zero config.
Opt-in. Regex on email / phone / card / IBAN / API keys before cache hashing.
Wallet hits 0? Service keeps running BYOK-direct for 72h / 100k tokens, then a soft stop until you top up.
Run N variants of a request in parallel across models. Compare cost, latency, quality.
OpenAI-compatible body + _hiway metadata + X-HiWay-Routed-* headers.
How HiWay forwards Server-Sent Events end-to-end.
HiWay is tool-calling transparent across every supported provider.
Why the system prompt affects which tier your request is routed to.
Per-workspace thresholds, alert webhooks, rule-level on/off.
Monthly USD cap on upstream BYOK cost.
vector-store-backed cache - available on all packs.
Regex masking before cache, embedding and (optionally) provider call.
Benchmark models on real traffic without writing glue code.
7-layer AI system that monitors, learns your optimal model mix, enforces sovereignty, and lets you chat about your own data.
Two-tier prompt security that blocks injection, jailbreaks, PII leaks, and secrets, under 2 ms, always on.
Per-workspace and per-key modes, custom thresholds, IP rules, and SIEM webhooks.
Detailed description of each threat type, example payloads, and tuning guidance.
How the Security Shield supports GDPR, SOC 2, and enterprise audit requirements.
Skip CORTEX routing for explicit model requests. Forward straight to the provider. Standard markup still applies.
Generate your first image, audio, or embedding in under 5 minutes.
T1 / T2 / T3 - cost vs quality tradeoffs, per modality.
fal.ai Flux, OpenAI DALL-E 3, Stability AI SD3 - via your own keys.
Async jobs, preview gate, fal.ai + Runway - up to 10 seconds.
Text-to-speech with 7-day cache. Speech-to-text up to 250 MB.
BYOK embeddings with 30-day Redis cache. OpenAI, Cohere, Voyage AI.
Drop-in: change one line, route through HiWay.
The TypeScript/JavaScript path to HiWay.
Point LangChain at HiWay with one base_url change.
Streaming chat UIs with HiWay as the provider.
Use HiWay as the LLM endpoint for every n8n AI node.
No SDK? No problem. HiWay speaks standard chat-completions.
Swap two lines, keep every other part of your integration.
Managed alternative to your self-hosted router - when the ops cost stops being worth it.
Stay on Vercel if you want. Just swap the upstream from Vercel's gateway to HiWay.
BYOK + smart routing, minus Portkey's guardrails and prompt library.
Keep your provider account, add a router - smart model selection without rewriting a line of logic.
The core routing endpoint.
Current workspace quota, plan and cumulative savings.
List the models available through your enabled providers.
Every HTTP status HiWay returns and what it means.
Subscribe to budget, Guardian, payment, key rotation, and account events. Signed payloads with retry.
When HiWay refuses a request because a cap was hit.
Your API key didn't authenticate. Here's how to fix it.
Who rate-limited you, and how to unblock.
When all fallback attempts failed.
The questions we get the most, answered in one place.
HiWay-specific terms explained.
What shipped, when.