HiWay2LLM analyzes every request in <1ms and routes it to the optimal model across your own API keys. Simple messages go to cheap models. Complex tasks go to powerful ones. You save 40-60% on typical mixes, at zero markup.
<1ms
Routing latency
9%
Minimum markup (Enterprise)
0
Prompts stored
200+
Models across LLM, image, video, audio & embeddings
How it fits together
HiWay2LLM sits between your code and the LLM providers. Your keys. Your data. Our routing intelligence.
200+ models · LLM · Image · Video · Audio · Embeddings - all via BYOK
From signup to your first routed request in under 2 minutes.
Create an account in 30 seconds. Email + password, free tier activates immediately - 2M tokens/month, no credit card.
Plug in your own keys for any supported provider - LLM (Anthropic, OpenAI, Google, Mistral, Groq…), image (Flux, Stability AI, fal.ai), video (Kling, Runway), audio (ElevenLabs) or embeddings (Cohere, Voyage AI). They stay encrypted on our side and you keep billing with your providers directly. Zero markup on inference.
Point your SDK's base_url at HiWay2LLM. One endpoint reaches every model you've enabled, and the router picks the cheapest model that can handle each request. OpenAI-compatible. Works with any SDK.
Point your existing code to HiWay2LLM. We handle the rest.
Haiku 4.5 / GPT-4o-mini / Gemini 2.5 Flash Lite
65% of requests
Sonnet 4.6 / GPT-4o / Gemini 2.5 Flash
28% of requests
Opus 4.7 / GPT-5 / Gemini 2.5 Pro
7% of requests
7 analyzers, burn-rate alerting, multi-provider optimization - and CORTEX, the AI that self-tunes your router while you ship.
7 analyzers detect intent, complexity, tools, and code in under a millisecond. No LLM call for routing - pure CPU.
Baseline every agent, detect prompt inflation, silent escalations to premium models and pricing drift. Alerts, rollback, per-agent budgets. Built for CTOs who want total control of their LLM spend.
We watch your spend in real time. Burn-rate thresholds, anomaly detection, and per-key alerts fire the moment something looks off - before your monthly bill does.
No LLM provider offers this. Set daily/monthly caps, per-model limits, off-hours rules, and automatic degradation.
Per-user CSV exports, daily breakdowns by model, token-level cost attribution. Plug it into your invoicing or your accounting in two clicks.
Bring your own keys from any provider - LLM (Anthropic, OpenAI, Google, Mistral, Groq, Together AI, Replicate…), image (Flux, Stability AI, fal.ai), video (Kling, Runway, Luma), audio (ElevenLabs, HeyGen), and embeddings (Cohere, Voyage AI). One API, all modalities.
Change your base_url. That's it. Compatible with any LLM SDK - OpenAI, Anthropic, LangChain, Vercel AI, n8n.
Your prompts never touch our disk. Architectural guarantee. GDPR and EU AI Act compliant.
Proactive AI that reads Guardian events, auto-tunes routing thresholds, and pushes insights to your CORTEX Inbox - so you see problems before your users do. Scale and Enterprise.
Two-tier scanner catches injection, jailbreaks, PII leaks, and secrets in under 2 ms, before they reach the model. Zero latency in monitor mode.
Prompt injection
Blocks "ignore all previous instructions", DAN mode, developer mode, and persona-override patterns.
Prompt extraction
Catches attempts to read your system prompt or internal instructions.
Jailbreak
Stops requests for malware, exploits, synthesis of controlled substances, and illegal content.
PII detection
Flags email addresses, phone numbers, IBANs, and tax identifiers before they reach the LLM, GDPR compliant.
Secret leakage
Catches API keys (OpenAI, Anthropic, GitHub PAT, Bearer tokens) accidentally pasted into prompts.
30-second CLI, OpenAI-compatible Python + TypeScript SDKs. Zero vendor lock-in - you can leave HiWay without touching a line of application code.
One-line install, signup from the terminal, first call without writing code. Perfect to kick the tires before integrating.
npm i -g @hiway2llm/cli hw signup hw chat "explain this in 3 bullets"
Drop-in import. Every method from the OpenAI SDK works - we just route to the right model.
pip install hiway2llm
from hiway2llm import Hiway
cli = Hiway(api_key="hw_live_...")
cli.chat("Say hi")Native fetch client, works in Node and Edge runtimes (Vercel, Cloudflare Workers).
npm i @hiway2llm/client
import { Hiway } from "@hiway2llm/client";
const h = new Hiway({ apiKey: "hw_live_..." });
await h.chat("Say hi");Keep your Anthropic key, pay Anthropic directly. HiWay measures consumption and bills a % markup on the actual routed cost - fully offset by routing savings.
Start free. Scale when you're ready.
No credit card required · Cancel anytime · Instant access
Routage intelligent − frais HiWay2LLM = gain net
Profil d'usage
Mix estimé : 40% Haiku · 50% Sonnet · 10% Opus
Économie nette / mois
+$501
soit +50% sur ta facture actuelle
Simulation indicative · basée sur le mix modèles typique de votre profil
Free
Pour tester et prototyper.
Scale
jusqu'à
−60%sur tes coûts IA réels · CORTEX route vers le modèle optimal
Dégressif : <$500 → 12,5% · $500-5K → 11% · $5K-20K → 10%
Enterprise
$20K-50K/mois → 9% · au-delà : sur-mesure négocié
Toutes les fonctionnalités core sont disponibles dès le premier pack. Les features avancées s'ouvrent avec Scale et Enterprise.
| Fonctionnalité | FreeRoutage de base · 10M/mois | ScaleMarkup 12,5 → 10% | EnterpriseSur devis |
|---|---|---|---|
| USAGE & QUOTAS | |||
| Tokens inclus | par pack acheté | 1B / achat | custom |
| Auto-reload | |||
| Sièges équipe | 3 | 25 | ∞ |
| Workspaces | 1 | 5 | ∞ |
| Conservation analytics | 30j | 1 an | ∞ |
| MOTEUR DE ROUTAGE | |||
| Smart routing (model=auto) | |||
| BYOK fournisseurs | |||
| 0 % marge sur l'inférence | |||
| Fallback automatique | |||
| Guardian anti-loop | |||
| CORTEX alertes Inbox | |||
| CONTRÔLES AVANCÉS | |||
| Cache sémantique | |||
| A/B testing modèles | |||
| Journal d'audit | |||
| CORTEX complet (5 phases) | |||
| SSO (Google, Microsoft) | |||
| Masquage PII | |||
| Self-hosted | |||
| Règles routage custom | |||
| SUPPORT & CONFORMITÉ | |||
| Canal de support | Priority | SLA 99.99% | |
| DPA (RGPD) | |||
| Financement disponible | |||
| Ingénieur dédié | |||
L'inférence est toujours facturée directement par vos fournisseurs LLM, sur vos propres clés. Les prix ci-dessus n'incluent pas l'inférence.
EVERY PLAN INCLUDES
BYOK - bring your own keys from any supported provider: LLM (Anthropic, OpenAI, Google, Mistral, Groq, Together AI, Replicate, Cohere…), image (Flux/BFL, Stability AI, fal.ai), video (Kling, Runway, Luma AI), audio (ElevenLabs, HeyGen). Inference is billed directly by your providers. HiWay only charges a % markup on the actual routed cost.
Your users send simple messages 70% of the time. Why pay Opus prices for a greeting?
Start freeHonest side-by-side. Updated 2026-04-22 against each vendor's public docs.
| Feature | HiWay2LLM | OpenRouter | Portkey | LiteLLM | Requesty |
|---|---|---|---|---|---|
| Bring your own keys (BYOK) | |||||
| Smart routing by request complexity | |||||
| OpenAI-compatible API | |||||
| Automatic fallback across providers | |||||
| Prompt caching (Anthropic / OpenAI) | |||||
| Per-workspace analytics + audit log | |||||
| Burn-rate alerts (budget spikes) | |||||
| EU hosting by default (GDPR) | self-host | ||||
| Zero prompt logging | |||||
| AI self-management (CORTEX) | |||||
| Pricing model | flat €/mo | % markup | flat + % markup | self-host / SaaS | % markup |
native · partial / plugin · not offered. We check these claims against each vendor's public docs - if you spot an inaccuracy, tell us.