Now Live· 200+ Models · LLM · Image · Video · Audio · BYOK

Use the best model.
Pay for the cheapest.

HiWay2LLM analyzes every request in <1ms and routes it to the optimal model across your own API keys. Simple messages go to cheap models. Complex tasks go to powerful ones. You save 40-60% on typical mixes, at zero markup.

<1ms

Routing latency

9%

Minimum markup (Enterprise)

0

Prompts stored

200+

Models across LLM, image, video, audio & embeddings

How it fits together

One thin layer between your app and the models

HiWay2LLM sits between your code and the LLM providers. Your keys. Your data. Our routing intelligence.

Customer chatbot
Autonomous agent
RAG pipeline
CLI / script
1. request
4. response
Routing layer
HiWay2LLM
Smart routing
Picks the cheapest capable model per request.
BYOK vault
Your provider keys, AES-GCM encrypted per workspace.
0% markup
Providers bill you directly. We take nothing on inference.
Guardian
Anti-loop + burn-rate kill-switch before a bad call ships.
Sub-millisecond routing
< 1 ms
2. routed
3. stream
AnthropicBYOK
OpenAIBYOK
GoogleBYOK
MistralBYOK
GroqBYOK
xAIBYOK
40-60%
typical savings vs always-flagship
0%
markup on inference - ever
< 1 ms
routing decision latency
10+
providers supported, OpenAI-compatible API

200+ models · LLM · Image · Video · Audio · Embeddings - all via BYOK

OpenAI
Anthropic
Google
Mistral
DeepSeek
Groq
xAI
Perplexity
Meta Llama
Cohere
Together AI
Azure OpenAI
Stability AI
BFL / Flux
fal.ai
Kling
Runway
Luma AI
ElevenLabs
HeyGen
Voyage AI
Fireworks
Replicate
OpenAI
Anthropic
Google
Mistral
DeepSeek
Groq
xAI
Perplexity
Meta Llama
Cohere
Together AI
Azure OpenAI
Stability AI
BFL / Flux
fal.ai
Kling
Runway
Luma AI
ElevenLabs
HeyGen
Voyage AI
Fireworks
Replicate

Get started in 3 steps

From signup to your first routed request in under 2 minutes.

1

Sign up

Create an account in 30 seconds. Email + password, free tier activates immediately - 2M tokens/month, no credit card.

@
G
f
2

Add your provider keys

Plug in your own keys for any supported provider - LLM (Anthropic, OpenAI, Google, Mistral, Groq…), image (Flux, Stability AI, fal.ai), video (Kling, Runway), audio (ElevenLabs) or embeddings (Cohere, Voyage AI). They stay encrypted on our side and you keep billing with your providers directly. Zero markup on inference.

Apr 12$100
Apr 8$25
3

Change one line. Ship.

Point your SDK's base_url at HiWay2LLM. One endpoint reaches every model you've enabled, and the router picks the cheapest model that can handle each request. OpenAI-compatible. Works with any SDK.

HIWAY_API_KEY
••••••••••••••

Change one line. Save 50%.

Point your existing code to HiWay2LLM. We handle the rest.

app.py
from openai import OpenAI
client = OpenAI(base_url="https://api.anthropic.com/v1")
client = OpenAI(base_url="https://app.hiway2llm.com/v1")
# That's it. Same code. 50% cheaper.

Light

Haiku 4.5 / GPT-4o-mini / Gemini 2.5 Flash Lite

65% of requests

Standard

Sonnet 4.6 / GPT-4o / Gemini 2.5 Flash

28% of requests

Heavy

Opus 4.7 / GPT-5 / Gemini 2.5 Pro

7% of requests

Not just routing. Intelligence.

7 analyzers, burn-rate alerting, multi-provider optimization - and CORTEX, the AI that self-tunes your router while you ship.

< 1ms Smart Routing

7 analyzers detect intent, complexity, tools, and code in under a millisecond. No LLM call for routing - pure CPU.

Control Layer - Anti-drift

Baseline every agent, detect prompt inflation, silent escalations to premium models and pricing drift. Alerts, rollback, per-agent budgets. Built for CTOs who want total control of their LLM spend.

Burn-rate Alerting

We watch your spend in real time. Burn-rate thresholds, anomaly detection, and per-key alerts fire the moment something looks off - before your monthly bill does.

Advanced Budget Controls

No LLM provider offers this. Set daily/monthly caps, per-model limits, off-hours rules, and automatic degradation.

Usage Reporting

Per-user CSV exports, daily breakdowns by model, token-level cost attribution. Plug it into your invoicing or your accounting in two clicks.

200+ Models, Every Modality

Bring your own keys from any provider - LLM (Anthropic, OpenAI, Google, Mistral, Groq, Together AI, Replicate…), image (Flux, Stability AI, fal.ai), video (Kling, Runway, Luma), audio (ElevenLabs, HeyGen), and embeddings (Cohere, Voyage AI). One API, all modalities.

1 Line Integration

Change your base_url. That's it. Compatible with any LLM SDK - OpenAI, Anthropic, LangChain, Vercel AI, n8n.

Zero Prompt Logging

Your prompts never touch our disk. Architectural guarantee. GDPR and EU AI Act compliant.

CORTEX AI Orchestrator

Proactive AI that reads Guardian events, auto-tunes routing thresholds, and pushes insights to your CORTEX Inbox - so you see problems before your users do. Scale and Enterprise.

Enterprise Security

Prompt security built in.

Two-tier scanner catches injection, jailbreaks, PII leaks, and secrets in under 2 ms, before they reach the model. Zero latency in monitor mode.

Prompt injection

Blocks "ignore all previous instructions", DAN mode, developer mode, and persona-override patterns.

Prompt extraction

Catches attempts to read your system prompt or internal instructions.

Jailbreak

Stops requests for malware, exploits, synthesis of controlled substances, and illegal content.

PII detection

Flags email addresses, phone numbers, IBANs, and tax identifiers before they reach the LLM, GDPR compliant.

Secret leakage

Catches API keys (OpenAI, Anthropic, GitHub PAT, Bearer tokens) accidentally pasted into prompts.

<2ms
Tier-1 scan latency
5
Threat types
100%
Uptime guarantee
SOC 2
Ready audit trail
Tier-1 regex scan < 2 ms, always on
Tier-2 LLM Guard NLP (optional, lazy-loaded)
Immutable audit trail (tamper-proof DB trigger)
SIEM webhook export (Splunk, Datadog, custom)
Read the Security Shield docs
Open source · MIT

Ship with an SDK. Today.

30-second CLI, OpenAI-compatible Python + TypeScript SDKs. Zero vendor lock-in - you can leave HiWay without touching a line of application code.

Recommended

CLI

One-line install, signup from the terminal, first call without writing code. Perfect to kick the tires before integrating.

npm i -g @hiway2llm/cli
hw signup
hw chat "explain this in 3 bullets"

Python

Drop-in import. Every method from the OpenAI SDK works - we just route to the right model.

pip install hiway2llm

from hiway2llm import Hiway
cli = Hiway(api_key="hw_live_...")
cli.chat("Say hi")

TypeScript

Native fetch client, works in Node and Edge runtimes (Vercel, Cloudflare Workers).

npm i @hiway2llm/client

import { Hiway } from "@hiway2llm/client";
const h = new Hiway({ apiKey: "hw_live_..." });
await h.chat("Say hi");

Simple plans. Your keys, our brain.

Keep your Anthropic key, pay Anthropic directly. HiWay measures consumption and bills a % markup on the actual routed cost - fully offset by routing savings.

Start free. Scale when you're ready.

No credit card required · Cancel anytime · Instant access

Estimez votre économie réelle

Routage intelligent − frais HiWay2LLM = gain net

Budget API mensuel
$1kScale
$100$50k+

Profil d'usage

Mix estimé : 40% Haiku · 50% Sonnet · 10% Opus

Économie nette / mois

+$501

soit +50% sur ta facture actuelle

Avant HiWay2LLM$1k / mois
Économies smart routing$550
Markup HiWay2LLM (11% du routé)+$49
Total après HiWay2LLM$499 / mois
Projection 12 mois+$6.0k économisés
Démarrer gratuitement

Simulation indicative · basée sur le mix modèles typique de votre profil

Free

Gratuit

Pour tester et prototyper.

Routage intelligent (toutes sources)
Dashboard analytics basique
1 clé API
Zéro journalisation des prompts
Guardian anti-dérive
CORTEX Orchestrateur IA
Contrôles budgétaires
Cache sémantique
Masquage PII
Démarrer gratuitement
Populaire

Scale

jusqu'à

−60%

sur tes coûts IA réels · CORTEX route vers le modèle optimal

Smart routing LLMbon modèle au bon moment
−30 à −60%
Cache sémantiquetokens évités
−10 à −20%
Guardian anti-dériverequêtes inutiles bloquées
−5 à −15%
Markup HiWay2LLM+10 à 12,5%

Dégressif : <$500 → 12,5% · $500-5K → 11% · $5K-20K → 10%

Tout FREE inclus
Guardian anti-dérive avancé
CORTEX Orchestrateur IA
Contrôles budgétaires avancés
Cache sémantique
Masquage PII
Sessions agents multi-tenant
Rapports d'usage exportables (CSV)
Support prioritaire
Démarrer

Enterprise

Sur mesure

$20K-50K/mois → 9% · au-delà : sur-mesure négocié

VolumeNégocié
SLA dédiéInclus
Contrat annuelPossible
Tout Scale inclus
Markup négocié selon volume
SLA dédié & uptime garanti
Contrat annuel possible
Support dédié (Slack privé)
Intégrations sur mesure
Nous contacter
Ta clé Anthropic, tu paies Anthropic directement
HiWay2LLM mesure la conso et facture le markup
Wallet vide = passthrough, service continu
Résiliation immédiate

Ce qui est inclus

Toutes les fonctionnalités core sont disponibles dès le premier pack. Les features avancées s'ouvrent avec Scale et Enterprise.

Fonctionnalité
FreeRoutage de base · 10M/mois
ScaleMarkup 12,5 → 10%
EnterpriseSur devis
USAGE & QUOTAS
Tokens incluspar pack acheté1B / achatcustom
Auto-reload
Sièges équipe325
Workspaces15
Conservation analytics30j1 an
MOTEUR DE ROUTAGE
Smart routing (model=auto)
BYOK fournisseurs
0 % marge sur l'inférence
Fallback automatique
Guardian anti-loop
CORTEX alertes Inbox
CONTRÔLES AVANCÉS
Cache sémantique
A/B testing modèles
Journal d'audit
CORTEX complet (5 phases)
SSO (Google, Microsoft)
Masquage PII
Self-hosted
Règles routage custom
SUPPORT & CONFORMITÉ
Canal de supportEmailPrioritySLA 99.99%
DPA (RGPD)
Financement disponible
Ingénieur dédié

L'inférence est toujours facturée directement par vos fournisseurs LLM, sur vos propres clés. Les prix ci-dessus n'incluent pas l'inférence.

EVERY PLAN INCLUDES

Smart routing across all your BYOK providers
Burn-rate alerting & anomaly detection
Real-time dashboard, per-key analytics
Multi-tenant support, per-key rate limits
Zero prompt logging (GDPR-ready)
OpenAI-compatible API - works with any SDK

BYOK - bring your own keys from any supported provider: LLM (Anthropic, OpenAI, Google, Mistral, Groq, Together AI, Replicate, Cohere…), image (Flux/BFL, Stability AI, fal.ai), video (Kling, Runway, Luma AI), audio (ElevenLabs, HeyGen). Inference is billed directly by your providers. HiWay only charges a % markup on the actual routed cost.

Stop overpaying for
"bonjour"

Your users send simple messages 70% of the time. Why pay Opus prices for a greeting?

Start free

Compared to OpenRouter, Portkey, LiteLLM

Honest side-by-side. Updated 2026-04-22 against each vendor's public docs.

FeatureHiWay2LLMOpenRouterPortkeyLiteLLMRequesty
Bring your own keys (BYOK)
Smart routing by request complexity
OpenAI-compatible API
Automatic fallback across providers
Prompt caching (Anthropic / OpenAI)
Per-workspace analytics + audit log
Burn-rate alerts (budget spikes)
EU hosting by default (GDPR)
self-host
Zero prompt logging
AI self-management (CORTEX)
Pricing model
flat €/mo
% markup
flat + % markup
self-host / SaaS
% markup

native · partial / plugin · not offered. We check these claims against each vendor's public docs - if you spot an inaccuracy, tell us.

Frequently Asked Questions

How does HiWay2LLM reduce my costs?
Most LLM requests don't need the most powerful (and expensive) model. A simple "hello" doesn't need Claude Opus 4.7 at $25/M output tokens - Haiku 4.5 at $5/M handles it perfectly. HiWay2LLM analyzes every request in under 1 millisecond and routes it to the cheapest model in your BYOK roster that can handle it. On typical mixes, customers save 40-60% without changing their code or prompts.
Will the quality of responses decrease?
No. HiWay2LLM only routes simple requests (greetings, short questions, confirmations) to cheaper models. Complex tasks - code generation, multi-step reasoning, agentic tool use - still go to the most powerful models. You can also override routing at any time with the X-Force-Model header if you need a specific model for a request.
How long does it take to integrate?
About 2 minutes. You change one line of code - your base_url. That's it. HiWay2LLM is compatible with any LLM SDK: OpenAI, Anthropic, LangChain, Vercel AI SDK, n8n, curl, and anything that speaks the standard API format. No SDK to install, no config file to maintain.
What LLM providers are supported?
Anthropic (Haiku 4.5, Sonnet 4.6, Opus 4.7), OpenAI (GPT-4o-mini, GPT-4o, GPT-5), Google (Gemini 2.5 Flash Lite, Flash, Pro), Mistral (Small, Large), and DeepSeek (V3, R1). You plug in your own keys for the providers you want to use - HiWay2LLM automatically picks the best price/quality for each request across your enabled set.
Do you store my prompts or responses?
No. Zero prompt logging is a core architectural principle, not just a policy. Your prompts pass through our routing proxy in memory only, are forwarded to the LLM provider, and immediately discarded. No prompt data is ever written to disk. We only store metadata: token counts, model selected, cost, and routing latency.
How does pricing work?
Token packs with three billing modes - Free (2M tokens/mo, no card), Spark ($5.50 once · $5.25/mo · $59.40/yr, 10M tokens), Boost ($25 once · $23.75/mo · $270/yr, 50M tokens), Pro ($85 once · $80.75/mo · $918/yr, 200M tokens), Scale ($360 once · $342/mo · $3,888/yr, 1B tokens), Enterprise on request. Inference is billed separately by your LLM providers on your own accounts - HiWay2LLM applies zero markup. You can switch packs or cancel any time from the dashboard.
What happens when my costs spike?
HiWay2LLM watches your spend in real time and fires burn-rate alerts when a key, agent or workspace drifts above baseline. You get email + Slack notifications the moment something looks off - before the monthly bill does. You set the thresholds; we surface the signal.
What if HiWay2LLM goes down?
We target 99.9% uptime. If our routing proxy is unavailable, your requests will fail with a clear error (502). We recommend implementing a simple fallback in your code that routes directly to your provider if HiWay2LLM is unreachable. This takes 3 lines of code.
Can I force a specific model for certain requests?
Yes. Add the X-Force-Model header to any request to bypass smart routing. For example: X-Force-Model: anthropic/claude-opus-4-7 will always use Opus 4.7 regardless of the complexity score. Useful for critical requests where you always want the best model.
Is this GDPR compliant?
Yes. We're a French company (Hiway2llm.com) hosted on EU servers (OVH, France). We don't store personal data beyond your email. We don't store prompts. We comply with GDPR and the EU AI Act. A Data Processing Agreement (DPA) is available for enterprise clients.
How does this compare to OpenRouter?
OpenRouter is a multi-provider API gateway - you manually choose which model to use. HiWay2LLM is a smart router - it automatically picks the best model for each request based on complexity analysis. OpenRouter adds cost (their fee + no routing savings). HiWay2LLM saves cost (routing to cheaper models offsets the flat subscription fee).
Can I self-host HiWay2LLM?
We offer a fully managed SaaS - no infrastructure to maintain. For enterprise clients with specific compliance or data residency requirements, we offer private deployment options. Contact us to discuss.