Now Live · Smart LLM Routing · BYOK

Use the best model.
Pay for the cheapest.

HiWay2LLM analyzes every request in <1ms and routes it to the optimal model across your own API keys. Simple messages go to cheap models. Complex tasks go to powerful ones. You save 40-60% on typical mixes, at zero markup.

<1ms

Routing latency

0%

Inference markup

0

Prompts stored

5

Providers supported

How it fits together

One thin layer between your app and the models

HiWay2LLM sits between your code and the LLM providers. Your keys. Your data. Our routing intelligence.

Customer chatbot
Autonomous agent
RAG pipeline
CLI / script
1. request
4. response
Routing layer
HiWay2LLM
Smart routing
Picks the cheapest capable model per request.
BYOK vault
Your provider keys, AES-GCM encrypted per workspace.
0% markup
Providers bill you directly. We take nothing on inference.
Guardian
Anti-loop + burn-rate kill-switch before a bad call ships.
Sub-millisecond routing
< 1 ms
2. routed
3. stream
Anthropic · BYOK
OpenAI · BYOK
Google · BYOK
Mistral · BYOK
Groq · BYOK
xAI · BYOK
40-60%
typical savings vs always-flagship
0%
markup on inference — ever
< 1 ms
routing decision latency
10+
providers supported, OpenAI-compatible API

Get started in 3 steps

From signup to your first routed request in under 2 minutes.

1

Sign up

Create an account in 30 seconds. Email + password, free tier activates immediately — 2.5K routed requests/month, no credit card.

2

Add your provider keys

Plug in your own Anthropic, OpenAI, Google, Mistral or DeepSeek API keys. They stay encrypted on our side and you keep billing with your providers directly. Zero markup on inference.

3

Change one line. Ship.

Point your SDK's base_url at HiWay2LLM. One endpoint reaches every model you've enabled, and the router picks the cheapest model that can handle each request. OpenAI-compatible. Works with any SDK.

HIWAY_API_KEY=••••••••••••••

Change one line. Save 50%.

Point your existing code to HiWay2LLM. We handle the rest.

app.py
from openai import OpenAI
- client = OpenAI(base_url="https://api.anthropic.com/v1")
+ client = OpenAI(base_url="https://app.hiway2llm.com/v1")
# That's it. Same code. 50% cheaper.

Light

Haiku 4.5 / GPT-4o-mini / Gemini 2.5 Flash Lite

65% of requests

Standard

Sonnet 4.6 / GPT-4o / Gemini 2.5 Flash

28% of requests

Heavy

Opus 4.7 / GPT-5 / Gemini 2.5 Pro

7% of requests

Not just routing. Intelligence.

7 analyzers, burn-rate alerting, and multi-provider optimization — built for production with your own keys.

< 1ms Smart Routing

7 analyzers detect intent, complexity, tools, and code in under a millisecond. No LLM call for routing — pure CPU.

Control Layer — Anti-drift

Baseline every agent; detect prompt inflation, silent escalations to premium models, and pricing drift. Alerts, rollback, per-agent budgets. Built for CTOs who want total control of their LLM spend.

Burn-rate Alerting

We watch your spend in real time. Burn-rate thresholds, anomaly detection, and per-key alerts fire the moment something looks off — before your monthly bill does.
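The core of a burn-rate threshold can be sketched in a few lines. The semantics here (alert when the current window exceeds a multiple of baseline) are our assumption for illustration, not HiWay2LLM's documented algorithm:

```python
# Burn-rate check sketch: alert when spend in the current window exceeds
# the baseline by a configured multiple. Semantics assumed for illustration.
def burn_rate_alert(window_spend: float, baseline: float,
                    multiple: float = 2.0) -> bool:
    """True when the current window burns `multiple`x faster than baseline."""
    return window_spend > baseline * multiple

burn_rate_alert(45.0, 10.0)  # → True  (4.5x baseline)
burn_rate_alert(15.0, 10.0)  # → False (1.5x baseline)
```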

Advanced Budget Controls

No LLM provider offers this. Set daily/monthly caps, per-model limits, off-hours rules, and automatic degradation.

Usage Reporting

Per-user CSV exports, daily breakdowns by model, token-level cost attribution. Plug it into your invoicing or your accounting in two clicks.

Multi-Provider Optimization

Anthropic, OpenAI, Google, Mistral, DeepSeek — we pick the best price/quality on each request across all the providers you've enabled.

1 Line Integration

Change your base_url. That's it. Compatible with any LLM SDK — OpenAI, Anthropic, LangChain, Vercel AI, n8n.

Zero Prompt Logging

Your prompts never touch our disk. Architectural guarantee. GDPR and EU AI Act compliant.

Open source · MIT

Ship with an SDK. Today.

30-second CLI, OpenAI-compatible Python + TypeScript SDKs. Zero vendor lock-in — you can leave HiWay without touching a line of application code.

Recommended

CLI

One-line install, signup from the terminal, first call without writing code. Perfect for kicking the tires before integrating.

npm i -g @hiway2llm/cli
hw signup
hw chat "explain this in 3 bullets"

Python

Drop-in import. Every method from the OpenAI SDK works — we just route to the right model.

pip install hiway2llm

from hiway2llm import Hiway
cli = Hiway(api_key="hw_live_...")
cli.chat("Say hi")

TypeScript

Native fetch client, works in Node and Edge runtimes (Vercel, Cloudflare Workers).

npm i @hiway2llm/client

import { Hiway } from "@hiway2llm/client";
const h = new Hiway({ apiKey: "hw_live_..." });
await h.chat("Say hi");

Simple plans. Your keys, our brain.

Pay for routing intelligence, not for inference. Inference is billed by your own providers at their published prices — we never touch it.

Free

$0/mo

2.5K routed requests / month

Try it with your own keys. No credit card. Upgrade when you're ready.

Start free

Build

$15/mo

100K routed requests / month

Typical savings 40-60%

For solo devs and side-projects. All routing, all providers, real-time alerts.

Subscribe
Popular

Scale

$39/mo

500K routed requests / month

Typical savings 40-60%

For production workloads. Higher quotas, priority support, advanced controls.

Subscribe

Business

$249/mo

5M routed requests / month

Typical savings 40-60%

For teams shipping at scale. SSO, audit log, per-agent budgets, dedicated Slack.

Subscribe

Compare plans

The full detail. Quotas, advanced features, support tiers, side by side.

Feature
Free · $0/mo
Build · $15/mo
Scale · $39/mo
Business · $249/mo
Enterprise · On request
Usage & limits
Routed requests / month: 2,500 · 100,000 · 500,000 · 5,000,000 · 25M+
Burst limit: 30/min · 60/min · 300/min · 600/min
Overage price: $8/100k · $5/100k · $3/100k · custom
Team seats: 1 · 3 · 10 · 25
Workspaces: 1 · 1 · 3
Analytics retention: 7 days · 30 days · 90 days · 1 year
Routing engine
Smart routing (model=auto)
BYOK provider vault
0% markup on inference
Automatic provider fallback
Guardian anti-loop + kill-switch
Advanced controls
Semantic cache
A/B testing on models
Audit logs
SSO (Google, Microsoft)
PII masking
Self-hosted option
Custom routing rules
Support & compliance
Support channel: Community · Email · Priority · SLA 99.9% · SLA 99.99%
DPA (GDPR)
Dedicated support engineer

Inference is always billed to you directly by the LLM providers on your own keys. Subscription prices above do not include inference.

EVERY PLAN INCLUDES

Smart routing across all your BYOK providers
Burn-rate alerting & anomaly detection
Real-time dashboard, per-key analytics
Multi-tenant support, per-key rate limits
Zero prompt logging (GDPR-ready)
OpenAI-compatible API — works with any SDK

BYOK — you plug in your own Anthropic / OpenAI / Google / Mistral / DeepSeek keys. Inference stays billed by your providers at 0% markup.

How much you'd save

€30 · €100 · €300 · €1,000 · €3,000 · €10,000 · €30,000
€200 /month
Estimated savings (40-60%): €80-€120 /month
Estimated requests: 20,000 /month
Recommended plan: Build (€15/month)
ROI vs subscription: ×6.7
Net monthly gain: +€85
Over 12 months: €1,020

Estimate based on average 40-60% savings on inference, typical mix 65% light / 28% standard / 7% heavy. Varies with your actual workload.
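The calculator's figures can be reproduced in a few lines. Plan price and savings rates come from this page; taking the midpoint of the 40-60% range is our convention for the sketch:

```python
# Reproduce the savings calculator: €200/mo inference spend on Build (€15/mo).
def estimate(monthly_spend: float, sub_price: float = 15.0,
             low: float = 0.40, high: float = 0.60):
    save_lo, save_hi = monthly_spend * low, monthly_spend * high
    midpoint = (save_lo + save_hi) / 2
    net = midpoint - sub_price          # monthly gain after the subscription
    return save_lo, save_hi, round(midpoint / sub_price, 1), net, net * 12

estimate(200)  # → (80.0, 120.0, 6.7, 85.0, 1020.0)
```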

Stop overpaying for
"hello"

Your users send simple messages 70% of the time. Why pay Opus prices for a greeting?

Start free

Compared to OpenRouter, Portkey, LiteLLM

Honest side-by-side. Updated 2026-04-22 against each vendor's public docs.

Feature · HiWay2LLM · OpenRouter · Portkey · LiteLLM · Requesty
Bring your own keys (BYOK)
Smart routing by request complexity
OpenAI-compatible API
Automatic fallback across providers
Prompt caching (Anthropic / OpenAI)
Per-workspace analytics + audit log
Burn-rate alerts (budget spikes)
EU hosting by default (GDPR)
self-host
Zero prompt logging
Pricing model: flat €/mo · % markup · flat + % markup · self-host / SaaS · % markup

native · partial / plugin · not offered. We check these claims against each vendor's public docs — if you spot an inaccuracy, tell us.

Frequently Asked Questions

How does HiWay2LLM reduce my costs?
Most LLM requests don't need the most powerful (and expensive) model. A simple "hello" doesn't need Claude Opus 4.7 at $25/M output tokens — Haiku 4.5 at $5/M handles it perfectly. HiWay2LLM analyzes every request in under 1 millisecond and routes it to the cheapest model in your BYOK roster that can handle it. On typical mixes, customers save 40-60% without changing their code or prompts.
Will the quality of responses decrease?
No. HiWay2LLM only routes simple requests (greetings, short questions, confirmations) to cheaper models. Complex tasks — code generation, multi-step reasoning, agentic tool use — still go to the most powerful models. You can also override routing at any time with the X-Force-Model header if you need a specific model for a request.
How long does it take to integrate?
About 2 minutes. You change one line of code — your base_url. That's it. HiWay2LLM is compatible with any LLM SDK: OpenAI, Anthropic, LangChain, Vercel AI SDK, n8n, curl, and anything that speaks the standard API format. No SDK to install, no config file to maintain.
What LLM providers are supported?
Anthropic (Haiku 4.5, Sonnet 4.6, Opus 4.7), OpenAI (GPT-4o-mini, GPT-4o, GPT-5), Google (Gemini 2.5 Flash Lite, Flash, Pro), Mistral (Small, Large), and DeepSeek (V3, R1). You plug in your own keys for the providers you want to use — HiWay2LLM automatically picks the best price/quality for each request across your enabled set.
Do you store my prompts or responses?
No. Zero prompt logging is a core architectural principle, not just a policy. Your prompts pass through our routing proxy in memory only, are forwarded to the LLM provider, and immediately discarded. No prompt data is ever written to disk. We only store metadata: token counts, model selected, cost, and routing latency.
How does pricing work?
Flat monthly (or annual) subscription for routing intelligence — Free (2.5K req/mo), Build ($15/mo, 100K), Scale ($39/mo, 500K), Business ($249/mo, 5M), Enterprise on request. Inference is billed separately by your LLM providers on your own accounts — HiWay2LLM applies zero markup. You can upgrade, downgrade or cancel any time from the dashboard.
What happens when my costs spike?
HiWay2LLM watches your spend in real time and fires burn-rate alerts when a key, agent or workspace drifts above baseline. You get email + Slack notifications the moment something looks off — before the monthly bill does. You set the thresholds; we surface the signal.
What if HiWay2LLM goes down?
We target 99.9% uptime. If our routing proxy is unavailable, your requests will fail with a clear error (502). We recommend implementing a simple fallback in your code that routes directly to your provider if HiWay2LLM is unreachable. This takes 3 lines of code.
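That fallback can look like this. The direct-provider URL is just an example; any OpenAI-compatible base URL you already have keys for works, and the exact error set you catch is up to you:

```python
# Fallback sketch: try the router first, go direct to a provider on transport
# errors. Endpoint choice and caught exceptions are illustrative assumptions.
PRIMARY = "https://app.hiway2llm.com/v1"
FALLBACK = "https://api.openai.com/v1"  # example direct-provider endpoint

def with_fallback(call):
    """Run `call(base_url)` against the router; retry direct if unreachable."""
    try:
        return call(PRIMARY)
    except (ConnectionError, TimeoutError, OSError):
        return call(FALLBACK)
```

Wrap your SDK call in `with_fallback`, constructing the client with whichever `base_url` it receives.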
Can I force a specific model for certain requests?
Yes. Add the X-Force-Model header to any request to bypass smart routing. For example: X-Force-Model: anthropic/claude-opus-4-7 will always use Opus 4.7 regardless of the complexity score. Useful for critical requests where you always want the best model.
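A forced-model request can be sketched with the standard library alone. The `X-Force-Model` header name comes from this FAQ; the endpoint path and payload shape assume the usual OpenAI-compatible chat format:

```python
# Forced-model request sketch, stdlib only. Header name from the FAQ above;
# payload shape assumes the OpenAI-compatible chat completions format.
import json
import urllib.request

req = urllib.request.Request(
    "https://app.hiway2llm.com/v1/chat/completions",
    data=json.dumps({
        "model": "auto",
        "messages": [{"role": "user", "content": "Review this clause."}],
    }).encode(),
    headers={
        "Authorization": "Bearer hw_live_...",
        "Content-Type": "application/json",
        "X-Force-Model": "anthropic/claude-opus-4-7",  # bypass smart routing
    },
)
# urllib.request.urlopen(req)  # send it once you've set a real key
```

With an SDK, the equivalent is passing the same header through whatever per-request header mechanism your client exposes.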
Is this GDPR compliant?
Yes. We're a French company (Mytm-Group SAS) hosted on EU servers (OVH, France). We don't store personal data beyond your email. We don't store prompts. We comply with GDPR and the EU AI Act. A Data Processing Agreement (DPA) is available for enterprise clients.
How does this compare to OpenRouter?
OpenRouter is a multi-provider API gateway — you manually choose which model to use. HiWay2LLM is a smart router — it automatically picks the best model for each request based on complexity analysis. OpenRouter adds cost (their fee + no routing savings). HiWay2LLM saves cost (routing to cheaper models offsets the flat subscription fee).
Can I self-host HiWay2LLM?
We offer a fully managed SaaS — no infrastructure to maintain. For enterprise clients with specific compliance or data residency requirements, we offer private deployment options. Contact us to discuss.