LLM Gateway comparison — 13 tools on 10 criteria
There are too many LLM gateways for most teams to evaluate one by one. The space got crowded fast: every cloud, every observability vendor, every AI tooling startup has shipped a gateway. The tooling overlaps. The differentiators are subtle. And the marketing pages all look the same.
This page is a single-shot matrix: 13 tools that come up in most shortlists, compared on 10 criteria that actually differ between them. The goal is to cut the research time from a week to an afternoon. Everything below is based on each tool's public documentation as of 2026-04-22 — if a cell looks wrong to you, tell us and we will fix it.
The 13 tools
The field is wider than this, but these thirteen show up in almost every comparison we see in practice:
- HiWay2LLM — BYOK router with model-level smart routing, EU-hosted.
- OpenRouter — reseller-model aggregator with a huge catalog, US-hosted.
- LiteLLM (OSS) — the self-hosted Python library that became the de facto standard.
- LiteLLM Cloud — the managed version of the library.
- Vercel AI Gateway — BYOK gateway on Vercel's edge, tight AI SDK integration.
- Portkey — BYOK-first gateway with deep observability, enterprise posture.
- Helicone — observability-first, gateway as a side-effect of the proxy.
- Cloudflare AI Gateway — edge-native, BYOK, caching + analytics on Cloudflare.
- LangSmith — LangChain's observability + evaluation suite; gateway-adjacent.
- Requesty — BYOK cost-optimization gateway, smaller and newer.
- Martian — research-originated routing gateway focused on model-quality/cost tradeoffs.
- Unify — router positioned on dynamic provider selection with benchmarks.
- Kong AI Gateway — the AI layer of Kong's API gateway; enterprise-API-gateway DNA.
The 10 criteria
These are the axes that matter in real procurement conversations:
- Pricing model — reseller (% markup on tokens), flat subscription, usage-based on proxied requests, or free/OSS.
- BYOK — do you bring your own provider keys, or does the gateway hold them?
- Smart routing type — none, provider-fallback (same model, different upstream), or model-level (pick a different model based on request difficulty).
- EU hosting — is there a real EU-only data path, including the control plane?
- OpenAI-compatible API — can your existing OpenAI SDK code point at it with just a base_url swap?
- Prompt logging default — are prompts logged by default, or is logging off by default?
- DPA available — can you sign an Article 28 DPA on any plan (yes / enterprise-only / no)?
- Burn-rate alerts — does it proactively alert when spend spikes?
- Model catalog size — rough count of supported models.
- Primary job — what the product is actually optimized for: routing, observability, or edge/API-management.
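The "OpenAI-compatible API" criterion is worth making concrete. The sketch below builds the same chat-completions request against two endpoints using only the standard library; `gateway.example.com` is a placeholder, not a real gateway URL. With a compatible gateway, the path, headers, and body stay identical and only the base URL (and the key) change:

```python
import json
import urllib.request

def build_chat_request(base_url: str, api_key: str,
                       model: str, prompt: str) -> urllib.request.Request:
    """Build a chat-completions request for any OpenAI-compatible endpoint.

    Swapping gateways means changing only base_url (and the key);
    the path, headers, and body stay identical.
    """
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Same call shape, two upstreams; only base_url and the key differ.
# (gateway.example.com is a placeholder, not a real endpoint.)
direct = build_chat_request("https://api.openai.com/v1", "sk-...",
                            "gpt-4o-mini", "Hello")
via_gw = build_chat_request("https://gateway.example.com/v1", "gw-...",
                            "gpt-4o-mini", "Hello")
```

This property is also what keeps switching costs low later: the request bodies sent to both endpoints are byte-for-byte identical.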
The matrix
Linked competitor names go to our deeper /compare/<slug> page where one exists.
| Tool | Pricing | BYOK | Smart routing | EU hosting | OpenAI-compat | Prompt log default | DPA | Burn-rate alerts | Catalog | Primary job |
|---|---|---|---|---|---|---|---|---|---|---|
| HiWay2LLM | Flat subscription, 0% token markup | Yes | Model-level | Yes (OVH, France) | Yes | Off | Every plan | Yes | 60+ | Routing (cost) |
| OpenRouter | Reseller, ~5% markup | No | Provider-fallback | No (US) | Yes | On | Enterprise | No | 300+ | Routing (breadth) |
| LiteLLM OSS | Free (self-host) | Yes | Provider-fallback | Self-host, you choose | Yes | Off (by config) | N/A | Via plugins | 100+ | Routing (self-hosted) |
| LiteLLM Cloud | Subscription + usage | Yes | Provider-fallback | US primary | Yes | Configurable | Enterprise | Via integrations | 100+ | Routing (managed) |
| Vercel AI Gateway | Usage-based on proxied requests | Yes | Provider-fallback | Edge has EU PoPs, control plane US | Yes | Configurable | Enterprise | Partial | 40+ | Routing (Vercel-native) |
| Portkey | Flat subscription + usage | Yes | Model-level + provider-fallback | US primary, EU on enterprise | Yes | Configurable | Higher tiers | Yes | 200+ | Routing + observability |
| Helicone | Usage-based on logged requests | Yes (proxy) | Provider-fallback | US | Yes | On (core product) | Enterprise | Partial | 100+ | Observability |
| Cloudflare AI Gateway | Usage-based | Yes | Provider-fallback | Global edge, regional via add-on | Yes | Configurable | Enterprise | Partial | 50+ | Edge (caching + analytics) |
| LangSmith | Subscription + usage | Yes (tracing) | N/A | US | N/A (traces LangChain) | On (observability) | Enterprise | No | N/A | Observability + eval |
| Requesty | Subscription | Yes | Model-level | US | Yes | Off | Enterprise | Partial | 80+ | Routing (cost) |
| Martian | Subscription | Yes | Model-level (benchmark-driven) | US | Yes | Off | Enterprise | No | 50+ | Routing (quality/cost) |
| Unify | Usage-based | Yes | Model-level (benchmark-driven) | US/UK | Yes | Configurable | Enterprise | No | 100+ | Routing (benchmark-selection) |
| Kong AI Gateway | Enterprise (Kong platform) | Yes | Provider-fallback | Self-host, you choose | Yes | Off (by config) | Enterprise | Via plugins | 40+ | API gateway (AI extension) |
A couple of notes on how to read it:
- "Provider-fallback" means the gateway routes to a different upstream host of the same model if one is down. "Model-level" means the gateway can pick a different model based on the request (cheaper, faster, better-suited). These are different features. Both are useful. They are not the same.
- "Prompt log default: off" does not mean the gateway cannot log — it means the default is off and you turn it on if you need it. That posture matters under GDPR because data you do not store is data you cannot leak.
- "DPA: every plan" vs "DPA: enterprise" is a real procurement friction. A DPA on every plan means you can onboard a compliance-aware customer without a sales cycle.
How to use the matrix
Three honest ways to narrow it down.
Start with the primary job column. If you need observability, most of the routing-first tools are the wrong starting point (and vice versa). Helicone and LangSmith are observability products. Cloudflare and Vercel are edge/platform-native. HiWay, Portkey, Requesty, Martian, Unify are routing-first. LiteLLM is routing-as-a-library. OpenRouter is routing-as-an-aggregator. Kong is an API gateway that acquired an AI layer. Knowing which job you are buying cuts the list by half.
Then filter on pricing model. Reseller vs flat subscription vs usage-based vs OSS are fundamentally different relationships. If you have an internal mandate to avoid percentage-of-spend pricing, OpenRouter is out. If you want zero fixed cost, everything with a subscription is out. This usually takes the list from six to three.
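The reseller-vs-flat tradeoff is simple arithmetic. Assuming a roughly 5% token markup (the OpenRouter figure from the table) and a hypothetical $50/month flat plan, the break-even point is easy to compute:

```python
def reseller_fee(monthly_token_spend: float, markup: float = 0.05) -> float:
    """What a percentage-of-spend gateway adds on top of your token bill."""
    return monthly_token_spend * markup

def breakeven_spend(flat_fee: float, markup: float = 0.05) -> float:
    """Token spend above which a flat subscription is cheaper than a markup."""
    return flat_fee / markup

# Hypothetical numbers: a $50/mo flat plan beats a 5% markup once
# monthly token spend passes ~$1,000; at $5,000/mo the markup costs ~$250.
assert round(breakeven_spend(50.0), 2) == 1000.0
assert round(reseller_fee(5000.0), 2) == 250.0
```

The point is not the specific numbers but the shape: percentage pricing scales with your success, flat pricing does not.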
Then filter on the compliance columns that apply to you. EU hosting, prompt logging default, DPA availability — these are yes/no questions for most buyers. If EU hosting is mandatory, the list collapses to HiWay, LiteLLM self-hosted, and Kong self-hosted, plus the enterprise-EU-deployment options on Portkey and Vercel. That is usually a short enough list to demo.
Three filters, ten minutes, shortlist. The rest is detail.
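The three filters are mechanical enough to express as code. The excerpt below hard-codes a few simplified rows from the matrix (`eu` is True only where a real EU path exists on standard plans, and mixed cells are flattened for illustration), for a buyer who needs a routing-first, non-reseller, EU-hosted tool:

```python
# A few rows from the matrix above, simplified for illustration.
tools = [
    {"name": "HiWay2LLM",  "job": "routing",       "pricing": "flat",     "eu": True},
    {"name": "OpenRouter", "job": "routing",       "pricing": "reseller", "eu": False},
    {"name": "Portkey",    "job": "routing",       "pricing": "flat",     "eu": False},
    {"name": "Helicone",   "job": "observability", "pricing": "usage",    "eu": False},
]

shortlist = [
    t["name"] for t in tools
    if t["job"] == "routing"        # filter 1: primary job
    and t["pricing"] != "reseller"  # filter 2: pricing model
    and t["eu"]                     # filter 3: compliance
]
```

In practice you would apply the same three predicates to the full table; the order does not matter, but the job filter tends to eliminate the most rows first.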
What the matrix cannot tell you
A wide table like this is a decent first screen, but there are three things it hides.
1. Maturity of the routing logic. "Model-level smart routing" covers a spectrum from "a rule you hand-write in config" to "a scoring model trained on millions of requests". The quality difference is huge, and it only shows up in production. Demo the top 2–3 candidates with your own traffic.
2. Support posture. A gateway from a 10-person startup and a gateway from Cloudflare behave differently when something breaks at 3am. Neither is automatically better — startups ship faster, large vendors are more stable — but the ops profile is different.
3. Vendor lock-in shape. "OpenAI-compatible API" reduces lock-in a lot, but not to zero. Workspace primitives, analytics schemas, fallback rule syntax — these are all vendor-specific. Migrating is usually an afternoon, but "usually" is doing work in that sentence.
Use the matrix to shortlist. Use a one-week POC with real traffic to pick the winner.
Side-by-side deep-dives
The /compare/<slug> pages go deeper than this page ever can — pricing math, migration code, feature-by-feature comparisons, FAQ per competitor. The ones currently published:
- HiWay2LLM vs OpenRouter
- HiWay2LLM vs LiteLLM
- HiWay2LLM vs Vercel AI Gateway
- HiWay2LLM vs Portkey
- HiWay2LLM vs LangSmith
- HiWay2LLM vs Requesty
More are on the roadmap. If there is a specific comparison you need and we do not have it, tell us.
Bottom line
Thirteen tools, ten criteria, and the real decision almost always comes down to three filters: what job you are hiring the gateway for, what pricing model fits your cost curve, and which compliance cells you need to tick. Everything else is useful detail once the shortlist is three or four products — but it is noise if you try to evaluate all thirteen on every axis in parallel.
2,500 requests/mo free, model-level smart routing, no credit card