BYOK LLM Gateway — what it is and why it matters
BYOK stands for Bring Your Own Keys. A BYOK LLM gateway is a routing layer to which you hand the provider API keys you already own — your OpenAI key, your Anthropic key, your Google key — and which calls those providers on your behalf. The providers bill the inference directly to your account, at their wholesale rate. The gateway charges you a flat fee for the routing, observability, and safety layer on top.
This sounds like a small architectural detail. It is not. It is the biggest shift happening in LLM infrastructure right now, because it realigns the incentives between you and the middle layer.
BYOK vs reseller — the concrete difference
Most LLM gateways you hear about in 2026 are one of two things, even if the marketing blurs the line.
A reseller gateway holds its own accounts with OpenAI, Anthropic, Google and friends. You top up a balance with the gateway, and every request you make, the gateway charges your balance at its own rate. That rate sits on top of the provider's wholesale rate — typically a single-digit percentage markup on every token. No subscription, pure pay-as-you-go. OpenRouter is the canonical example.
A BYOK gateway does not hold accounts with the upstream providers. You hold those accounts. You give the gateway your API keys (encrypted at rest), and the gateway routes your requests through your keys. Inference shows up on your OpenAI/Anthropic/Google invoice, at the price you already negotiated. The gateway makes money from a flat monthly fee, not from a percentage of your token spend.
Mechanically both products let you send a request and get a response. Financially they are opposite animals.
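The mechanic is worth seeing concretely. Here is a minimal, hypothetical sketch of the BYOK flow — the tenant names, key store, and endpoints are invented for illustration, not any real gateway's API — showing why the bill lands on your provider invoice: the gateway forwards each request with *your* stored key, never its own.

```python
# Minimal sketch of the BYOK routing mechanic (all names hypothetical).
# The gateway never resells tokens: it forwards each request upstream with
# the customer's own provider key, so usage bills to the customer's account.

PROVIDER_ENDPOINTS = {
    "openai": "https://api.openai.com/v1/chat/completions",
    "anthropic": "https://api.anthropic.com/v1/messages",
}

# Stand-in for an encrypted-at-rest key store, keyed by (tenant, provider).
KEY_STORE = {
    ("acme-corp", "openai"): "sk-acme-own-key",
}

def route_byok(tenant: str, provider: str, payload: dict) -> dict:
    """Build the upstream request using the tenant's own key."""
    key = KEY_STORE[(tenant, provider)]  # a real system decrypts here
    return {
        "url": PROVIDER_ENDPOINTS[provider],
        # The tenant's key, not the gateway's — this is the whole trick.
        "headers": {"Authorization": f"Bearer {key}"},
        "json": payload,
    }

req = route_byok("acme-corp", "openai", {"model": "gpt-4o", "messages": []})
assert "acme-own-key" in req["headers"]["Authorization"]
```

A reseller gateway would substitute its own key at that `Authorization` line and re-bill you at a markup; everything else about the request looks identical.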
| | Reseller gateway | BYOK gateway |
|---|---|---|
| Who holds provider accounts | The gateway | You |
| Who appears on provider invoice | The gateway | You |
| Pricing model | Per-token markup (~5%) | Flat subscription |
| Your cost when you scale | Grows linearly with usage | Flat (up to a tier) |
| Incentive of the gateway | More of your tokens = more revenue | More of your tokens = same revenue |
| Who absorbs provider price changes | The gateway (good) | You (transparent) |
| Who gets the enterprise pricing you negotiated | The gateway | You |
That last row is the one enterprise teams notice first. If your company has a committed spend agreement with Anthropic that gives you 15% off list, a reseller cannot use that pricing — they route through their own account at their own negotiated rate. A BYOK gateway routes through yours and preserves the discount.
Why the incentive alignment matters
Here is the uncomfortable truth about reseller gateways: they make money when you spend more. That means every feature that reduces your spend — smart routing to cheaper models, aggressive caching, budget caps, auto-downgrade rules — is a feature that eats their own revenue.
Some reseller products still ship those features because they are good citizens and they know buyers want them. But there is a gravity pulling them the other way. When an engineering team inside the reseller chooses between "build feature A that helps the customer save 10%" or "build feature B that gives us 10% more throughput", A wins half the time, not every time.
A BYOK gateway has the opposite gravity. Every dollar of inference it helps you save is a dollar of value it can point to on renewal. Cost controls, smart routing, prompt caching, burn-rate alerts — these are the product, not features in tension with the product. That is why BYOK gateways tend to ship them first and ship them harder.
This is not a moral argument. It is a structural one. Two years of watching the managed-LLM SaaS wave has made it clear: the products that charge a percentage of inference are systematically slower to ship the cost-reduction features customers keep asking for. That is not because the people building them are bad. It is because they are rational, and they are shipping the features that make money.
When BYOK is the right fit
BYOK is not universally better. It has real trade-offs.
BYOK wins when:
- You already have accounts with the main providers and want to pay them directly at wholesale.
- You want smart routing to trim the inference bill (40-85% typical on a mixed workload) — that savings is volume-independent and stacks on top of 0% markup.
- You have negotiated enterprise rates you want to preserve.
- You care about the incentive alignment — you want the routing layer to help you spend less, not more.
- Compliance / procurement wants a clear separation between the infrastructure vendor and the model vendor.
Reseller wins when:
- You are prototyping and do not want to sign up with five different providers to try five different models.
- You want pure pay-as-you-go with zero fixed cost and your volume is low enough that the Free plan on a BYOK gateway (2,500 req/mo on HiWay) wouldn't cover it.
- You want access to niche or community-hosted models (Together, Fireworks, DeepInfra, open-source finetunes) that you cannot reach with direct provider accounts.
- Setup speed matters more than anything else.
The honest exception: if your inference spend is near zero, a $0 subscription (reseller with a tiny absolute markup) is mechanically cheaper than a $15 BYOK subscription. The HiWay Free plan (2,500 req/mo) covers that case without forcing you off BYOK. Beyond that edge, BYOK + smart routing wins regardless of volume — the inference savings dwarf the subscription within hours of real use.
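The break-even arithmetic is easy to check yourself. A quick sketch, using an assumed 5% reseller markup and a $15/mo flat fee (illustrative numbers, not quotes from any product):

```python
# Illustrative break-even: ~5% reseller markup vs a flat $15/mo BYOK fee.
# Both numbers are assumptions for the sketch, not real pricing.

MARKUP = 0.05        # reseller per-token markup
SUBSCRIPTION = 15.0  # flat BYOK fee, $/mo

def reseller_cost(spend: float) -> float:
    # wholesale spend plus the reseller's margin on every token
    return spend * (1 + MARKUP)

def byok_cost(spend: float, routing_savings: float = 0.0) -> float:
    # smart routing trims the wholesale bill itself, then add the flat fee
    return spend * (1 - routing_savings) + SUBSCRIPTION

# Without routing savings, break-even sits at $15 / 0.05 = $300/mo of spend.
print(SUBSCRIPTION / MARKUP)

# With 40% routing savings (the low end of the range above), BYOK wins
# even at modest volume: at $50/mo, 52.50 vs 45.00.
print(reseller_cost(50), byok_cost(50, routing_savings=0.40))
```

The point of the sketch: the subscription-vs-markup comparison only matters when routing savings are zero; once routing cuts the wholesale bill at all, the crossover moves to very low volumes.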
Which gateways offer BYOK in 2026
According to each product's public documentation as of 2026-04-22, the BYOK landscape looks like this:
- HiWay2LLM — BYOK-native. Flat subscription, 0% inference markup, smart routing across providers, EU-hosted.
- Portkey — BYOK-first. Flat subscription. Strong observability focus, with a model router on top.
- LiteLLM — BYOK in both the OSS library and the Cloud product. You self-host OSS with your own keys, or run Cloud and bring your keys.
- Vercel AI Gateway — offers BYOK mode. Sits on Vercel's edge. Most natural for teams already on Vercel.
- Cloudflare AI Gateway — BYOK by default; Cloudflare does not resell tokens, it routes your requests through your keys with caching and observability on top.
- Requesty — BYOK-first routing gateway with a focus on cost optimization.
- OpenRouter — NOT BYOK. Reseller model with per-token markup; no BYOK mode at time of writing.
- Helicone — primarily an observability product. BYOK for the proxying; pricing is on logged requests, not inference markup.
The pattern: the gateways built after 2024 are mostly BYOK. The ones that were built when LLMs were still a novelty — when "giving you access" was the actual product — are mostly reseller. The market is moving in the BYOK direction, and the reseller model is increasingly defended as "convenience for prototyping" rather than "the right way to run production".
BYOK vs reseller — side by side
| Feature | HiWay2LLM | Reseller gateways |
|---|---|---|
| You hold the provider accounts | Yes | No |
| Flat pricing, 0% inference markup | Yes | No — reseller margins sit on top of every token |
| Your enterprise rate is preserved | Yes | No |
| Incentive to help you spend less | Yes | No — reseller revenue grows with your spend |
| Provider price drops hit you immediately | Yes | No — resellers can pocket the delta |
| Time to first call | ~5 min | ~2 min — reseller wins on setup speed |
| Access to community/niche providers | Partial | Yes — resellers often aggregate broader catalogs |
| Predictable monthly cost | Yes | No |
What a good BYOK gateway actually ships
A passthrough proxy with a billing page is not a gateway. A serious BYOK gateway should give you at least five things, because the incentive to ship them is now aligned with yours:
- Smart routing across models. Read the request in under a millisecond, score its difficulty, pick the cheapest model capable of answering. Not every call needs the top tier.
- Prompt caching. Anthropic and OpenAI both expose caching APIs. The gateway should use them automatically and report the hit rate.
- Burn-rate alerts. If an agent starts spending $500/hour at 3am, you want to know before morning, not at the next invoice.
- Per-workspace audit log. Who called what model, when, with which key, for how many tokens, at what cost. Exportable.
- Automatic fallback. When one provider is down or rate-limited, route to the next-best model. Configurable by workspace, not hardcoded.
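To make the first and last items concrete, here is a toy sketch of smart routing with fallback — a difficulty score, a cheapest-capable-first candidate list, and a retry loop. Model names, capability tiers, and prices are all made up for illustration; real routers score requests far more carefully.

```python
# Hypothetical smart-router sketch. Score a request's difficulty, pick the
# cheapest model whose capability covers it, and fall back down the list
# when a provider errors or rate-limits. All tiers/prices are invented.

MODELS = [  # (name, capability score 0-1, $ per 1M output tokens)
    ("small-fast", 0.3, 0.60),
    ("mid-tier",   0.6, 3.00),
    ("frontier",   0.9, 15.00),
]

def difficulty(prompt: str) -> float:
    """Toy heuristic: longer and code-heavy prompts score as harder."""
    score = min(len(prompt) / 4000, 1.0)
    if "```" in prompt or "stack trace" in prompt.lower():
        score = max(score, 0.7)
    return score

def pick_candidates(prompt: str) -> list:
    """Cheapest capable model first; pricier ones become fallbacks."""
    need = difficulty(prompt)
    capable = [m for m in MODELS if m[1] >= need]
    # if nothing qualifies, fall back to the top-tier model
    return sorted(capable or MODELS[-1:], key=lambda m: m[2])

def call_with_fallback(prompt: str, send):
    """Try candidates in order; skip providers that are down/rate-limited."""
    for name, _, _ in pick_candidates(prompt):
        try:
            return send(name, prompt)  # a real gateway calls the provider here
        except RuntimeError:
            continue
    raise RuntimeError("all candidates failed")
```

An easy prompt routes to `small-fast` first; a prompt containing a code fence routes straight to `frontier`; and if the first choice raises, the loop quietly tries the next-cheapest model instead of failing the request.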
If a BYOK gateway does not ship these, it is a thin proxy with a marketing page, not infrastructure. The test is simple: can you name five decisions the gateway makes for you that you would otherwise have to build yourself? If the answer is fewer than three, you are overpaying for a wrapper.
HiWay in this landscape
HiWay is a BYOK gateway with smart routing as its core bet. The five items above are the product. Pricing is flat: Free at 2,500 req/mo, Build at $15/mo for 100K, Scale at $39/mo for 500K, Business at $249/mo for 5M. No percentage markup on inference — ever. EU-hosted on OVH, zero prompt logging by default, DPA on every plan.
The positioning is simple: if you want the routing layer's incentives aligned with your bill going down — smart routing, 0% markup, BYOK, EU hosting — HiWay is built for that. If you just want one key to prototype with 100+ models in two minutes and you do not care about those levers, a reseller model is likely the right call. The two are not in direct competition; they solve different problems.
Bottom line
BYOK is not a feature — it is a category. It separates buying intelligence from operating intelligence. It realigns the infrastructure layer's incentives with yours. And it future-proofs your stack against a market where token prices keep falling and the margin in the middle is getting harder to defend.
If you are on a reseller gateway today, the math almost always tilts toward BYOK + smart routing — the savings are volume-independent, not gated on a spend threshold. If you are starting out, BYOK is still worth understanding now, so you know what you are building on.