From direct provider APIs (OpenAI / Anthropic)

Keep your provider account, add a router — smart model selection without rewriting a line of logic.

This guide covers adding HiWay2LLM in front of a codebase that currently calls OpenAI or Anthropic directly. You're not replacing your provider account — you're inserting a thin routing layer that can pick the cheapest capable model per request (model: "auto"), add semantic cache, and give you per-workspace analytics. The swap itself takes about 5 minutes.

Prerequisites

A HiWay account and your existing OpenAI / Anthropic key configured in Dashboard → Settings → Providers. You keep paying OpenAI / Anthropic directly at their published rates — HiWay just routes on top with your key.

Step 1: Get your HiWay API key

Open Dashboard → Keys and click New key. You'll swap your OPENAI_API_KEY / ANTHROPIC_API_KEY for a hw_live_ key in your app config.
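In many codebases the swap is just an environment variable change. A sketch, assuming your app reads its key from OPENAI_API_KEY and you use the official openai SDK (which also honors OPENAI_BASE_URL) — variable names vary by codebase:

```shell
# Before: direct provider key
# export OPENAI_API_KEY="sk-..."

# After: HiWay key; the base URL can move to an env var too
export OPENAI_API_KEY="hw_live_YOUR_KEY"
export OPENAI_BASE_URL="https://app.hiway2llm.com/v1"
```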

Step 2: Swap your configuration

You were pointing at https://api.openai.com/v1 or https://api.anthropic.com/v1 with the provider's key. Swap the base URL to https://app.hiway2llm.com/v1 and the key to hw_live_.... Your messages, tools, streaming behavior, and error handling stay identical — HiWay speaks the OpenAI wire format and pipes through Anthropic / Google / Mistral behind the scenes. Set model: "auto" to enable smart routing, or pin the model you were already using via its provider-prefixed id (openai/gpt-4o, anthropic/claude-sonnet-4-5, etc.) to get observability + cache + fallback without any routing behavior change.

from openai import OpenAI

client = OpenAI(
    base_url="https://app.hiway2llm.com/v1",
    api_key="hw_live_YOUR_KEY",
)

# Option A: let the router pick the cheapest capable model
response = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Summarize this meeting transcript"}],
)

# Option B: keep pinning the exact model, still get cache + fallback + analytics
response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Summarize this meeting transcript"}],
)

print(response.choices[0].message.content)

Step 3: Verify

Send one request. Inspect X-HiWay-Routed-Model on the response — if you passed model: "auto", you'll usually see the router picked something cheaper than your flagship baseline. Open Dashboard → Usage to see your savings vs always routing to Claude Opus 4.7 (our reference flagship).
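If you want to script the check, you can read the header off the raw response. The routed_model helper below is a hypothetical sketch — with_raw_response is the openai-python SDK's way to reach response headers, and the header name is the one this guide documents:

```python
def routed_model(headers):
    """Return the model the router actually used, or None if absent.

    Lookup is case-insensitive because HTTP header casing varies.
    """
    for name, value in headers.items():
        if name.lower() == "x-hiway-routed-model":
            return value
    return None

# With a live client (network required):
#   raw = client.chat.completions.with_raw_response.create(
#       model="auto",
#       messages=[{"role": "user", "content": "ping"}],
#   )
#   print(routed_model(raw.headers))
#   print(raw.parse().choices[0].message.content)

# Offline demo:
print(routed_model({"X-HiWay-Routed-Model": "openai/gpt-4o-mini"}))  # openai/gpt-4o-mini
```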

Gotchas specific to direct provider APIs

  • Your provider billing doesn't change: OpenAI / Anthropic keep charging your card at their published rates. HiWay adds a flat monthly subscription for the routing layer — no per-token markup, ever.
  • Model names: pin or auto: pass model: "auto" for smart routing, or pin with model: "openai/gpt-4o" / model: "anthropic/claude-sonnet-4-5". HiWay's canonical ids prefix the provider — if you were using the bare provider name (e.g. gpt-4o), you'll want to update it or rely on auto.
  • Streaming is identical: pass stream: true, consume SSE chunks the same way. HiWay forwards byte-for-byte.
  • Tools / function calling are identical: the OpenAI tools / tool_choice fields work unchanged, including against Anthropic models (HiWay translates to Anthropic's tool format transparently).
  • Error codes are identical: 401 for auth, 429 for rate limit, 5xx for upstream failures. HiWay adds two signals of its own — a 402 status for Budget Control blocks and a guardian_block error body when Guardian trips.

You're done

Should take less than 10 minutes end-to-end — often closer to 5 since you don't have to set up a second provider account. Ping support@hiway2llm.com if you're stuck.