OpenAI Python SDK

Drop-in: change one line, route through HiWay.

The OpenAI Python SDK is the most common entry point. HiWay implements the same wire protocol, so your existing code keeps working — you just override the base_url and use your hwy_ API key.

Minimal example

app.py
from openai import OpenAI

client = OpenAI(
    base_url="https://www.hiway2llm.com/v1",
    api_key="hwy_YOUR_KEY",
)

# ChatCompletion objects don't expose response headers; use
# with_raw_response to read HiWay's routing header alongside the body.
raw = client.chat.completions.with_raw_response.create(
    model="claude-haiku",
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarise the French Revolution in 3 lines."},
    ],
    max_tokens=200,
)
response = raw.parse()
print(response.choices[0].message.content)
print("Routed to:", response.model)
print("Latency (routing only):", raw.headers.get("X-Hiway-Routing-Ms"), "ms")
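The X-Hiway-Routing-Ms header arrives as a string, and is absent when the client points at a non-HiWay endpoint, so it is worth parsing defensively. A small sketch, where routing_ms is a hypothetical helper, not part of any SDK:

```python
def routing_ms(headers):
    """Read HiWay's X-Hiway-Routing-Ms header as a float, or None if absent.

    routing_ms is an illustrative helper, not part of the SDK; it accepts
    any mapping with a .get() method, such as a raw response's headers.
    """
    value = headers.get("X-Hiway-Routing-Ms")
    return float(value) if value is not None else None

# Works on a plain dict, so it is easy to exercise offline.
latency = routing_ms({"X-Hiway-Routing-Ms": "12.5"})  # 12.5
missing = routing_ms({})  # None
```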

Streaming

python
stream = client.chat.completions.create(
    model="claude-haiku",
    messages=[{"role": "user", "content": "Count to ten slowly"}],
    stream=True,
)
for chunk in stream:
    # Some chunks (e.g. a trailing usage chunk) may carry an empty choices list.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
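When you want the whole reply as well as the live printout, the deltas can be buffered and joined. A minimal sketch, with collect_stream as a hypothetical helper; the stub chunks only mimic the shape of the SDK's stream events so the logic can run offline:

```python
from types import SimpleNamespace as Chunk

def collect_stream(stream):
    """Join the non-empty delta fragments of a chat-completions stream.

    collect_stream is an illustrative helper, not part of the OpenAI SDK;
    it works on any iterable of chunk-shaped objects.
    """
    parts = []
    for chunk in stream:
        # Guard against chunks that carry no choices at all.
        if chunk.choices and chunk.choices[0].delta.content:
            parts.append(chunk.choices[0].delta.content)
    return "".join(parts)

# Offline demo with stub chunks shaped like the SDK's stream events.
demo = [
    Chunk(choices=[Chunk(delta=Chunk(content="Hel"))]),
    Chunk(choices=[Chunk(delta=Chunk(content="lo"))]),
    Chunk(choices=[]),
]
full_reply = collect_stream(demo)
```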

Force a specific model

Use the extra_headers parameter to pin the request to a specific model and bypass routing.

python
response = client.chat.completions.create(
    model="claude-haiku",  # ignored when X-Force-Model is set
    messages=[{"role": "user", "content": "Critical query"}],
    extra_headers={"X-Force-Model": "anthropic/claude-opus-4-6"},
)

All chat-completions parameters are forwarded as-is — temperature, top_p, tools, tool_choice, response_format, max_tokens, stream, seed, etc.