Streaming responses
How HiWay forwards Server-Sent Events end-to-end.
HiWay supports the full SSE streaming protocol natively. Set stream: true in your request and you'll get the standard data: {...}\n\n chunks incrementally as tokens are generated, exactly as if you were talking to the provider directly.
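To make the wire format concrete, here is a minimal sketch of how a client might decode those data: {...}\n\n chunks by hand. The function name parse_sse_events and the sample payloads are illustrative, and the data: [DONE] terminator is an assumption based on the OpenAI streaming convention, not something this page specifies. The sketch also treats every data: line as one complete JSON payload, which is a simplification of the general SSE rules.

```python
import json


def parse_sse_events(raw: str) -> list[dict]:
    """Decode the JSON payloads from a raw SSE body (simplified sketch)."""
    payloads = []
    # SSE events are separated by a blank line.
    for event in raw.split("\n\n"):
        for line in event.splitlines():
            if not line.startswith("data:"):
                continue  # ignore comments and other SSE fields
            data = line[len("data:"):].strip()
            if data == "[DONE]":  # assumed OpenAI-style end-of-stream marker
                return payloads
            payloads.append(json.loads(data))
    return payloads


# Hypothetical sample body shaped like OpenAI-compatible streaming chunks.
sample = (
    'data: {"choices": [{"delta": {"content": "Hel"}}]}\n\n'
    'data: {"choices": [{"delta": {"content": "lo"}}]}\n\n'
    "data: [DONE]\n\n"
)

events = parse_sse_events(sample)
text = "".join(e["choices"][0]["delta"].get("content", "") for e in events)
print(text)
```

In practice an OpenAI-compatible SDK does this parsing for you; the sketch is only meant to show what travels over the connection.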
Latency impact
First-token latency is provider latency + ~5 ms of routing. We don't buffer the stream, don't rewrite chunks, don't add a JSON wrapper. Your client sees the same SSE events the provider would have sent.
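If you want to check first-token latency from your own client, one rough approach is to time how long it takes for the first chunk to arrive. The helper below is a sketch, not part of HiWay or the OpenAI SDK: first_chunk_latency and its open_stream argument are hypothetical names, and the number it reports includes your own network round trip as well as provider and routing time. Note also that the first chunk is not guaranteed to carry text.

```python
import time


def first_chunk_latency(open_stream, clock=time.perf_counter):
    """Return (seconds until first chunk, first chunk, remaining iterator).

    open_stream is any zero-argument callable that sends the request and
    returns an iterable of chunks, so request setup is included in the timing.
    """
    start = clock()
    iterator = iter(open_stream())
    first = next(iterator, None)  # None if the stream ended without chunks
    return clock() - start, first, iterator


# Stand-in stream for demonstration; a real call might wrap
# client.chat.completions.create(..., stream=True) in a lambda instead.
def fake_stream():
    yield "Hel"
    yield "lo"


latency, first, rest = first_chunk_latency(fake_stream)
print(f"first chunk {first!r} after {latency * 1000:.2f} ms")
```

Comparing this figure against the same measurement taken directly against the provider gives an estimate of the routing overhead, though network variance can easily exceed a few milliseconds.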
Tool calls in streams
Tool/function call deltas stream through unchanged. Whatever the provider puts in tool_calls inside a streaming chunk, HiWay forwards as-is, so your OpenAI-compatible client can parse the deltas without any adapter.
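Because the deltas arrive in fragments, the client has to stitch them back together. The sketch below shows one way to do that on plain dicts, assuming the provider follows the OpenAI Chat Completions streaming shape (each tool_calls entry has an index, an optional id, and a function with partial name and arguments strings). The function merge_tool_call_deltas and the sample data are illustrative, not part of HiWay.

```python
import json


def merge_tool_call_deltas(deltas):
    """Combine streamed tool_calls fragments into complete calls, by index."""
    calls = {}
    for delta in deltas:
        for part in delta.get("tool_calls") or []:
            call = calls.setdefault(
                part["index"], {"id": None, "name": "", "arguments": ""}
            )
            if part.get("id"):
                call["id"] = part["id"]
            fn = part.get("function") or {}
            # Name and arguments arrive as string fragments to concatenate.
            call["name"] += fn.get("name") or ""
            call["arguments"] += fn.get("arguments") or ""
    return [calls[i] for i in sorted(calls)]


# Hypothetical deltas, shaped like chunk.choices[0].delta converted to dicts.
sample_deltas = [
    {"tool_calls": [{"index": 0, "id": "call_1",
                     "function": {"name": "get_weather", "arguments": ""}}]},
    {"tool_calls": [{"index": 0, "function": {"arguments": '{"city": '}}]},
    {"tool_calls": [{"index": 0, "function": {"arguments": '"Paris"}'}}]},
    {"content": "done"},  # plain content delta, no tool call
]

merged = merge_tool_call_deltas(sample_deltas)
args = json.loads(merged[0]["arguments"])  # parse only once the stream ends
print(merged[0]["name"], args)
```

The arguments string is only valid JSON once every fragment has arrived, which is why parsing is deferred until after the merge.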
Client-side example
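from openai import OpenAI

# Point the standard OpenAI client at HiWay's OpenAI-compatible endpoint.
client = OpenAI(base_url="https://app.hiway2llm.com/v1", api_key="hw_live_YOUR_KEY")

stream = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Write a haiku about routers"}],
    stream=True,
)

# Print each content delta as soon as it arrives.
for chunk in stream:
    if not chunk.choices:
        continue  # some providers emit chunks with no choices (e.g. usage-only)
    delta = chunk.choices[0].delta.content or ""
    print(delta, end="", flush=True)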