Multimodal quickstart
Generate your first image, audio, or embedding in under 5 minutes.
Prerequisite
Your workspace must have multimodal_enabled = TRUE (ask an admin). Add your provider API keys under Dashboard → API Keys before calling any /v2/* endpoint.
The multimodal module runs as a separate service on port 4010 (https://app.hiway2llm.com routes /v2/* to it transparently). It shares your JWT authentication and BYOK vault with the core LLM router.
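Because every /v2/* call goes to the same host with the same bearer key, the request setup can be factored into one place. The following is a minimal sketch using only the Python standard library; `BASE_URL` and `build_v2_request` are hypothetical names chosen for illustration, not part of any HiWay2LLM SDK, and the payload mirrors the image example in step 2.

```python
import json
import urllib.request

BASE_URL = "https://app.hiway2llm.com"  # host used throughout this guide


def build_v2_request(path: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for a /v2/* endpoint.

    Hypothetical helper for illustration only.
    """
    if not path.startswith("/v2/"):
        raise ValueError("multimodal endpoints live under /v2/")
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_v2_request(
    "/v2/image/generate",
    {
        "prompt": "A cinematic shot of a futuristic city at night",
        "provider": "fal",
        "model": "fal-ai/flux/schnell",
    },
    "hw_live_YOUR_KEY",
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it; building and sending are kept separate here so the request can be inspected first.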
1. Add a multimodal provider key
Go to Dashboard → API Keys → Add BYOK key and select a multimodal provider: fal (images + video), openai (images + audio + embeddings), elevenlabs (TTS), deepgram (STT), or stability (images).
Provider setup guides
Need help getting an API key? Each provider has a blog article with step-by-step instructions: [OpenAI](/blog/get-openai-api-key) · [fal.ai](/blog/get-fal-ai-api-key) · [ElevenLabs](/blog/get-elevenlabs-api-key) · [Stability AI](/blog/get-stability-ai-api-key) · [Together AI](/blog/get-together-ai-api-key) · [Replicate](/blog/get-replicate-api-key) · [Cohere](/blog/get-cohere-api-key) · [BFL / Flux](/blog/get-bfl-flux-api-key)
Two requirements for multimodal
1. Add a BYOK key for a supported multimodal provider in Dashboard → API Keys (fal.ai, Stability AI, ElevenLabs, OpenAI, Cohere, etc.). 2. Make sure multimodal is enabled on your workspace: Settings → Multimodal → Enable. Without both, /v2/* endpoints return 403.
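When a call fails, the 403 case above is worth checking first. Here is a small sketch of how a client might turn the status code into a hint. `explain_v2_status` is a hypothetical helper, and only the 403 behaviour comes from this guide; the handling of other status codes is a generic assumption.

```python
def explain_v2_status(status_code: int) -> str:
    """Return a human-readable hint for the status of a /v2/* response."""
    if status_code == 403:
        # The one failure mode documented in this guide.
        return (
            "403: check that a BYOK key for a multimodal provider is saved under "
            "Dashboard → API Keys and that multimodal is enabled under "
            "Settings → Multimodal."
        )
    if 200 <= status_code < 300:
        return "ok"
    # Assumption: other codes are not described here, so defer to the body.
    return f"unexpected status {status_code}; inspect the response body"


print(explain_v2_status(403))
```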
2. Generate an image
curl https://app.hiway2llm.com/v2/image/generate \
-H "Authorization: Bearer hw_live_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"prompt": "A cinematic shot of a futuristic city at night",
"provider": "fal",
"model": "fal-ai/flux/schnell",
"aspect": "landscape",
"seed": 42
}'
3. Text-to-speech
curl https://app.hiway2llm.com/v2/audio/tts \
-H "Authorization: Bearer hw_live_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"text": "Hello from HiWay2LLM!", "provider": "openai", "voice": "nova"}' | jq -r .audio_file_id
# Then download the audio, replacing FILE_ID with the audio_file_id printed above:
curl https://app.hiway2llm.com/v2/audio/file/FILE_ID \
-H "Authorization: Bearer hw_live_YOUR_KEY" --output speech.mp3
4. Embeddings
import httpx

resp = httpx.post(
    "https://app.hiway2llm.com/v2/embed",
    headers={"Authorization": "Bearer hw_live_YOUR_KEY"},
    json={
        "input": ["Hello world", "Second sentence"],
        "provider": "openai",
        "model": "text-embedding-3-small",
    },
)
resp.raise_for_status()  # fail early on 4xx/5xx, e.g. the 403 described above
data = resp.json()
print(data["data"][0]["embedding"][:5])  # first 5 dims
print("cached:", data["cached"])
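A common next step is to compare the returned vectors. The sketch below computes cosine similarity between the two embeddings using only the standard library. It assumes that `data["data"]` holds one entry per input string, in input order, each with an `embedding` list (the example above only shows index 0), and it uses short stand-in vectors in place of a real response.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Stand-in for resp.json(); real embedding vectors are much longer.
data = {
    "data": [
        {"embedding": [1.0, 0.0, 1.0]},
        {"embedding": [1.0, 1.0, 0.0]},
    ]
}

first = data["data"][0]["embedding"]
second = data["data"][1]["embedding"]  # assumed to match the second input
print("similarity:", round(cosine_similarity(first, second), 3))
```

Values close to 1 indicate that the two inputs are semantically similar; values near 0 indicate that they are unrelated.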