RouteLLM
Stable routing, fallback and quota control for multiple LLM providers
RouteLLM is a routing layer that distributes requests across multiple LLM providers based on availability, cost, and latency.
It automatically switches providers when one is down or slow, balances load, tracks usage, and enforces quota limits across providers.
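The fallback behavior described above can be sketched as a simple priority loop: try each provider in order and move on when one fails. This is a minimal illustration of the general pattern, not RouteLLM's actual internals; the provider objects and their `call` method are hypothetical.

```javascript
// Minimal sketch of provider fallback: attempt providers in priority
// order, collecting errors, and only fail when every provider has failed.
// The provider shape ({ name, call }) is a hypothetical stand-in.
async function routeWithFallback(providers, request) {
  const errors = []
  for (const provider of providers) {
    try {
      // First provider to respond successfully wins.
      return await provider.call(request)
    } catch (err) {
      // Record the failure and fall through to the next provider.
      errors.push({ provider: provider.name, error: err.message })
    }
  }
  throw new Error(`All providers failed: ${JSON.stringify(errors)}`)
}
```

A real router layers cost, latency, and quota signals on top of this ordering, but the failover core is the same loop.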
Problem
Managing multiple LLM providers requires handling different APIs, rate limits, and failover logic. This complexity grows with each provider added.
When to use
Use RouteLLM when you need to call multiple LLM providers (OpenAI, Anthropic, etc.) and want automatic failover, cost optimization, quota control, and unified API access. It eliminates the need to manage provider-specific code and handles rate limits automatically.
Input / Output
Input: Chat completion or text completion request
Output: Response from the selected provider, with automatic fallback on failure
Example
cURL example
curl -X POST https://routellm.kikuai.dev/api/chat \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"messages": [{"role": "user", "content": "Hello"}],
"model": "gpt-4o-mini"
}'
JavaScript example
const response = await fetch('https://routellm.kikuai.dev/api/chat', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer YOUR_API_KEY'
},
body: JSON.stringify({
messages: [{ role: 'user', content: 'Hello' }],
model: 'gpt-4o-mini'
})
})
const data = await response.json()
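When a request cannot be served, checking the HTTP status before parsing gives clearer errors than reading a failed body as JSON. The helper below is a hedged sketch; it assumes (without confirmation from this page) that the API returns a readable error body on non-2xx responses.

```javascript
// Hypothetical helper: surface request failures with status and body
// instead of silently parsing an error response as a completion.
async function parseChatResponse(response) {
  if (!response.ok) {
    // Read the error body as text so the message survives even if
    // the server did not return JSON (an assumption, not documented).
    const detail = await response.text()
    throw new Error(`RouteLLM request failed (${response.status}): ${detail}`)
  }
  return response.json()
}
```

Used with the fetch call above: `const data = await parseChatResponse(response)`.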