- Drop-in replacement for the OpenAI SDK (change one line: base_url) - Each query gets classified (regex fast path + lightweight LLM classifier) and matched against ~390 models - Three tiers (Frugal/Balanced/Premium) to control the quality-cost tradeoff - Automatic failover if a provider goes down - Cost metadata in every response
The routing logic is benchmark-driven (LMArena, Artificial Analysis), not ML-based — simpler to debug and reason about. The regex fast path handles ~60% of requests in under 5ms with zero API calls.
Example: a customer support bot doing 10K conversations/month went from ~$250/mo (everything pinned to Opus 4.6) to ~$40/mo with routing. Most conversations were FAQ-level questions that a smaller model handled fine.
Stack: Next.js, Vercel, Neon PostgreSQL, OpenRouter upstream. Hosting cost: ~$20/month.
We ran a head-to-head benchmark: same 15 prompts through Opus, GPT-4o, Gemini Pro, and the router. Simple tasks cost 66% less with routing. Complex tasks produced 2x more detailed output because the router picked specialized models per task type. Full data: https://dev.to/robinbanner/we-benchmarked-4-ai-api-strategie...
Architecture writeup: https://dev.to/robinbanner/inside-komilions-architecture-how... — there's a free tier if you want to try it.