One thing led to another and I ended up writing Busbar, an LLM gateway written in Rust (I have a thing for Rust lately). You point your existing OpenAI/Anthropic/Gemini SDK at it, change the model to a pool name, and that name now load-balances across vendors. Your client code doesn't change and never learns it even happened.
My central idea is "protocols, not providers". I implement six protocols - Anthropic, OpenAI, Gemini, Bedrock, Responses, Cohere - losslessly. You define a provider in three lines of YAML, mainly specifying the protocol that provider speaks.
Your client speaks a protocol in to Busbar and Busbar speaks a protocol out to the provider.
- Each protocol translates request and response, streamed and buffered, in both directions. Same-protocol calls pass through untouched; cross-protocol calls reconcile the awkwardness (a field one dialect requires and another makes optional).
- A circuit breaker that knows whose fault a failure is. It stops routing to a backend that's genuinely failing, but it won't penalize a model for a request that was simply too big (it retries on a larger-context model instead), and it won't blame a backend when the caller sent a bad request. A healthy model never gets pulled from rotation for something that wasn't its fault. These are all issues I have personally hit, and I wanted to fix them once in Busbar rather than ten times across ten applications.
- Hand-rolled AWS implementations so I am not reliant on the AWS SDKs: SigV4 signing and a from-scratch AWS eventstream frame decoder for Bedrock.
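
To make the fault-attribution idea from the circuit breaker bullet concrete, here is a minimal sketch in Rust. It is not Busbar's actual code: the `Fault`, `Action`, and `Breaker` names, the consecutive-failure threshold, and the lack of any half-open or recovery state are all assumptions made for illustration. The only point it tries to show is that a failure counts against a backend only when the backend caused it.

```rust
/// Who is to blame for a failed upstream call (hypothetical classification).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Fault {
    /// The backend itself misbehaved (server error, timeout, dropped connection).
    Backend,
    /// The caller sent a bad request; any backend would have rejected it.
    Caller,
    /// The request did not fit this model's context window.
    TooLarge,
}

/// What the gateway does next (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Action {
    RetryElsewhere,
    RetryOnLargerContext,
    ReturnToCaller,
}

/// A deliberately simple breaker: it opens after `threshold` consecutive
/// backend-attributed failures. Recovery (half-open probing) is omitted.
struct Breaker {
    consecutive_failures: u32,
    threshold: u32,
    open: bool,
}

impl Breaker {
    fn new(threshold: u32) -> Self {
        Breaker { consecutive_failures: 0, threshold, open: false }
    }

    fn is_open(&self) -> bool {
        self.open
    }

    fn record_success(&mut self) {
        self.consecutive_failures = 0;
    }

    /// Only `Fault::Backend` counts against the backend's health.
    fn record_failure(&mut self, fault: Fault) -> Action {
        match fault {
            Fault::Backend => {
                self.consecutive_failures += 1;
                if self.consecutive_failures >= self.threshold {
                    self.open = true;
                }
                Action::RetryElsewhere
            }
            // Not the model's fault: leave its health untouched and
            // try a model with a larger context window instead.
            Fault::TooLarge => Action::RetryOnLargerContext,
            // Not the backend's fault: surface the error to the caller.
            Fault::Caller => Action::ReturnToCaller,
        }
    }
}

fn main() {
    let mut breaker = Breaker::new(2);

    // Bad requests and oversized requests never trip the breaker.
    for _ in 0..10 {
        breaker.record_failure(Fault::Caller);
        breaker.record_failure(Fault::TooLarge);
    }
    assert!(!breaker.is_open());

    // Two genuine backend failures in a row do.
    breaker.record_failure(Fault::Backend);
    breaker.record_failure(Fault::Backend);
    assert!(breaker.is_open());
    println!("breaker open: {}", breaker.is_open());
}
```

In this sketch the classification step (turning an upstream response into a `Fault`) is left out on purpose, since that mapping differs per protocol and is where most of the real work would sit.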
It's at 1.0.0-rc.2: feature-complete and API-stable, with release-candidate validation underway before 1.0.0. I have been using it on my own projects and it's solving my problems nicely.
Solo project, AGPL-3.0. The AGPL choice is open to discussion; I know it matters for a request-path component.
Feedback is very welcome, particularly on where the translation might still be lossy in edge cases. Contributions and conversation are encouraged!