- Easy-to-use `BaseAgent`/`BaseBench` interfaces
- Run tasks across agents & models with one command
- Add your own benchmarks and agents
- Collect results, compare runs, and iterate faster
- Docker-ready benchmark deployment
- Public Benchmark Hub: [benchflow.ai](https://benchflow.ai)

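To make the agent/benchmark split in the list above concrete, here is a minimal sketch of the general pattern. It does not import or use the BenchFlow package: `AgentInterface`, `BenchInterface`, `run`, and the toy agents and benchmark are hypothetical stand-ins, and the real `BaseAgent`/`BaseBench` signatures may differ, so check the docs for the actual API.

```python
from abc import ABC, abstractmethod


class AgentInterface(ABC):
    """Hypothetical stand-in for an agent base class (not BenchFlow's real API)."""

    @abstractmethod
    def act(self, prompt: str) -> str:
        """Return the agent's answer for one task prompt."""


class BenchInterface(ABC):
    """Hypothetical stand-in for a benchmark base class (not BenchFlow's real API)."""

    @abstractmethod
    def tasks(self) -> dict:
        """Return a mapping of task id -> prompt."""

    @abstractmethod
    def score(self, task_id: str, answer: str) -> float:
        """Score one answer between 0.0 and 1.0."""


class UppercaseAgent(AgentInterface):
    # Toy agent: answers by upper-casing the prompt.
    def act(self, prompt: str) -> str:
        return prompt.upper()


class EchoAgent(AgentInterface):
    # Toy agent: returns the prompt unchanged.
    def act(self, prompt: str) -> str:
        return prompt


class UppercaseBench(BenchInterface):
    # Toy benchmark: the expected answer is the upper-cased prompt.
    def tasks(self) -> dict:
        return {"t1": "hello", "t2": "world"}

    def score(self, task_id: str, answer: str) -> float:
        return 1.0 if answer == self.tasks()[task_id].upper() else 0.0


def run(agent: AgentInterface, bench: BenchInterface) -> dict:
    """Run every task in the benchmark through the agent and collect scores."""
    return {
        task_id: bench.score(task_id, agent.act(prompt))
        for task_id, prompt in bench.tasks().items()
    }


# Compare two agents on the same benchmark.
results = {
    "uppercase": run(UppercaseAgent(), UppercaseBench()),
    "echo": run(EchoAgent(), UppercaseBench()),
}
```

Because the runner depends only on the two abstract interfaces, any agent can be paired with any benchmark, which is the property that makes side-by-side comparison of runs possible.
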
We support OpenAI, Hugging Face, local models, and more. We'd love feedback or contributions!

- GitHub: https://github.com/benchflow-ai/benchflow
- Docs: https://docs.benchflow.ai/introduction
- Discord: https://discord.gg/mZ9Rc8q8W3