Show HN: BenchFlow – Open-Source Benchmark Hub and Eval Infra for AI Devs

docs.benchflow.ai

a year ago

We just open-sourced [*BenchFlow*](https://github.com/benchflow-ai/benchflow), an evaluation infrastructure and benchmark hub for AI developers and researchers. Whether you're building benchmarks or running them, BenchFlow helps you do it quickly and reproducibly.

- Easy-to-use `BaseAgent`/`BaseBench` interfaces
- Run tasks across agents & models with one command
- Add your own benchmarks and agents
- Collect results, compare runs, and iterate faster
- Docker-ready benchmark deployment
- Public Benchmark Hub: [benchflow.ai](https://benchflow.ai)
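To give a rough feel for the agent/benchmark split in the list above, here is a minimal, self-contained sketch. It does not use BenchFlow or reproduce its API: `SketchAgent`, `SketchBench`, `run_benchmark`, and the toy uppercase task are all made up for illustration, so check the docs for the real `BaseAgent`/`BaseBench` signatures.

```python
from abc import ABC, abstractmethod
from typing import Dict, List

Task = Dict[str, str]


class SketchAgent(ABC):
    """Hypothetical agent interface (NOT BenchFlow's actual BaseAgent)."""

    @abstractmethod
    def act(self, task: Task) -> str:
        """Return the agent's answer for a single task."""


class SketchBench(ABC):
    """Hypothetical benchmark interface (NOT BenchFlow's actual BaseBench)."""

    @abstractmethod
    def tasks(self) -> List[Task]:
        """Return the tasks that make up this benchmark."""

    @abstractmethod
    def score(self, task: Task, answer: str) -> float:
        """Score one answer in the range [0, 1]."""


class UppercaseBench(SketchBench):
    """Toy benchmark: the agent must uppercase the prompt."""

    def tasks(self) -> List[Task]:
        return [
            {"id": "t1", "prompt": "hello", "expected": "HELLO"},
            {"id": "t2", "prompt": "abc", "expected": "ABC"},
        ]

    def score(self, task: Task, answer: str) -> float:
        return 1.0 if answer == task["expected"] else 0.0


class UpperAgent(SketchAgent):
    def act(self, task: Task) -> str:
        return task["prompt"].upper()


class EchoAgent(SketchAgent):
    def act(self, task: Task) -> str:
        return task["prompt"]


def run_benchmark(bench: SketchBench, agents: Dict[str, SketchAgent]) -> Dict[str, float]:
    """Run every agent on every task and return the mean score per agent."""
    results: Dict[str, float] = {}
    for name, agent in agents.items():
        scores = [bench.score(task, agent.act(task)) for task in bench.tasks()]
        results[name] = sum(scores) / len(scores)
    return results


results = run_benchmark(UppercaseBench(), {"upper": UpperAgent(), "echo": EchoAgent()})
print(results)
```

The point of the split is that a benchmark only knows about tasks and scoring, and an agent only knows how to answer a task, so either side can be swapped without touching the other.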

We support OpenAI, Hugging Face, local models, and more. We'd love feedback or contributions!

- GitHub: https://github.com/benchflow-ai/benchflow
- Docs: https://docs.benchflow.ai/introduction
- Discord: https://discord.gg/mZ9Rc8q8W3