Recently I've been working on making LLM evaluations faster by using Bayesian optimization to select a sensible subset of the evaluation examples.
Bayesian optimization is a good fit because it balances exploration and exploitation when each query to the black box (in this case, the LLM) is expensive.
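To make the idea concrete, here is a minimal sketch of this kind of subset selection. It is an illustration under assumptions, not the actual implementation: it assumes each evaluation example has an embedding vector, fits a small Gaussian process (RBF kernel) from embeddings to scores, and picks the next example to run by maximum posterior variance. That acquisition is pure exploration, a simplification of the exploration / exploitation trade-off. All names (`rbf_kernel`, `gp_posterior`, `select_and_estimate`, `score_fn`) are hypothetical, and the data is synthetic.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-0.5 * np.maximum(sq_dists, 0.0) / length_scale**2)


def gp_posterior(x_seen, y_seen, x_all, noise=1e-3):
    """GP posterior mean and variance at x_all, given scored examples."""
    prior_mean = y_seen.mean()  # constant prior mean (assumption)
    k_ss = rbf_kernel(x_seen, x_seen) + noise * np.eye(len(x_seen))
    k_as = rbf_kernel(x_all, x_seen)
    # Solve once for both the mean weights and the variance terms.
    rhs = np.column_stack([y_seen - prior_mean, k_as.T])
    solved = np.linalg.solve(k_ss, rhs)
    mean = prior_mean + k_as @ solved[:, 0]
    var = 1.0 - np.sum(k_as * solved[:, 1:].T, axis=1)
    return mean, np.maximum(var, 0.0)


def select_and_estimate(embeddings, score_fn, budget, seed=0):
    """Score only `budget` examples, then estimate the full-set mean score.

    score_fn(i) stands in for the expensive step: run the LLM on example i
    and return its score.
    """
    rng = np.random.default_rng(seed)
    chosen = [int(rng.integers(len(embeddings)))]  # random starting example
    scores = [score_fn(chosen[0])]
    while len(chosen) < budget:
        _, var = gp_posterior(embeddings[chosen], np.array(scores), embeddings)
        var[chosen] = -np.inf  # never re-run an example we already scored
        nxt = int(np.argmax(var))  # most uncertain example goes next
        chosen.append(nxt)
        scores.append(score_fn(nxt))
    mean, _ = gp_posterior(embeddings[chosen], np.array(scores), embeddings)
    return chosen, float(mean.mean())


# Synthetic stand-in data: 200 "examples" with 2-D embeddings and a smooth
# score function. A real run would call the LLM inside score_fn.
data_rng = np.random.default_rng(1)
embeddings = data_rng.uniform(0.0, 3.0, size=(200, 2))
true_scores = 0.5 + 0.5 * np.sin(embeddings[:, 0]) * np.cos(embeddings[:, 1])

chosen, estimate = select_and_estimate(
    embeddings, lambda i: float(true_scores[i]), budget=30
)
print(f"ran {len(chosen)} of {len(embeddings)} examples")
print(f"estimated mean score: {estimate:.3f}, true: {true_scores.mean():.3f}")
```

Here only 30 of the 200 examples are ever scored; the rest are predicted by the surrogate. How well this works in practice depends on whether nearby embeddings really do have similar scores, which the synthetic data assumes by construction.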
I would love to hear your thoughts and suggestions on this!