LLMs predict my coffee: Why not benchmark with physical experiments? | Heykuki News