Why do LLM outputs get worse even when metrics stay stable? [pdf] | Heykuki News