Llama.cpp speculative sampling: 2x faster inference for large models | Heykuki News