Heykuki News
Top
New
Best
Ask
Show
Jobs
Llama.cpp speculative sampling: 2x faster inference for large models
github.com/ggerganov
4 points
bobivl
3 years ago
1 comment