Heykuki News
vLLM-mlx – 65 tok/s LLM inference on Mac with tool calling and prompt caching
github.com/raullenchai
3 points
raullen
3 months ago
1 comment