Heykuki News
Top
New
Best
Ask
Show
Jobs
1.
Show HN: KTransformers: 236B Model and 1M Context LLM Inference on Local Machines
(github.com/kvcache-ai)
20 points
sssummer
2 years ago
3 comments
2.
Show HN: KTransformers: 671B DeepSeek-R1 on a Single Machine, 286 tokens/s Prefill
(github.com/kvcache-ai)
14 points
sssummer
a year ago
discuss
3.
Mooncake: A KVCache-Centric Disaggregated Architecture for LLM Serving
(github.com/kvcache-ai)
13 points
zinccat
2 years ago
discuss
4.
Mooncake: A KVCache-centric Disaggregated Architecture for LLM Serving
(github.com/kvcache-ai)
8 points
sarkory
a year ago
discuss