Heykuki News

1.
Show HN: KTransformers – 236B Model and 1M Context LLM Inference on Local Machines (github.com/kvcache-ai)
20 points
sssummer
2 years ago
3 comments
2.
Show HN: KTransformers: 671B DeepSeek-R1 on a Single Machine - 286 tokens/s Prefill (github.com/kvcache-ai)
14 points
sssummer
a year ago
discuss
3.
Mooncake: A KVCache-Centric Disaggregated Architecture for LLM Serving (github.com/kvcache-ai)
13 points
zinccat
2 years ago
discuss
4.
Mooncake: A KVCache-centric Disaggregated Architecture for LLM Serving (github.com/kvcache-ai)
8 points
sarkory
a year ago
discuss