Heykuki News

Top | New | Best | Ask | Show | Jobs
1.
Bringing Up DeepSeek-V4-Flash on AMD MI300X (fergusfinn.com)
117 points
kkm
a day ago
22 comments
2.
Also-RANS: Asymmetric Numeral Systems for Entropy Coding (fergusfinn.com)
25 points
mezark
a month ago
discuss
3.
Redundant Information in LLM Weights (fergusfinn.com)
5 points
mezark
a month ago
discuss
4.
70x faster cold(ish) starts for SGLang (fergusfinn.com)
4 points
mezark
a month ago
discuss
5.
Tans: Precomputing RANS (fergusfinn.com)
3 points
mezark
a month ago
discuss
6.
Pushing memory bound CUDA kernels past the speed of light with data compression (fergusfinn.com)
2 points
somnial
6 days ago
discuss
7.
Speculative KV coding: ~4× losslessly compressed KV cache using a small model (fergusfinn.com)
2 points
somnial
22 days ago
discuss
8.
How fast can an LLM go? (fergusfinn.com)
2 points
kkm
7 months ago
discuss
9.
How fast can an LLM go? (fergusfinn.com)
2 points
gmays
7 months ago
discuss
10.
How fast can an LLM go? (fergusfinn.com)
2 points
somnial
7 months ago
discuss
11.
70x faster cold(ish) starts for SGLang (fergusfinn.com)
1 point
kkm
9 hours ago
discuss
12.
In search of wasted bits: how much information do LLM weights carry? (fergusfinn.com)
1 point
gmays
25 days ago
discuss
13.
70x faster cold(ish) starts for SGLang (fergusfinn.com)
1 point
somnial
a month ago
discuss
14.
Parallel Primitives for Multi-Agent Workflows (fergusfinn.com)
1 point
mezark
5 months ago
discuss
15.
LLM powered data structures: A lock-free binary search tree (fergusfinn.com)
1 point
somnial
5 months ago
discuss
16.
Parallel Primitives for Multi-Agent Workflows (fergusfinn.com)
1 point
somnial
5 months ago
discuss
17.
Scheduling in LLM Inference (fergusfinn.com)
1 point
somnial
7 months ago
discuss