Heykuki News
Top
New
Best
Ask
Show
Jobs
1.
Bringing Up DeepSeek-V4-Flash on AMD MI300X
(fergusfinn.com)
117 points
kkm
a day ago
22 comments
2.
Also-RANS: Asymmetric Numeral Systems for Entropy Coding
(fergusfinn.com)
25 points
mezark
a month ago
discuss
3.
Redundant Information in LLM Weights
(fergusfinn.com)
5 points
mezark
a month ago
discuss
4.
70x faster cold(ish) starts for SGLang
(fergusfinn.com)
4 points
mezark
a month ago
discuss
5.
Tans: Precomputing RANS
(fergusfinn.com)
3 points
mezark
a month ago
discuss
6.
Pushing memory bound CUDA kernels past the speed of light with data compression
(fergusfinn.com)
2 points
somnial
6 days ago
discuss
7.
Speculative KV coding: ~4× losslessly compressed KV cache using a small model
(fergusfinn.com)
2 points
somnial
22 days ago
discuss
8.
How fast can an LLM go?
(fergusfinn.com)
2 points
kkm
7 months ago
discuss
9.
How fast can an LLM go?
(fergusfinn.com)
2 points
gmays
7 months ago
discuss
10.
How fast can an LLM go?
(fergusfinn.com)
2 points
somnial
7 months ago
discuss
11.
70x faster cold(ish) starts for SGLang
(fergusfinn.com)
1 point
kkm
9 hours ago
discuss
12.
In search of wasted bits: how much information do LLM weights carry?
(fergusfinn.com)
1 point
gmays
25 days ago
discuss
13.
70x faster cold(ish) starts for SGLang
(fergusfinn.com)
1 point
somnial
a month ago
discuss
14.
Parallel Primitives for Multi-Agent Workflows
(fergusfinn.com)
1 point
mezark
5 months ago
discuss
15.
LLM powered data structures: A lock-free binary search tree
(fergusfinn.com)
1 point
somnial
5 months ago
discuss
16.
Parallel Primitives for Multi-Agent Workflows
(fergusfinn.com)
1 point
somnial
5 months ago
discuss
17.
Scheduling in LLM Inference
(fergusfinn.com)
1 point
somnial
7 months ago
discuss