Heykuki News
Top
New
Best
Ask
Show
Jobs
1.
A new CUDA kernel for quantized LLMs achieves up to 2.6x latency improvements
(github.com/HanGuo97)
2 points
radichoml
2 years ago
1 comment
2.
Show HN: Tokenusage – Rust CLI that tracks Claude Code/Codex tokens 214x faster
(github.com/hanbu97)
1 point
hanbu97
3 months ago
3 comments