Heykuki News
Lossless LLM compression for efficient GPU inference via dynamic-length float
arxiv.org
411 points
CharlesW
a year ago
117 comments