Heykuki News
DuoAttention: Slashes memory and latency for LLMs without sacrificing performance
github.com/mit-han-lab
2 points
dsr12
2 years ago
No comments yet