Heykuki News
Compiling LLMs into a MegaKernel: A path to low-latency inference
zhihaojia.medium.com
314 points
matt_d
a year ago
76 comments