Heykuki News
FlexGen: Running large language models on a single GPU
github.com/FMInference
192 points
behnamoh
3 years ago
43 comments