Heykuki News
Show HN: NanoSLG – Hack Your Own Multi-GPU LLM Server (5x Faster, Educational)
github.com/Guney-olu
1 point
geniusyan
4 months ago
I built NanoSLG as a minimal, educational inference server for LLMs like Llama-3.1-8B. It supports Pipeline Parallelism (split layers across GPUs), Tensor Parallelism (shard weights), and Hybrid modes for scaling.
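To make the two splitting strategies concrete, here is a rough pure-Python sketch. It is not NanoSLG's code: the function names (`split_layers`, `shard_columns`, `tensor_parallel_matmul`) are hypothetical, plain lists stand in for GPU tensors, and the column-wise split is just one common way to shard a weight matrix.

```python
def split_layers(num_layers, num_stages):
    """Pipeline parallelism: give each stage (GPU) a contiguous range of layers."""
    base, extra = divmod(num_layers, num_stages)
    stages, start = [], 0
    for stage in range(num_stages):
        # Spread any remainder over the first `extra` stages.
        size = base + (1 if stage < extra else 0)
        stages.append(range(start, start + size))
        start += size
    return stages


def matmul(x, w):
    """Multiply a vector x (length k) by a k x n matrix w stored as a list of rows."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]


def shard_columns(w, num_shards):
    """Tensor parallelism: split the columns of w evenly, one shard per device."""
    n = len(w[0])
    if n % num_shards != 0:
        raise ValueError("column count must be divisible by num_shards")
    step = n // num_shards
    return [[row[s * step:(s + 1) * step] for row in w] for s in range(num_shards)]


def tensor_parallel_matmul(x, w, num_shards):
    """Each shard computes its slice of the output; concatenating the slices
    plays the role of the all-gather a real multi-GPU setup would perform."""
    parts = [matmul(x, shard) for shard in shard_columns(w, num_shards)]
    return [value for part in parts for value in part]


if __name__ == "__main__":
    # Llama-3.1-8B has 32 transformer layers; assume 4 GPUs for illustration.
    for gpu, layers in enumerate(split_layers(32, 4)):
        print(f"GPU {gpu}: layers {layers.start}..{layers.stop - 1}")

    x = [1, 2]
    w = [[1, 2, 3, 4], [5, 6, 7, 8]]
    # The sharded result matches the unsharded product.
    print(tensor_parallel_matmul(x, w, 2) == matmul(x, w))
```

A hybrid mode would combine the two ideas: first assign layer ranges to groups of GPUs, then shard each layer's weights within a group.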