Serving 70B-scale LLMs efficiently on low-resource edge devices [pdf] | Heykuki News