Show HN: Efficient LLM Architectures for 32GB RAM (Ternary and Sparse Inference) | Heykuki News