Show HN: I built a 2nd-order PyTorch optimizer for LLMs that runs on 16GB GPUs | Heykuki News