Show HN: Beyond-NanoGPT

2 points

a year ago

Hi all, my first time posting here :)

I spent the last few weeks writing a repo that aims to help people go from nanoGPT-level LLM basics to something closer to the modern deep learning research frontier.

I'm open-sourcing it: it contains thousands of lines of annotated, from-scratch PyTorch implementing everything from speculative decoding to vision/diffusion transformers to linear and sparse attention, and much more.

I'd love to hear about fundamental research papers from the last few years that I missed and should definitely implement, or feedback of any other kind. I've seen so many cool projects on HN over the years :)