I submitted an earlier version of this a few months ago (as llama2.f90). At that time it took a lot of steps to run and was just a toy; now it's easy to run and is a competitive option for LLM inference. See the motivation section for discussion and the `Performance` issue for an ongoing conversation about performance.
Show HN: Llm.f90 fast, hackable transformer implementation in Fortran