I wrote an article on choosing cloud GPUs to maximize FLOPs per dollar.
Using LLM pre-training as an example, it shows how the cost-optimal GPU changes with computational intensity (∝ model size × batch size).
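To make the idea concrete, here is a minimal sketch of how the cheapest GPU per FLOP can flip as computational intensity grows. It assumes a simple roofline model (achievable throughput is the lesser of peak compute and memory bandwidth times intensity), which may not be the exact method used in the article. The `Gpu` class, the function names, and every spec and price in `CATALOG` are hypothetical placeholders, not real hardware or pricing data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gpu:
    name: str
    peak_tflops: float    # peak compute throughput, TFLOP/s
    mem_bw_tbps: float    # memory bandwidth, TB/s
    usd_per_hour: float   # on-demand rental price


def achievable_tflops(gpu: Gpu, intensity: float) -> float:
    """Roofline estimate (an assumption of this sketch).

    intensity is in FLOPs per byte moved, so TB/s * FLOPs/byte = TFLOP/s.
    """
    return min(gpu.peak_tflops, gpu.mem_bw_tbps * intensity)


def tflops_per_dollar(gpu: Gpu, intensity: float) -> float:
    # TFLOP/s * 3600 s/h / (USD/h) = TFLOP per USD
    return achievable_tflops(gpu, intensity) * 3600 / gpu.usd_per_hour


def best_gpu(gpus: list[Gpu], intensity: float) -> Gpu:
    """Return the GPU with the most achievable FLOPs per dollar."""
    return max(gpus, key=lambda g: tflops_per_dollar(g, intensity))


# Hypothetical catalog: made-up numbers chosen only to show the crossover.
CATALOG = [
    Gpu("budget", peak_tflops=60.0, mem_bw_tbps=0.6, usd_per_hour=0.5),
    Gpu("flagship", peak_tflops=300.0, mem_bw_tbps=2.0, usd_per_hour=2.0),
]

if __name__ == "__main__":
    for intensity in (50, 300):
        winner = best_gpu(CATALOG, intensity)
        print(f"intensity={intensity} FLOPs/byte: {winner.name}")
```

With these placeholder numbers, the cheaper card wins while the workload is memory-bound (small model or small batch), and the more expensive card wins once the workload becomes compute-bound.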
Show HN: The Poor Man's Guide to Cloud GPU Selection