I like to think that I understand LLMs pretty well, which is why I was so underwhelmed by most of the mainstream "AI" news. But this threw me for a loop. As a predictor, how can it model base64? It surely can't just be "pretending" like it does with all the other stuff. The precision is what feels most wrong to me: it encodes and decodes long random strings perfectly. So why does it fail at simple arithmetic?
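
For reference, here is a small sketch of one property of base64 that may be relevant: every 3-byte group maps to 4 output characters on its own, with no dependence on the rest of the input. Decimal addition is different, since a carry can ripple across every digit (think 999999 + 1). Whether that is actually why a model handles one better than the other is a guess on my part. The helper names `encode_by_chunks` and `changed_groups` are made up for illustration; only Python's standard `base64` and `os` modules are used.

```python
import base64
import os


def encode_by_chunks(data: bytes) -> str:
    """Encode each 3-byte group separately, then concatenate the results."""
    return "".join(
        base64.b64encode(data[i:i + 3]).decode("ascii")
        for i in range(0, len(data), 3)
    )


def changed_groups(a: bytes, b: bytes):
    """Return indices of the 4-character output groups that differ.

    Assumes a and b have the same length.
    """
    ea, eb = base64.b64encode(a), base64.b64encode(b)
    return [i // 4 for i in range(0, len(ea), 4) if ea[i:i + 4] != eb[i:i + 4]]


# Chunk-by-chunk encoding matches encoding the whole string at once.
data = os.urandom(30)
assert encode_by_chunks(data) == base64.b64encode(data).decode("ascii")

# Flipping one input byte only touches the output group that contains it.
tweaked = bytearray(data)
tweaked[4] ^= 0xFF  # byte 4 lives in input group 1 (bytes 3..5)
print(changed_groups(data, bytes(tweaked)))
```

So a long random string is not harder than a short one in any structural sense: it is the same fixed 3-bytes-to-4-characters lookup repeated many times.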