Just as people go hunting for steel made before the 1940s to avoid the background radiation from nuclear tests and detonations, today's Twitter corpus could be a treasure house of non-LLM-generated text data.
https://en.wikipedia.org/wiki/Low-background_steel