Microsoft releases code to enable training BERT at large scale on commodity GPUs | Heykuki News