Every model learned by gradient descent is approximately a kernel machine (2020) | Heykuki News