On-device acceleration of large diffusion models via GPU-aware optimizations | Heykuki News