VibeVoice-ASR: speech-to-text model designed to handle 60-minute long-form audio | Heykuki News