MV-GPT: A New Generative Pre-Training Framework for Multimodal Video Captioning | Heykuki News