Heykuki News
Top
New
Best
Ask
Show
Jobs
Toggle theme
Login
Top
New
Best
Ask
Show
Jobs
MaMMUT: A simple vision-encoder text-decoder architecture for multimodal tasks
ai.googleblog.com
72 points
mfiguiere
3 years ago
33 comments
Loading...
MaMMUT: A simple vision-encoder text-decoder architecture for multimodal tasks | Heykuki News