Managed to make Mistral NeMo 12B (https://mistral.ai/news/mistral-nemo/) fit in a free Google Colab with a Tesla T4 GPU (16GB) for 4-bit QLoRA finetuning! Also shaved 60% off VRAM usage and made it 2x faster! It should work in under 12GB of VRAM too!
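
For a rough sense of why 4-bit matters here, a back-of-the-envelope sketch of the memory taken by the weights alone. It assumes a round 12 billion parameters (read off the "12B" name, not an exact count), and `weight_memory_gib` is just a hypothetical helper for illustration. It ignores quantization constants, LoRA adapters, optimizer state and activations, so actual usage during finetuning will be higher.

```python
def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3


# Assumption: "12B" taken as a round 12 billion parameters.
NUM_PARAMS = 12e9

fp16_gib = weight_memory_gib(NUM_PARAMS, 16)  # weights stored in 16-bit
int4_gib = weight_memory_gib(NUM_PARAMS, 4)   # weights stored in 4-bit

print(f"16-bit weights: {fp16_gib:.1f} GiB")
print(f"4-bit weights:  {int4_gib:.1f} GiB")
```

By this estimate the 16-bit weights alone would not fit on a 16GB T4, while the 4-bit weights leave room for everything else finetuning needs.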