Hey all! A friend and I have been building projects with open-source LLMs for a while now (originally for other project ideas) and found that quickly iterating with different fine-tuning datasets is super hard. Training a model, setting up some inference code to try out the model and then going back and forth took 90% of our time.
That’s why we built Haven, a service for quickly trying out different fine-tuning datasets and base models. Going from uploading a dataset to chatting with the resulting model now takes less than 5 minutes (with a reasonably sized dataset).
We fine-tune the models using low-rank adapters. This keeps the changes made to the model very small (only about 30 MB for a 7B-parameter LLM), and it also lets us host many fine-tuned models very efficiently by hot-swapping adapters on demand. This helped us reduce cold-start times to below one second and makes it possible for us to host a single trained model for a few dollars per month. [Research has shown](https://arxiv.org/pdf/2305.14314.pdf) that low-rank fine-tuning performance stays almost on par with full fine-tuning.
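For anyone curious why the adapters end up so small and so cheap to swap, here is a rough sketch in plain Python. It is not Haven's actual code. The hidden size (4096) and layer count (32) match Llama-2-7B, but the rank (16), the choice of adapting two projection matrices per layer, and fp32 storage are assumptions picked for illustration, and the function names are hypothetical.

```python
def lora_param_count(hidden_size, num_layers, rank, matrices_per_layer):
    """Count trainable parameters added by low-rank adapters.

    Each adapted (hidden x hidden) weight gets two small factors:
    A with shape (rank x hidden) and B with shape (hidden x rank).
    """
    per_matrix = 2 * hidden_size * rank
    return num_layers * matrices_per_layer * per_matrix


def adapter_size_mb(param_count, bytes_per_param):
    """Size on disk in megabytes (1 MB = 10**6 bytes)."""
    return param_count * bytes_per_param / 1e6


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_forward(base_weight, adapter, x):
    """Compute y = W x + B (A x).

    The base weight W is shared and never modified, so switching to a
    different fine-tune only means passing a different (A, B) pair.
    """
    a, b = adapter
    base_out = matvec(base_weight, x)
    delta = matvec(b, matvec(a, x))
    return [p + q for p, q in zip(base_out, delta)]


# Assumed setup: rank 16, two adapted matrices per layer, fp32 weights.
params = lora_param_count(hidden_size=4096, num_layers=32, rank=16, matrices_per_layer=2)
print(params)                                               # 8388608
print(round(adapter_size_mb(params, bytes_per_param=4), 1)) # 33.6

# Toy hot swap: one shared base weight, two adapters selected by name.
base = [[1.0, 0.0], [0.0, 1.0]]
adapters = {
    "model_a": ([[1.0, 0.0]], [[0.0], [1.0]]),  # rank-1 update
    "model_b": ([[0.0, 0.0]], [[0.0], [0.0]]),  # zero update, same as base
}
print(lora_forward(base, adapters["model_a"], [3.0, 4.0]))  # [3.0, 7.0]
print(lora_forward(base, adapters["model_b"], [3.0, 4.0]))  # [3.0, 4.0]
```

Under these assumptions the adapter is roughly 8.4M parameters, a bit over 0.1% of a 7B model, which is in the same ballpark as the 30 MB figure above. Since only the small (A, B) pair differs between fine-tunes, many models can share one copy of the base weights in memory.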
Pricing starts at $0.004 per 1k training tokens, and you get $5 in free credits after signing up. You can export all models to Hugging Face.
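To get a feel for what that means for a given dataset, here is a small back-of-the-envelope calculator. It assumes the quoted $0.004 per 1k tokens rate applies to your model and that you are billed on the total number of training tokens. Both are assumptions, and the function names are made up for this example.

```python
PRICE_PER_1K_TOKENS = 0.004  # USD, the rate quoted above
FREE_CREDITS = 5.00          # USD, signup credits


def training_cost(num_tokens, price_per_1k=PRICE_PER_1K_TOKENS):
    """Estimated cost in USD for training on num_tokens tokens."""
    return num_tokens / 1000 * price_per_1k


def tokens_covered(credits, price_per_1k=PRICE_PER_1K_TOKENS):
    """How many training tokens a given credit balance pays for."""
    return round(credits / price_per_1k * 1000)


print(f"${training_cost(1_000_000):.2f}")  # $4.00 for a 1M-token dataset
print(tokens_covered(FREE_CREDITS))        # 1250000
```

So at that rate the free credits would cover about 1.25M training tokens.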
Right now we support Llama-2 and Zephyr (which is itself a fine-tune of Mistral) as base models. We’re gonna add some more soon. We hope you find this useful and we would love your feedback!