When I wanted to build a RAG app in Go at the end of last year, I was surprised by the lack of options for a simple, embeddable vector DB.
- Pinecone is proprietary and hosted only, Qdrant and Milvus aren't embeddable, Chroma is only embeddable in Python, and Weaviate only in Python or TS/JS.
- SQLite with the sqlite-vss extension and DuckDB can only be used from Go with CGO. Same for the C++ libraries Faiss, Annoy and USearch.
- Some of the LangChain-style Go libraries have a rudimentary implementation, but they either lack persistence, aren't performance-optimized, or bring a ton of dependencies, since as orchestration libraries they try to support all kinds of LLMs, third-party vector stores, etc.
This led me to create chromem-go, which initially took inspiration from the simple Chroma interface but now deviates more from it:
- Embeddable in Go. No need for running a separate server. (A.k.a. in-process)
- Zero dependencies on third-party libraries
- Multi-threaded processing for adding and querying documents. After some performance optimizations in v0.5.0, a query on 100,000 documents runs in 40 ms on a 1st gen Framework Laptop (see benchmarks)
- The DB can optionally create embeddings and offers convenience implementations for common providers (OpenAI, Cohere, Mistral, Jina, mixedbread, but also local Ollama and LocalAI)
- Similarity search is exhaustive via cosine similarity. As mentioned above, that's quick enough for 100k documents and probably more. HNSW or IVFFlat start to make sense when scaling to millions of documents, at which point you'd likely go with a client-server solution anyway.
- Optional persistence, either per document, or a DB export/import with optional gzip compression and AES-GCM encryption
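To make the "exhaustive via cosine similarity" and "multi-threaded" points a bit more concrete, here's a minimal stdlib-only sketch of the general technique. This is not chromem-go's actual code or API: the `doc`, `result`, `normalize`, `dot` and `query` names are made up for illustration, and I'm assuming all embeddings have the same dimension and are normalized when stored, so cosine similarity reduces to a dot product.

```go
package main

import (
	"fmt"
	"math"
	"runtime"
	"sort"
	"sync"
)

// doc is a hypothetical stored document: an ID plus its (normalized) embedding.
type doc struct {
	ID        string
	Embedding []float32
}

// result pairs a document ID with its similarity to the query.
type result struct {
	ID         string
	Similarity float32
}

// normalize returns a unit-length copy of v (all zeros if v has zero length).
func normalize(v []float32) []float32 {
	var sum float64
	for _, x := range v {
		sum += float64(x) * float64(x)
	}
	norm := math.Sqrt(sum)
	out := make([]float32, len(v))
	if norm == 0 {
		return out
	}
	for i, x := range v {
		out[i] = float32(float64(x) / norm)
	}
	return out
}

// dot assumes a and b have the same length. For unit-length vectors
// the dot product equals the cosine similarity.
func dot(a, b []float32) float32 {
	var s float32
	for i := range a {
		s += a[i] * b[i]
	}
	return s
}

// query compares q against every document (no index, exhaustive search),
// splitting the documents into chunks processed by separate goroutines,
// and returns the n most similar documents.
func query(docs []doc, q []float32, n int) []result {
	q = normalize(q)
	results := make([]result, len(docs))
	workers := runtime.NumCPU()
	chunk := (len(docs) + workers - 1) / workers

	var wg sync.WaitGroup
	for start := 0; start < len(docs); start += chunk {
		end := start + chunk
		if end > len(docs) {
			end = len(docs)
		}
		wg.Add(1)
		go func(start, end int) {
			defer wg.Done()
			// Each goroutine writes to its own index range, so no locking is needed.
			for i := start; i < end; i++ {
				results[i] = result{ID: docs[i].ID, Similarity: dot(q, docs[i].Embedding)}
			}
		}(start, end)
	}
	wg.Wait()

	// Highest similarity first.
	sort.Slice(results, func(i, j int) bool {
		return results[i].Similarity > results[j].Similarity
	})
	if n > len(results) {
		n = len(results)
	}
	return results[:n]
}

func main() {
	docs := []doc{
		{ID: "a", Embedding: normalize([]float32{1, 0})},
		{ID: "b", Embedding: normalize([]float32{0, 1})},
		{ID: "c", Embedding: normalize([]float32{1, 1})},
	}
	for _, r := range query(docs, []float32{2, 0.1}, 2) {
		fmt.Println(r.ID, r.Similarity)
	}
}
```

With the toy 2D vectors in `main`, document "a" ranks first and "c" second. Sorting all results is the simplest thing to do here; for large collections a bounded heap for the top n would avoid the full sort.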
The repo contains examples for semantic search over arXiv abstracts, as well as a simple RAG app for Wikipedia entries using the models "nomic-embed-text" and "gemma:2b" locally in Ollama [1].
I'm using the project at work for demoing a simple locally running semantic search over our company Jira and hopefully more soon.
I think and hope it can be useful for some of you as well. One example is the LLocalSearch "Perplexity clone" that was posted two days ago [2], which has a Go backend and probably doesn't need a Chroma instance running (as embeddings are created on the fly for search results and not in advance on a million documents).
Shoutout to HN user @eliben whose blog post [3] motivated me to create the project.
[1] https://github.com/philippgille/chromem-go/tree/v0.5.0/examp...
[2] https://news.ycombinator.com/item?id=39923404
[3] https://eli.thegreenplace.net/2023/retrieval-augmented-gener...