Dramatically cut down on token usage and speed up response times by up to 130x. A blend of Python and Go leverages the FAISS library for semantic similarity detection, ensuring cache hits are as accurate as they are fast. Easy to run with docker!Still very much a work in progress... would appreciate feedback and/or contributors!