Show HN: gpt-cache – An OSS proxy cache to save on GPT tokens and response times

Heykuki News

1 point

2 years ago

Cut token usage and speed up response times by up to 130x. gpt-cache is written in a blend of Python and Go and uses the FAISS library for semantic similarity detection, so cache hits are accurate as well as fast. Easy to run with Docker!
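To give a rough idea of what a semantic cache does, here is a minimal sketch in Python. It is not the gpt-cache implementation: it swaps FAISS for a brute-force cosine search and uses a toy bag-of-words embedding so that it runs with the standard library alone. The names `SemanticCache`, `embed`, `cached_completion`, and the `threshold` value are all hypothetical and chosen for illustration only.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Hypothetical cache that returns a stored response for similar prompts."""

    def __init__(self, threshold=0.8):
        # Assumed cutoff: prompts scoring below this count as a cache miss.
        self.threshold = threshold
        self.entries = []  # list of (embedding, response) pairs

    def get(self, prompt):
        """Return the response of the most similar stored prompt, or None."""
        query = embed(prompt)
        best_score, best_response = 0.0, None
        # Brute-force scan; a vector index such as FAISS would replace this loop.
        for vector, response in self.entries:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None

    def put(self, prompt, response):
        """Store a prompt's embedding alongside its response."""
        self.entries.append((embed(prompt), response))


def cached_completion(cache, prompt, call_model):
    """Serve from the cache when possible; otherwise call the model and store."""
    hit = cache.get(prompt)
    if hit is not None:
        return hit
    response = call_model(prompt)
    cache.put(prompt, response)
    return response
```

In this sketch, `call_model` is whatever function actually sends the prompt upstream, so a hit skips that call entirely. That skipped call is where the token and latency savings would come from.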

Still very much a work in progress... would appreciate feedback and/or contributors!
