It preprocesses your tweets, generates embeddings with one of OpenAI's embedding models (small or large), stores the data and embeddings in a LanceDB vector database, and provides a web interface to search and view the results.
You can also pre-filter by time, likes, or retweets, or restrict to media-only or link-only tweets, before running the semantic search.
Pre-filtering with SQL operations not only narrows the results but also shrinks the vector search space, which speeds up the search.
It also supports semantic search over images with the help of open-clip embeddings.
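
To illustrate why pre-filtering speeds things up, here is a minimal sketch in plain Python. It is not this project's code and does not use LanceDB's API; the `Tweet` class, the `search` function, its `min_likes` and `since_year` parameters, and the toy 3-dimensional vectors are all hypothetical stand-ins. The idea is only that the metadata filter runs first, so similarity is computed over fewer rows.

```python
import math
from dataclasses import dataclass


@dataclass
class Tweet:
    # Hypothetical record layout, for illustration only.
    text: str
    likes: int
    year: int
    embedding: list  # toy 3-d vector; real embeddings have far more dimensions


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(tweets, query_vec, min_likes=0, since_year=None, k=3):
    # Step 1: metadata pre-filter (the role the SQL filter plays in the real app).
    candidates = [
        t for t in tweets
        if t.likes >= min_likes and (since_year is None or t.year >= since_year)
    ]
    # Step 2: rank only the surviving rows by similarity to the query vector.
    ranked = sorted(candidates, key=lambda t: cosine(t.embedding, query_vec), reverse=True)
    return ranked[:k]


TWEETS = [
    Tweet("vector databases are neat", 120, 2023, [0.9, 0.1, 0.0]),
    Tweet("my cat knocked over the plant", 15, 2021, [0.0, 0.2, 0.9]),
    Tweet("embedding search beats keywords", 40, 2022, [0.8, 0.3, 0.1]),
    Tweet("old thread on indexing", 300, 2019, [0.7, 0.2, 0.2]),
]

if __name__ == "__main__":
    for t in search(TWEETS, [1.0, 0.0, 0.0], min_likes=30, since_year=2021):
        print(t.text)
```

With `min_likes=30` and `since_year=2021`, only two of the four toy tweets reach the similarity step. On a real archive the same effect means the vector search touches a fraction of the stored embeddings.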
Background: Twitter's search does not work well, especially for older tweets. Sometimes I don't know the exact keywords to search for, so I built an embedding search over my tweets.
I also post a lot of memes and images, so I added an option to create an image search engine: you can search by text or by image. The results lead you back to the original tweet.