This is one of my fun side projects from last weekend. It searches for information about a given image, using the latest Gemini Vision model with the LLaVA model as a fallback.
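The primary-plus-fallback idea can be sketched roughly like this. It is not the actual project code, just a minimal Python illustration of the pattern: `describe_with_fallback`, `gemini_describe`, and `llava_describe` are hypothetical names, and the two model functions are stubs standing in for real API calls.

```python
from typing import Callable, Optional, Sequence

# A describer takes raw image bytes and returns a text description.
Describer = Callable[[bytes], str]


def describe_with_fallback(image: bytes, describers: Sequence[Describer]) -> str:
    """Try each describer in order and return the first non-empty answer."""
    last_error: Optional[Exception] = None
    for describe in describers:
        try:
            answer = describe(image)
        except Exception as exc:  # any failure moves on to the next model
            last_error = exc
            continue
        if answer.strip():
            return answer
    raise RuntimeError("all describers failed") from last_error


# Hypothetical stand-ins: real versions would call Gemini Vision and LLaVA.
def gemini_describe(image: bytes) -> str:
    raise TimeoutError("primary model unavailable")  # simulate a failure


def llava_describe(image: bytes) -> str:
    return f"fallback description of a {len(image)}-byte image"


print(describe_with_fallback(b"\x89PNG", [gemini_describe, llava_describe]))
```

Here the stubbed primary model fails, so the answer comes from the second function in the list. Passing the models as an ordered list keeps the fallback order easy to change.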
Please note that it may generate incorrect answers sometimes.
I created this project just to play around with the Gemini model.
You can find the source code here: https://github.com/n4ze3m/vexasearch
Also, generating a response can take a while, depending on the image size.
Feedback is very welcome. Thanks!