I've been experimenting with the Recognize Anything models from https://github.com/xinyu1205/recognize-anything and ended up writing a simple HTTP API around them using FastAPI, packaged as an offline-capable Docker image on Docker Hub (https://hub.docker.com/r/mnahkies/recognize-anything-api). This made it convenient to run inference across all my photos and to build a web app for searching and browsing them by tag/content (I plan to tidy this up and release it too).
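
For a rough idea of how searching by tag can work once every photo has been tagged, here is a minimal sketch using only the Python standard library. It is not the code from the web app: the `{photo path: [tags]}` shape, the sample paths and tags, and the `build_tag_index` / `search` helpers are all assumptions made up for illustration.

```python
from collections import defaultdict


def build_tag_index(photo_tags):
    """Map each lowercased tag to the set of photos that carry it."""
    index = defaultdict(set)
    for photo, tags in photo_tags.items():
        for tag in tags:
            index[tag.strip().lower()].add(photo)
    return index


def search(index, *tags):
    """Return the sorted photos that carry every one of the given tags."""
    if not tags:
        return []
    # .get() avoids creating empty entries for unknown tags.
    matches = [index.get(tag.strip().lower(), set()) for tag in tags]
    return sorted(set.intersection(*matches))


# Hypothetical tagging results, shaped as {photo path: [tags]}.
photo_tags = {
    "2021/beach.jpg": ["sea", "sand", "dog"],
    "2022/park.jpg": ["dog", "grass", "tree"],
    "2023/harbour.jpg": ["sea", "boat"],
}

index = build_tag_index(photo_tags)
print(search(index, "dog"))         # ['2021/beach.jpg', '2022/park.jpg']
print(search(index, "sea", "dog"))  # ['2021/beach.jpg']
```

An inverted index like this keeps lookups cheap even for a large photo library, since each query only touches the sets for the tags that were asked for.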