I wanted a voice assistant that feels realtime but runs completely offline.
This prototype uses MLX + FastAPI on Apple Silicon to hit sub-second latency for speech-to-speech conversations.
Repo: https://github.com/shubhdotai/offline-voice-ai
It’s fast, minimal, and hackable. I’d love feedback on latency tricks, model swaps, or use cases you’d like to see next.