It's much harder to get excited about one of the best ASR implementations I've encountered if using it means waiting 3 minutes (or longer) for results on 10 seconds of audio.
Additionally, sending audio off to some random cloud is kind of creepy to me, so I'm using whisper-asr-webservice[0] to host my own instance on hardware with an RTX 3090. On this hardware Whisper ASR is anywhere from 5-20x faster than realtime, depending on various conditions.
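
To put those numbers side by side, here's a quick back-of-the-envelope sketch. The helper name `transcribe_seconds` is made up for illustration, and it assumes processing time scales linearly with audio length, which real workloads only roughly follow:

```shell
# Hypothetical helper: wall-clock seconds needed to transcribe
# $1 seconds of audio when the model runs at $2 times realtime.
transcribe_seconds() {
  awk -v audio="$1" -v speed="$2" 'BEGIN { printf "%.1f\n", audio / speed }'
}

transcribe_seconds 10 5     # slow end of the 5-20x range
transcribe_seconds 10 20    # fast end of the 5-20x range

# Speed relative to realtime when 10 s of audio takes 3 minutes (180 s):
awk 'BEGIN { printf "%.3f\n", 10 / 180 }'
```

So the same 10 second clip should come back in roughly half a second to two seconds on the 3090, versus 180 seconds (about 18x slower than realtime) in the slow case.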
So anyway, last night I decided to throw this horrendous shell project together[1] to more easily make use of my self-hosted instance.
A casual glance will show I'm not exactly an expert developer by trade... I'm hoping HN might be interested in working on this with me.
With that, I've disabled auth for my self-hosted Whisper instance for HN to play around with until it possibly gets beaten on so badly I have to re-enable auth:
BASE_URL=https://whisper.tovera.io ./asr.sh
(read the docs in the repo)
As for me not storing your data (I don't), I guess you'll just have to trust me?
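
If you'd rather poke at the instance without the script, something like the following should work. This is a rough sketch and not necessarily what asr.sh does: it assumes the service exposes whisper-asr-webservice's `POST /asr` endpoint with a multipart `audio_file` field and `task`/`output` query parameters, as I understand the upstream docs[0]. The function names `asr_url` and `transcribe` are made up for illustration.

```shell
# Build the request URL from a base URL (a trailing slash is stripped).
# Assumption: the /asr endpoint and its task/output parameters match
# the upstream whisper-asr-webservice defaults.
asr_url() {
  printf '%s/asr?task=transcribe&output=json\n' "${1%/}"
}

# Upload an audio file and print the service's response.
# $1 = base URL, $2 = path to a local audio file.
# Assumption: the upload field is named "audio_file".
transcribe() {
  curl -sS -X POST -F "audio_file=@$2" "$(asr_url "$1")"
}
```

Usage would be along the lines of `transcribe https://whisper.tovera.io recording.wav`, where `recording.wav` is a placeholder for your own file.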
[0] - https://github.com/ahmetoner/whisper-asr-webservice
[1] - https://github.com/kristiankielhofner/whisper-asr-webservice-client