Running the app: Everything you need to know to set WhisperWriter up is in the linked GitHub repo's README file. The app can be run locally using the open-source Whisper release or online through the OpenAI API. I personally use the latter because my computer is not very powerful, but the former is free and doesn't require an OpenAI account.
The inspiration: I have a disability that makes it painful to type on a keyboard. As a software engineer, this has a considerable impact on my daily life. I've tried using the built-in Windows dictation tools, but they are very buggy and extremely frustrating to use. I was actually getting pretty close to quitting tech as a whole because of all the difficulties I would run into when trying to code. But I recently started playing around with AI coding assistants – and they've been a complete game-changer for me.
Last month, I made a small web app for fun, entirely coded by GPT (<https://github.com/savbell/playlist-gpt>). Since AI wrote most of the code, it required minimal keyboard use on my end – but I still needed to type up the prompts, which was causing me significant pain. I'd heard about OpenAI recently releasing a speech recognition model called Whisper, but nobody had written a dictation app using it yet. So I decided to continue to test out GPT's programming abilities and used it to write my own speech-to-text app that now vastly outperforms the built-in Windows ones! (At least, for my specific use cases.) I now use WhisperWriter every day for almost anything that I need to type, such as prompting ChatGPT – and even writing out this post!
The process: Using ChatGPT, it only took about two hours to build a fully functioning command-line app that did exactly what I wanted it to do. I spent a few more hours adding a pop-up status window and some more configuration options, but really the core app was completed in less than an afternoon. I even used the prototype to dictate my prompts to ChatGPT to add all the additional features!
I'm considering writing an article with more details about the exact prompts I used and the lessons I learned along the way, à la this post from two days ago: <https://news.ycombinator.com/item?id=35839536>. Since it would be a lot of typing/dictating, I'm on the fence about it, but please let me know in the comments if it would be interesting! I've used ChatGPT to build six small apps in total, so I have a lot of insights to share.
Why I'm sharing: Three main reasons:
1. WhisperWriter is small, hasn't been well tested, and is tailored to my own personal use case. However, it's been completely life-changing for me. I wanted to share it because if there's even a chance that someone else who is struggling with similar issues might see it and use it too, I want to take that chance!
2. I wanted to share a concrete example of how AI can be used to improve people's quality of life. There is so much talk about the bigger-picture impacts of AI, but I wanted to showcase how technologies like AI coding assistants have opened up endless possibilities for people to change their own lives in small ways, such as by building apps tailored to their own specific day-to-day problems.
3. I think Whisper is super cool and I want to show off some of its capabilities in hopes that it encourages others to play around with it, too. I'd love to see someone expand upon my work or speech-to-text tech in general, since any improvements in that area can increase accessibility for not just me, but many other people who may be facing similar disabilities or challenges.
Thanks for reading, everyone! I'm looking forward to hearing your feedback and discussing this more.