For those on Linux, I've been working on a Talon-inspired voice coding program called Osprey that uses the Google Cloud Speech-to-Text API: https://github.com/osprey-voice/osprey
It's still very much a work in progress, but it's already been working very well for me, and I'm actually using it to type out this response right now.
Why would you work with Google when there are much more accurate open source speech recognizers based on Kaldi? For that specific use case it is very easy to beat Google on accuracy.
Actually, I don't think I ended up testing Kaldi because it seemed difficult to set up, but I'll give it a try now that you mention it.
OK, if you want to start with Kaldi, it is probably easier to check out kaldi-active-grammar, mentioned above, or https://github.com/alphacep/vosk-api