Asking because I’m lazy: if I need to transcribe audio in real time, is there a state-of-the-art model I can plug into?
https://github.com/ggerganov/whisper.cpp

https://github.com/Const-me/Whisper

I had fun with both of these. They will both do realtime transcription. But you will have to download the pretrained model files (the weights, not the training data sets)…