People complain that the magic of programming has been lost because all we do is stitch together APIs. Those people are cynical, and they're oversimplifying the work involved. This is amazing.
I think part of the magic is still lost, as we used to be very focused on free and open source software. These days it's all built on paid API calls.
The magic is still there! In this case, the model for OpenAI's Whisper, which is arguably doing the bulk of the work here, is open source (under the MIT licence) and freely available for download at https://github.com/openai/whisper. You can run it wherever you want, though something with a GPU will let you do 5x realtime (or better!) transcription.
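For anyone who wants to try it locally, here is a minimal sketch. It assumes the `openai-whisper` Python package from that repo is installed (`pip install -U openai-whisper`) and that `ffmpeg` is on your PATH. The helper name `transcribe_file` and the filename `audio.mp3` are just placeholders for illustration, and `"base"` is one of the smaller model sizes, picked so it runs on modest hardware.

```python
def transcribe_file(path, model_name="base"):
    """Transcribe an audio file with a locally downloaded Whisper model."""
    # Imported inside the function so this file still loads where the
    # package is not installed (assumption: pip install -U openai-whisper).
    import whisper

    # Downloads the model weights on first use, then loads them from cache.
    model = whisper.load_model(model_name)

    # transcribe() returns a dict; the "text" key holds the full transcript.
    result = model.transcribe(path)
    return result["text"]


# Usage (placeholder filename, swap in your own recording):
#   print(transcribe_file("audio.mp3"))
```

No API key or network call to a paid service is involved once the weights are downloaded; a larger model name should give better accuracy at the cost of speed.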