I’ve gotten confused between the different Whispers. How is this different from the OpenAI API endpoint?

It runs locally using whisper.cpp [1], a Whisper implementation optimized to run on CPU, especially on Apple Silicon.

Whisper itself is open source, and so is that implementation. The OpenAI endpoint is merely a convenience for those who don't want to host a Whisper server themselves, deal with batching, rent GPUs, etc. If you're building a commercial service on Whisper, the API might be worth it for the convenience, but if you're running it personally and have a good enough machine (an M1 MacBook Air will do), running it locally is usually better.
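
For a rough idea of what "running it locally" looks like, here is a minimal sketch that shells out to a whisper.cpp build from Python. The binary path, model file, and audio file are assumptions about a typical checkout (the executable name differs between whisper.cpp versions), and the helper names are made up for illustration. It uses the `-m` (model) and `-f` (input file) options.

```python
import subprocess


def build_whisper_cmd(binary, model, audio):
    """Build the argument list for one whisper.cpp run.

    -m selects the ggml model file, -f the audio file to transcribe.
    All three paths are supplied by the caller (assumed, not discovered).
    """
    return [str(binary), "-m", str(model), "-f", str(audio)]


def transcribe(binary, model, audio):
    """Run whisper.cpp on one file and return whatever it prints to stdout."""
    cmd = build_whisper_cmd(binary, model, audio)
    # check=True raises CalledProcessError if the binary exits non-zero.
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


# Example call, assuming a built binary and a downloaded model (hypothetical paths):
# print(transcribe("./main", "models/ggml-base.en.bin", "samples/jfk.wav"))
```

No server, batching, or GPU rental is involved: it is one process on your own machine reading a model file and an audio file from disk.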

[1] https://github.com/ggerganov/whisper.cpp