Now that it runs on our laptops, a lot of the initial shock, mystique, and novelty has already worn off, just like with DALL-E before it.

What the leaks show is that the community can run circles around OpenAI and the others when it comes to optimizing these things and taking them in interesting directions on a shoestring budget.

As an example of what one sesh of tinkering can produce:

https://github.com/geohot/tinygrad/blob/master/examples/llam...

https://www.youtube.com/watch?v=nctqc8FBJ2U&t=4954s

The cat is out of the bag: you don't need a Beowulf cluster of A100s and unlimited Azure credits. You do need a training budget that is out of reach for hobbyists, but the moat is not insurmountable, and I don't think VCs will be scared off anymore, since a lot of startup ideas are viable and a lot of outsiders can demonstrably make a real go of it.

Time for BasedAI. Game on.

Yeah, the community always seems to figure out how to do things more effectively.

My girlfriend asked me if I could transcribe some audio files for her with my "programming stuff". I immediately thought of Whisper from OpenAI.

I first used the official CLI tool. With the largest model it took a full 8 hours to transcribe a 30-minute file. I noticed it was running on the CPU and tried switching it to the GPU instead, with no luck. Running it under WSL was probably not helping.

Then I found this gem: https://github.com/Const-me/Whisper A C++ Windows implementation of Whisper. I opened the program and fed it the largest model and the file. The transcript was done in 4 minutes instead of 8 hours... Downside? The program has a GUI, lol.

Of course, I could probably get the CLI tool to run on the GPU with some tinkering and by installing the Nvidia packages Whisper needs. But frankly, I have so little experience with that kind of stuff that installing the Windows implementation was a much easier choice.
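For anyone else stuck on the CPU path: a sketch of what that tinkering would roughly look like, assuming you want the official openai-whisper CLI with a CUDA-enabled PyTorch (the filename audio.mp3 is just a placeholder; the exact PyTorch install command depends on your CUDA version, so check pytorch.org for the right one):

```shell
# Install a CUDA-enabled PyTorch build first, otherwise pip may give
# you a CPU-only wheel and Whisper silently falls back to the CPU.
# (Index URL below assumes CUDA 11.8 -- adjust for your setup.)
pip install torch --index-url https://download.pytorch.org/whl/cu118

# Then install the official Whisper CLI.
pip install openai-whisper

# The CLI exposes a --device flag; force it onto the GPU.
whisper audio.mp3 --model large --device cuda
```

On WSL this additionally requires the Windows Nvidia driver with WSL CUDA support, which is likely where the extra friction came from.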