It’s an interesting situation because on one level, the technology is at least somewhat misrepresented (best-case results rising to the top of everyone’s feeds), but on the other, we haven’t even begun to see what an integrated ChatGPT will bring. If I could replace Siri with ChatGPT, I would do so immediately, as it’s objectively better. The author makes some points about the eye-watering costs; however, if this is something that people truly want (and I think we do want it), the market will find a way to deliver it at scale in a cost-effective way.

Do we have an idea of what the costs to run ChatGPT might be? Are we costing OpenAI $5 every time we ask it to generate Python in the style of Donald Trump?

Training it on specific data sets and contexts is probably the expensive part.

Running a gigantic model across multiple GPUs isn't cheap either.

I’ve asked ChatGPT about hardware. Here’s the response:

> As for your specific computer, with 64 GB of RAM and a high-performance GPU like the GeForce 1080 Ti, it should have sufficient resources to run a language model like me for many common tasks.

Based on the models open-sourced by OpenAI, they are using PyTorch and CUDA, which means their stack requires nVidia GPUs. I think the main reason for their high costs is a single sentence in the EULA of the GeForce drivers: https://www.datacenterdynamics.com/en/news/nvidia-updates-ge...

It’s technically possible to port their GPGPU code from CUDA somewhere else. Here’s a vendor-agnostic DirectCompute re-implementation of their Whisper model: https://github.com/Const-me/Whisper

On servers, DirectCompute is not great because Windows Server licenses are expensive. Still, I did that port alone, and it only took a couple of weeks.

OpenAI probably has the resources to port their inference to vendor-agnostic Vulkan Compute, running on Linux servers equipped with reasonably priced AMD or Intel GPUs. For instance, an Intel Arc A770 16GB costs only $350 but delivers performance similar to an nVidia A30, which costs $16,000. The Intel card consumes more electricity, but not by much: 225W versus 165W. That’s something like a 40x difference in cost efficiency for that chat.
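A quick sanity check of those figures, as a sketch: assuming the two cards really do deliver similar inference throughput (that's the claim above, not something verified here), the purchase-price ratio alone works out to roughly 46x, and netting out the A770's higher power draw would pull the effective number down toward the ~40x ballpark.

```python
# Back-of-envelope comparison using the prices and power figures quoted above.
# Assumption (from the comment, not measured): both cards deliver similar
# inference throughput, so cost efficiency reduces to cost per card.
price_a770 = 350      # USD, Intel Arc A770 16GB
price_a30 = 16_000    # USD, nVidia A30
watts_a770 = 225      # W, Intel
watts_a30 = 165       # W, nVidia

ratio = price_a30 / price_a770
print(f"Purchase-price ratio: {ratio:.1f}x")   # before accounting for electricity
print(f"Extra power draw: {watts_a770 - watts_a30} W per card")
```

The electricity difference (60 W per card) only matters over a long service life, which is presumably why the comment rounds the ~46x hardware ratio down to "like 40x".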