Hm, that's a hefty price, 200K per card.

The Mac Studio with M2 Ultra has 192GB of unified memory, potentially ~188GB of it usable by the GPU, for around $5K.

Wouldn't Apple be able to compete with that if they scaled it up?

Go ahead and ask datacenters why they all use overpriced Nvidia chips for AI training instead of shoving cheaper Mac Studios in there. Their answer might blow your mind.

Spoiler alert: the CUDA ecosystem, Linux support, and, most importantly for data centers, Mellanox high-speed interconnects with virtually infinite scalability and great virtualization support, so they can rent out slices of their hardware to customers in exchange for money.

A 15 year head start in a category they essentially defined plus an entire generation of executives, developers, and users doesn’t hurt either.

People complain about the “Nvidia tax”, but the hardware is superior (untouchable at datacenter scale), and the “tax” turns into a dividend the moment you count the hours (weeks, really) your very expensive team would otherwise spend fighting issues on other platforms. Anything based on CUDA is often a Docker pull away, with absolutely first-class support in every major ML framework.

Nvidia gets a lot of shade on HN and elsewhere but if you’ve spent any time in this field you completely understand why they have 80-90% market share of GPGPU. With Willow[0] and the Willow Inference Server[1] I'm often asked by users with no experience in the space why we don't target AMD, Coral TPUs (don't even get me started), etc. It's almost impossible to understand "why CUDA" unless you've fought these battles and spent time with "alternatives".

I’ve been active in the space for roughly half a decade, and when I look back to my early days I’m amazed at what a beginner like me was able to do because of CUDA. I still routinely am. What you’re able to actually accomplish with a $1000 Nvidia card and a few lines with transformers and/or a Docker container is incredible (see the sketch below).
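To give a flavor of what I mean by "a few lines with transformers", here's a minimal sketch: speech-to-text on a consumer Nvidia GPU. The specific model name and file path are illustrative, and it assumes you've installed the transformers and torch packages on a machine with a CUDA-capable card.

    # Minimal sketch: speech recognition with Hugging Face transformers on a CUDA GPU.
    # Assumes `pip install transformers torch` and an Nvidia card with recent drivers.
    from transformers import pipeline

    # device=0 places the model on the first CUDA GPU.
    asr = pipeline(
        "automatic-speech-recognition",
        model="openai/whisper-base",  # illustrative model choice
        device=0,
    )

    # Transcribe a local audio file (placeholder path).
    result = asr("sample.wav")
    print(result["text"])

That's the whole thing - no driver archaeology, no custom kernels, no praying that your framework build matches your runtime.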

That said, I am really looking forward to Apple stepping it up here - I’ve given up on AMD ever getting it together on GPGPU, and Intel (with Arc) is even further behind. The space needs some real competition somewhere.

[0] - https://github.com/toverainc/willow

[1] - https://github.com/toverainc/willow-inference-server