> What workloads are so massively parallel that they can use 64 cores of x86 but can’t use the thousands of CUDA cores on a Quadro card?

Building software, for one. C compilers and Python interpreters don't run on a GPU. Lots of stuff doesn't run on a GPU. In practice, the only things that run on a GPU are the tiny handful of known subproblems that the industry has collectively decided are "GPU problems".

Like it or not, general-purpose scalar software is, has always been, and will always remain the standard mechanism by which computing hardware is applied to new problems. Everything else is an optimization around the edges.

Interesting. Is there a compiler that runs on a GPU?

A really interesting exploration is Co-dfns [1]. This is indeed a compiler that runs on the GPU, but it's also extremely out of the mainstream.

IMHO this is an area ripe for more exploration. If I were working on it, I might start with linking rather than compilation, because the basic link task is closer to what GPUs are good at (advanced stuff such as LTO is a different story, though).
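
To make the "linking looks like a GPU problem" intuition a bit more concrete, here is a rough sketch in plain Python of a deliberately simplified linker core. Everything in it is an assumption for illustration: the names `layout` and `apply_relocations` are hypothetical, relocations are treated as 4-byte absolute little-endian addresses, and real object formats are far messier. The point is only the shape of the work: section layout is an exclusive prefix sum, and each relocation patches its own independent slot, which are the scan and map patterns GPUs handle well.

```python
from itertools import accumulate

def layout(section_sizes):
    """Base address of each section: an exclusive prefix sum over the sizes."""
    return list(accumulate(section_sizes, initial=0))[:-1]

def apply_relocations(image, relocs, symbol_addr):
    """Patch each relocation site with the resolved symbol address.

    Assumption: every relocation is a 4-byte absolute little-endian address
    and no two relocations overlap, so the iterations are independent and
    could each run on a separate thread.
    """
    for offset, symbol in relocs:
        image[offset:offset + 4] = symbol_addr[symbol].to_bytes(4, "little")
    return image

# Toy input: three sections and three symbols (all hypothetical).
sizes = [16, 8, 32]
bases = layout(sizes)

# Each symbol is defined as (section index, offset within that section).
symbols = {"main": (0, 0), "helper": (1, 4), "table": (2, 0)}
symbol_addr = {name: bases[sec] + off for name, (sec, off) in symbols.items()}

# Each relocation is (byte offset in the output image, symbol it refers to).
relocs = [(4, "helper"), (28, "table")]
image = apply_relocations(bytearray(sum(sizes)), relocs, symbol_addr)
```

Nothing here runs on a GPU, of course. It only shows that the basic job has no data dependencies beyond the scan, unlike parsing or optimization passes.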

[1]: https://github.com/Co-dfns/Co-dfns