There are some large AWS customers that probably burn that much in idle time on unused machines every week (probably every day).

Can the training be parallelized in a manner similar to SETI-at-home?

Yes, Hivemind has been used to train a 6B-parameter GPT model this way.

General model training: https://github.com/learning-at-home/hivemind (rough usage sketch below)

Stable Diffusion-specific training: https://github.com/chavinlo/distributed-diffusion

Inference-only Stable Diffusion: https://stablehorde.net/
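
For anyone curious what the Hivemind flow looks like, here's a minimal sketch adapted from its quickstart: peers find each other over a DHT, each trains locally, and the wrapped optimizer averages parameters once the group has collectively processed a target batch size. The run_id, toy model, data, and hyperparameters below are placeholders, and exact Optimizer arguments can differ between hivemind versions.

```python
# Minimal collaborative-training sketch with hivemind (placeholder values throughout).
import torch
import torch.nn as nn
import hivemind

# Start (or join) a DHT. Another peer would pass initial_peers=[...] using the
# multiaddrs printed here instead of starting a fresh DHT.
dht = hivemind.DHT(start=True)
print("initial_peers =", [str(addr) for addr in dht.get_visible_maddrs()])

model = nn.Linear(784, 10)                         # toy stand-in for a real network
base_opt = torch.optim.SGD(model.parameters(), lr=0.01)

# Wrap the local optimizer; peers step independently and average parameters in
# the background once the collective target batch size is reached.
opt = hivemind.Optimizer(
    dht=dht,
    run_id="demo_run",          # peers sharing this run_id train together
    batch_size_per_step=32,     # samples contributed per local opt.step()
    target_batch_size=10_000,   # collective samples before averaging
    optimizer=base_opt,
    use_local_updates=True,     # apply local updates, average params asynchronously
    matchmaking_time=3.0,
    averaging_timeout=10.0,
    verbose=True,
)

# The training loop itself looks like ordinary PyTorch (random data for demo).
for _ in range(100):
    x = torch.randn(32, 784)
    y = torch.randint(0, 10, (32,))
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    opt.step()
    opt.zero_grad()
```

Each volunteer just runs the same script pointed at the same run_id, which is what makes the SETI@home-style setup work: slow or flaky peers only delay how fast the collective batch fills up, they don't block the others.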