_If_ 3.5 is a MoE model, doesn't that give a lot of hope to open-source movements? Once a good open-source MoE model comes out, maybe even as some kind of variation of the decoder-only models already available (I don't know whether MoE models have to be trained from scratch), that implies a lot more can be done with a lot less. A rough sketch of what a routed MoE layer looks like is below.
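
I don't know the details of 3.5 either, and I believe there's published work on "upcycling" a dense checkpoint into a MoE rather than training from scratch, but conceptually a MoE feed-forward layer is just several independent FFN "experts" plus a small learned router that picks the top-k of them per token. A minimal toy sketch (generic, not any particular model's real code):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MoEFeedForward(nn.Module):
        """Toy top-k routed MoE feed-forward layer."""
        def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
            super().__init__()
            self.top_k = top_k
            self.router = nn.Linear(d_model, n_experts)  # scores each expert per token
            self.experts = nn.ModuleList(
                nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
                for _ in range(n_experts)
            )

        def forward(self, x):                        # x: (tokens, d_model)
            scores = self.router(x)                  # (tokens, n_experts)
            weights, chosen = scores.topk(self.top_k, dim=-1)
            weights = F.softmax(weights, dim=-1)
            out = torch.zeros_like(x)
            for slot in range(self.top_k):
                for e, expert in enumerate(self.experts):
                    mask = chosen[:, slot] == e      # tokens routed to expert e in this slot
                    if mask.any():
                        out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
            return out

    x = torch.randn(16, 512)                         # 16 token embeddings
    print(MoEFeedForward()(x).shape)                 # torch.Size([16, 512])

Only top_k experts run per token, which is where the "more capacity for the same compute" appeal comes from.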

It would be bad for single-consumer-GPU inference setups, though: only a few experts run per token, but every expert's weights still have to be resident, so the memory footprint tracks the total parameter count rather than the active one.

Could this work well with distributed solutions like Petals?

https://github.com/bigscience-workshop/petals

I don't understand how Petals can work, though. I thought LLMs were typically quite monolithic.
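
My understanding is that they aren't monolithic in the way that matters here: a decoder-only transformer is a stack of blocks applied strictly in sequence, so the stack can be cut into contiguous slices, each slice hosted on a different machine, with only the small hidden-state tensor crossing the network between them. A toy local sketch of that idea (nothing here is Petals' real API; the network hop is faked with a local call):

    import torch
    import torch.nn as nn

    d_model, n_layers = 256, 12
    blocks = nn.ModuleList(
        nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        for _ in range(n_layers)
    )

    # Pretend each contiguous slice of the stack lives on a different peer.
    hosts = [blocks[0:4], blocks[4:8], blocks[8:12]]

    def remote_forward(host_blocks, hidden):
        # In a real system this hop would be an RPC to the peer holding
        # these layers; here it's just a local call standing in for it.
        for block in host_blocks:
            hidden = block(hidden)
        return hidden

    hidden = torch.randn(1, 10, d_model)   # embeddings for a 10-token prompt
    for host_blocks in hosts:
        hidden = remote_forward(host_blocks, hidden)
    print(hidden.shape)                     # torch.Size([1, 10, 256])

So each peer only needs enough memory for its own slice of layers, at the cost of a network round trip per slice per generated token.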