What does HackerNews think of OpenMoE?

A family of open-sourced Mixture-of-Experts (MoE) Large Language Models

Language: Python

https://github.com/XueFuzhao/OpenMoE

Check out this open source Mixture of Experts research. Could help a lot with performance of open source models.
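For readers unfamiliar with the idea: an MoE layer swaps a single dense feed-forward block for many "expert" blocks plus a learned router that sends each token to only a few of them, so most parameters are inactive for any given token. Below is a minimal PyTorch sketch of top-2 routing to illustrate that; the layer sizes, expert count, and routing details are illustrative assumptions, not OpenMoE's actual implementation (the Switch Transformer mentioned later in this thread routes each token to a single expert).

```python
# Minimal sketch of a top-2 Mixture-of-Experts feed-forward layer in PyTorch.
# Illustrative only: sizes and routing details are assumptions, not OpenMoE's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, num_experts)  # learned gating network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):  # x: (num_tokens, d_model)
        gate_logits = self.router(x)
        weights, expert_ids = gate_logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)  # renormalise over the chosen experts
        out = torch.zeros_like(x)
        # Each token only passes through top_k experts, so per-token compute
        # stays roughly constant even as the total number of experts grows.
        for i, expert in enumerate(self.experts):
            token_idx, slot = (expert_ids == i).nonzero(as_tuple=True)
            if token_idx.numel() == 0:
                continue
            out[token_idx] += weights[token_idx, slot].unsqueeze(-1) * expert(x[token_idx])
        return out
```

Keeping per-token compute fixed while total parameters scale with the number of experts is the main reason MoE looks attractive for cost-effective open models.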

I think the weird thing about this is that it's completely true right now, but in X months it may be totally outdated advice.

For example, efforts like OpenMoE (https://github.com/XueFuzhao/OpenMoE) or similar will probably eventually lead to very competitive performance and cost-effectiveness for open source models, at least in terms of competing with GPT-3.5 for many applications.

Also see https://laion.ai/

I also believe that within, say, 1-3 years there will be a different type of training approach that does not require such large datasets or manual human feedback.

Google have released the models and code for the Switch Transformer from Fedus et al. (2021) under the Apache 2.0 licence. [0]

There's also OpenMoE, an open-source effort to train a mixture-of-experts model. Currently they've released a model with 8 billion parameters. [1]

[0] https://github.com/google-research/t5x/blob/main/docs/models...

[1] https://github.com/XueFuzhao/OpenMoE

It makes a lot of sense! In fact, there are a number of open source projects working on just such a model right now. Here's a great example: https://github.com/XueFuzhao/OpenMoE/