What does HackerNews think of metaseq?

Repo for external large-scale work

Language: Python

Several groups already have. Facebook's OPT-175B is available to basically anyone with a .edu address (models up to 66B are freely available), and BLOOM-176B is 100% open:

https://github.com/facebookresearch/metaseq

https://huggingface.co/bigscience/bloom

I really enjoyed this interview [1] with Emad Mostaque, who I gather is probably the key funding and organization player in that stuff. It remains to be seen exactly how "open" it winds up being over time, but they're talking a compelling game and the EleutherAI people seem to be pretty heavily involved, which they probably wouldn't be if they weren't serious.

Yandex has also put up a ~100B language model [2]. My old colleagues at Meta have also started nibbling around the edges of opening some of this stuff up [3]. The Meta folks still aren't just handing out the big ones, but they're definitely moving the ball forward. In particular their release of the training logs is a really positive development IMHO as it opens the curtains a bit on the reality of training these things: it's a difficult, error/failure-prone process, there's a lot of trial-and-error, restarting from checkpoints, etc.

Anything that puts downward pressure on the magical thinking is A-OK in my book. The reality of this stuff is exciting/impressive enough: there's no need to embellish or exaggerate.

[1] https://www.youtube.com/watch?v=YQ2QtKcK2dA

[2] https://github.com/yandex/YaLM-100B

[3] https://github.com/facebookresearch/metaseq

"We are releasing all of our models between 125M and 30B parameters, and will provide full research access to OPT-175B upon request. Access will be granted to academic researchers; those affiliated with organizations in government, civil society, and academia; and those in industry research laboratories."

GPT-3 Davinci ("the" GPT-3) is 175B.

The repository will be open "First thing in AM" (https://twitter.com/stephenroller/status/1521302841276645376):

https://github.com/facebookresearch/metaseq/