Yandex has also put up a ~100B language model [2]. My old colleagues at Meta have also started nibbling around the edges of opening some of this stuff up [3]. The Meta folks still aren't just handing out the big ones, but they're definitely moving the ball forward. In particular, their release of the training logs is a really positive development IMHO: it pulls back the curtain a bit on the reality of training these things, which is a difficult, failure-prone process with a lot of trial and error, restarts from checkpoints, and so on.
Anything that puts downward pressure on the magical thinking is A-OK in my book. The reality of this stuff is exciting/impressive enough: there's no need to embellish or exaggerate.
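To make the "restarts from checkpoints" point concrete, here's a minimal sketch of that pattern, just generic PyTorch, not the metaseq code; the directory, file names, and helper are all made up for illustration. The loop periodically saves model/optimizer/step state and, on startup, resumes from the newest checkpoint it can find:

    import os
    import glob
    import torch

    CKPT_DIR = "checkpoints"  # hypothetical path, not from metaseq

    def latest_checkpoint():
        # Find the most recently written checkpoint, if any.
        ckpts = glob.glob(os.path.join(CKPT_DIR, "step_*.pt"))
        return max(ckpts, key=os.path.getmtime) if ckpts else None

    def train(model, optimizer, data_loader, total_steps, save_every=1000):
        step = 0
        ckpt = latest_checkpoint()
        if ckpt is not None:
            # Resume: restore weights, optimizer state, and step counter.
            state = torch.load(ckpt, map_location="cpu")
            model.load_state_dict(state["model"])
            optimizer.load_state_dict(state["optimizer"])
            step = state["step"]
        for batch in data_loader:
            if step >= total_steps:
                break
            loss = model(batch).mean()  # placeholder loss
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            step += 1
            if step % save_every == 0:
                os.makedirs(CKPT_DIR, exist_ok=True)
                torch.save(
                    {"model": model.state_dict(),
                     "optimizer": optimizer.state_dict(),
                     "step": step},
                    os.path.join(CKPT_DIR, f"step_{step}.pt"),
                )

Real runs at that scale also checkpoint RNG state and the data loader position, and the logs show how much manual babysitting sits on top of even that.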
[1] https://www.youtube.com/watch?v=YQ2QtKcK2dA

[2] https://github.com/yandex/YaLM-100B

[3] https://github.com/facebookresearch/metaseq
https://github.com/facebookresearch/metaseq
The logbook links specifically: https://github.com/facebookresearch/metaseq/blob/main/projec...
GPT-3 Davinci ("the" GPT-3) is 175B.
The repository will be open "First thing in AM" (https://twitter.com/stephenroller/status/1521302841276645376).