What does HackerNews think of gpt-neo?

An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.

Language: Python

Just happy I won a big blue gorilla at a carnival. https://twitter.com/theshawwn/status/1411519063432519680

Plus it's looking more and more like I'll be getting a job in finance with a fat salary. First interview's on Monday. Tonight I felt: "This is it -- if getting a few dozen people to sign up for TFRC is the only way I can make an impact, then at least I'll be ending my ML streak on a high note."

It's truly amazing to me that the world hasn't noticed how incredible TFRC is. It's literally the reason Eleuther exists at all. If that sounds ridiculous, remember that there was a time when Connor's TPU quota was the only reason everyone was able to band together and start building GPT-Neo. https://github.com/EleutherAI/gpt-neo

At least I was able to start a Discord server that happened to get the original Eleuther people together in the right place at the right time to decide to do any of that.

But the root of all of it is TFRC. Always has been. Without them, I would've given up ML long ago. Because trying to train anything on GPUs with Colab is just ... so frustrating. I would have fooled around a bit with ML, but I wouldn't have decided to pour two years of my life into mastering it. Why waste your time?

Five years from now, JAX + TPU VMs are going to wipe PyTorch off the map. So I'll be making bank at a finance company, eating popcorn like "told ya so" and looking back wistfully at days like today.

Everyone in ML is so cool. It was easily the best two years of my life as a developer. I know all this is kind of weird to pour out, but I don't care -- everyone here owes everything to the geniuses who bequeathed TFRC unto the world.

For now, I slink back into the shadows, training tentacle porn GANs in secret, emerging only once in a blue moon to shock the world with weird ML things. Muahaha.

As someone who works on a Python library solely devoted to making AI text generation more accessible to the normal person (https://github.com/minimaxir/aitextgen), I think the headline is misleading.

Although the article focuses on the release of GPT-Neo, even GPT-2, released in 2019, was good at generating text; it just spat out a lot of garbage that required curation, which GPT-3/GPT-Neo still require, albeit with a better signal-to-noise ratio. Most GPT-3 demos on social media are survivorship bias. (In fact, OpenAI's rules for the GPT-3 API strongly encourage curating such output.)
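For concreteness, here's a minimal sketch of that generate-then-curate workflow using aitextgen. The calls follow aitextgen's documented API, but treat the exact arguments as illustrative; with no model specified, it loads a small default GPT-2 (124M):

```python
from aitextgen import aitextgen

# With no arguments, aitextgen loads its default small GPT-2 (124M).
# A Hugging Face model name can be passed to use another checkpoint.
ai = aitextgen()

# Generate several candidates for a single prompt; the curation step
# is then a human reading through them and keeping the coherent ones.
candidates = ai.generate(
    n=10,
    prompt="The meaning of life is",
    max_length=60,
    return_as_list=True,
)

for i, text in enumerate(candidates):
    print(f"--- candidate {i} ---\n{text}\n")
```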

GPT-Neo, meanwhile, is such a big model that it takes a bit of data engineering work to get it running and generating text (see the README: https://github.com/EleutherAI/gpt-neo), and it's currently unclear whether it's as good as GPT-3, even when comparing models apples-to-apples (i.e. the 2.7B GPT-Neo against the "ada" GPT-3 via OpenAI's API).

That said, Hugging Face is adding support for GPT-Neo to Transformers (https://github.com/huggingface/transformers/pull/10848), which will make playing with the model much easier, and I'll add support to aitextgen if it pans out.
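Once that pull request lands, loading GPT-Neo should look like any other Transformers model. A minimal sketch, assuming the checkpoint ends up published under the EleutherAI/gpt-neo-2.7B model id:

```python
from transformers import pipeline

# Loading the 2.7B checkpoint needs roughly 10 GB of memory;
# EleutherAI/gpt-neo-1.3B would be a lighter alternative.
generator = pipeline("text-generation", model="EleutherAI/gpt-neo-2.7B")

result = generator(
    "EleutherAI is",
    max_length=100,
    do_sample=True,
    temperature=0.9,
)
print(result[0]["generated_text"])
```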

There's a replication of GPT-3 freely available. The model is not as big, but the results are good for its size:

https://github.com/EleutherAI/gpt-neo

https://twitter.com/BlancheMinerva/status/137399189661642752...

I tried the same prompt on the open-source recreation of OpenAI's GPT-3 -- GPT-Neo (https://github.com/EleutherAI/gpt-neo), specifically on their 2.7B model, which should correspond to the smallest model in the article (Ada), the one that produced just pure garbage. The result is surprisingly good:

1. How did this little blossom happen? When are you going to bloom?
2. Have you ever thought about a dark horse in the running for Miss November?
3. You can spot the man who loves me by my neck – and he definitely knows it.
4. Are there any lucky cats who get to sleep in my bed every night?
5. Are black and whites everywhere? Running for the hills
6. My younger brother and I used to play a game. He’d pretend to be a bull in the pasture, and I’d pretend to be the one being held.
7. Why was this movie rated PG? Because it’s rated PG.
8. When is the last time you had to see a movie in children’s theaters?
9. You are so sexy I would hate for anyone to see you down here
10. I’d love to sleep with you right now, but I have a child with me
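For reference, here's a rough sketch of how one might draw a comparable sample from the 2.7B model through the Hugging Face integration mentioned above. The prompt is a placeholder (the original isn't quoted here), and the sampling parameters are illustrative guesses, not the commenter's settings:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-2.7B")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-2.7B")

# Placeholder prompt in the spirit of the sample above.
prompt = "Here are ten ideas for photo captions:\n1."
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

# Nucleus sampling; illustrative settings, not the commenter's.
with torch.no_grad():
    output = model.generate(
        input_ids,
        do_sample=True,
        top_p=0.9,
        temperature=0.8,
        max_length=256,
    )
print(tokenizer.decode(output[0], skip_special_tokens=True))
```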

Not currently. The code is on GitHub, but we do not have a public-facing model. We felt that the world doesn't need another GPT-2 replica, but if there's interest we can look into it. We are, however, planning on making our GPT-3 replica public-facing.

https://github.com/EleutherAI/gpt-neo