What does HackerNews think of gpt-neox?
An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
Btw I asked chat.langchain.dev and it said:
> LangChain uses pre-trained models from Hugging Face, such as BERT, GPT-2, and XLNet. For more information, please see the Getting Started Documentation[0].
That links to a 404, but I did find the correct link[1]. Oddly, that doc only mentions an OpenAI API wrapper; I couldn’t find anything about the other Hugging Face models.
Does LangChain have any tooling around fine-tuning pre-trained LLMs like GPT-NeoX[2]?
[0]https://langchain.readthedocs.io/en/latest/getting_started.h...
[1]https://langchain.readthedocs.io/en/latest/getting_started/g...
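For context, here is a minimal sketch of wiring a Hugging Face model into LangChain, assuming its HuggingFacePipeline wrapper around a local transformers pipeline (the wrapper name and import path have moved between LangChain versions, so treat this as illustrative rather than canonical):

    # Sketch only: LangChain's HuggingFacePipeline wrapper around a local
    # transformers text-generation pipeline. Import paths vary by version.
    from transformers import pipeline               # pip install transformers
    from langchain.llms import HuggingFacePipeline  # pip install langchain

    # Small EleutherAI checkpoint as a stand-in; the thread's GPT-NeoX-20B
    # ("EleutherAI/gpt-neox-20b") exposes the same API but needs ~40 GB of weights.
    hf_pipe = pipeline(
        "text-generation",
        model="EleutherAI/gpt-neo-125M",
        max_new_tokens=64,
    )

    llm = HuggingFacePipeline(pipeline=hf_pipe)
    print(llm("GPT-NeoX is"))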
Related system/hardware requirements:
- https://nlpcloud.com/deploying-gpt-neox-20-production-focus-...
Benchmarks of GPT-NeoX-20B vs GPT-3 DaVinci:
- https://the-decoder.com/gpt-3-alternative-eleutherai-release...
Project page, GitHub repo, and paper:
- https://www.eleuther.ai/projects/gpt-neox/
- https://github.com/EleutherAI/gpt-neox
- https://arxiv.org/abs/2204.06745
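As a rough sanity check on those hardware requirements, the weight memory alone is just parameter count times bytes per parameter; a back-of-the-envelope sketch (ignoring activations, KV cache, and optimizer state, which come on top):

    # Weight-only memory estimate for a 20B-parameter model at common precisions.
    PARAMS = 20e9  # GPT-NeoX-20B

    for name, bytes_per_param in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1)]:
        gib = PARAMS * bytes_per_param / 1024**3
        print(f"{name:>9}: ~{gib:.0f} GiB of weights")

    # fp16 lands around 37 GiB, which is why deployment guides talk about
    # 40 GB+ GPUs (e.g. an A100) or sharding the model across smaller cards.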
They have a 20B parameter model. I think the primary dataset for these open models is The Pile: https://arxiv.org/abs/2101.00027 (web scrape, pubmed, arxiv, github, wikipedia, etc. There is a nice diagram on page 2 that summarizes the contents.)
https://github.com/EleutherAI/gpt-neox
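If you want to poke at The Pile itself, a hypothetical sketch using the datasets library; the Hub id below is a placeholder assumption, since mirrors of the dataset have come and gone over time:

    # Hypothetical: stream a few records from a Pile mirror without downloading
    # the full dataset. The dataset id is an assumption; substitute your own copy.
    from datasets import load_dataset  # pip install datasets

    pile = load_dataset("EleutherAI/pile", split="train", streaming=True)
    for i, record in enumerate(pile):
        print(record["text"][:200])  # each record is raw text plus source metadata
        if i == 2:
            break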
To manage your expectations: it is nowhere near as good as ChatGPT.
If you are only interested in programming, https://github.com/salesforce/CodeGen is decent.
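For reference, the CodeGen checkpoints are published on the Hugging Face Hub and can be tried with plain transformers; a small sketch using the 350M Python-only variant (the 2B/6B/16B variants follow the same naming scheme):

    # Try Salesforce CodeGen via transformers; "mono" checkpoints are the
    # Python-only fine-tunes, "multi" cover several languages.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    checkpoint = "Salesforce/codegen-350M-mono"
    tok = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForCausalLM.from_pretrained(checkpoint)

    prompt = "def fibonacci(n):"
    inputs = tok(prompt, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=48)
    print(tok.decode(out[0]))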
Suggest looking into GPT-NeoX and GPT-J instead.
I suspect they have run into training issues, since they are moving to a new repo[2].
[1] https://twitter.com/arankomatsuzaki/status/13737326468119674...