What does HackerNews think of petals?

🌸 Run 100B+ language models at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading

Language: Python

There's Petals[0], but the problem seems to be that the entire training data needs to be loaded into VRAM and can't be split up across devices.

[0] https://github.com/bigscience-workshop/petals

A good opportunity to try the free and fully open-source BigScience Petals chat: https://chat.petals.dev/ ... Try out Stable Beluga 2 70B there.

I am currently running my 3090 GPU on there to help out; you can check the state of the swarm at https://health.petals.dev/

If you have a spare GPU, consider contributing: https://github.com/bigscience-workshop/petals . I am not associated with them.
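
For anyone wondering what "contributing" actually involves: it amounts to installing the petals package and starting its server module on the machine with the spare GPU, which then serves a slice of the model's layers to the swarm. A hedged sketch follows (the module path and model name are my reading of the README and may have changed; it's written via subprocess to keep these snippets in Python, but in practice you'd just run the CLI directly):

```python
# What contributing a GPU boils down to: running the Petals server so it can
# host a slice of the model's transformer blocks for the swarm.
# In practice this is just:  python -m petals.cli.run_server <model>
# The model name below is an assumption; use whatever the swarm currently needs.
import subprocess
import sys

MODEL = "petals-team/StableBeluga2"

subprocess.run(
    [sys.executable, "-m", "petals.cli.run_server", MODEL],
    check=True,  # raises if the server exits with an error
)
```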

If you have a lot of money (but not H100/A100 money), get 4090s, as they're currently the best bang for your buck on the CUDA side (according to George Hotz). If broke, get multiple second-hand 3090s. https://timdettmers.com/2023/01/30/which-gpu-for-deep-learni.... If unwilling to spend any money at all and just want to play around with Llama 70B, look into Petals: https://github.com/bigscience-workshop/petals
Could this work well with distributed solutions like petals?

https://github.com/bigscience-workshop/petals

I don't understand how petals can work though. I thought LLMs were typically quite monolithic.

I read about Petals (1) some time ago here on HN. There are surely others too, but I don't remember the names.

1. https://github.com/bigscience-workshop/petals

Yes, there is Petals/BLOOM https://github.com/bigscience-workshop/petals but it's not so great. Maybe it will improve, or a better one will come along.
My understanding is that it can work for model inference but not for model training.

https://github.com/bigscience-workshop/petals is a project that does this kind of thing for running inference - I tried it out in Google Colab and it seemed to work pretty well.
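
For anyone curious what that Colab experiment looks like on the client side, it's only a few lines. This is a rough sketch based on the Petals README; the exact class and model names have shifted between versions, so treat them as assumptions:

```python
# Rough sketch of client-side inference over the public Petals swarm.
# Class/model names follow the README at the time of writing; verify before use.
from transformers import AutoTokenizer
from petals import AutoDistributedModelForCausalLM

model_name = "petals-team/StableBeluga2"  # any model the swarm is currently serving
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoDistributedModelForCausalLM.from_pretrained(model_name)

# Embeddings and the LM head run locally; the transformer blocks are executed
# by remote peers, each of which holds only a slice of the model.
inputs = tokenizer("Distributed inference feels like", return_tensors="pt")["input_ids"]
outputs = model.generate(inputs, max_new_tokens=16)
print(tokenizer.decode(outputs[0]))
```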

Model training is much harder though, because it requires a HUGE amount of high-bandwidth data exchange between the machines doing the training - way more than is feasible to send over anything other than a local network connection.
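
To put a rough number on the bandwidth point (my own back-of-the-envelope math, assuming naive data-parallel training with fp16 gradients and no compression):

```python
# Back-of-the-envelope cost of synchronizing gradients for a BLOOM-sized model.
# Assumptions: 176B parameters, fp16 gradients, no compression, naive all-reduce.
params = 176e9
bytes_per_gradient = 2  # fp16
sync_bytes = params * bytes_per_gradient
print(f"~{sync_bytes / 1e9:.0f} GB of gradients per optimizer step")  # ~352 GB

home_uplink = 100e6 / 8  # a 100 Mbit/s home uplink, in bytes per second
print(f"~{sync_bytes / home_uplink / 3600:.0f} hours just to ship one step's gradients")
```

Which is why schemes for training over the internet lean on model/pipeline parallelism and aggressive compression rather than naive gradient exchange.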

Right, so sort of like https://github.com/bigscience-workshop/petals but for the training phase. I suppose different training runs could be proposed via an RFC type of procedure. Then it’s not only the open-source model maintainers who put in the effort; supporters of the project can also “donate” their hardware resources.
The BigScience team (a working group of researchers who trained the BLOOM-176B LLM last year) released Petals [0][1], which allows distributed inference and fine-tuning of BLOOM, with the option to pick a custom model + private swarm. SWARM [2][3] is a WIP from Yandex and UW that shares some of the same codebase, but is for distributed training.

[0] https://petals.ml/
[1] https://github.com/bigscience-workshop/petals
[2] https://github.com/yandex-research/swarm
[3] https://twitter.com/m_ryabinin/status/1625175933492641814
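
On the "custom model + private swarm" option: as far as I understand it, the client and servers are simply pointed at your own DHT bootstrap peers instead of the public ones. A hedged sketch (the initial_peers argument and the multiaddr format are my recollection of the docs, so double-check them):

```python
# Hypothetical private-swarm client: same API as the public swarm, but the DHT
# is bootstrapped from your own peers. The multiaddr below is a placeholder.
from petals import AutoDistributedModelForCausalLM

MY_INITIAL_PEERS = [
    "/ip4/10.0.0.2/tcp/31337/p2p/QmExamplePeerID",  # placeholder, not a real peer
]

model = AutoDistributedModelForCausalLM.from_pretrained(
    "bigscience/bloom",              # whichever custom model your servers are hosting
    initial_peers=MY_INITIAL_PEERS,  # connect to the private swarm instead of the public one
)
```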

Hey look, a decentralized chatbot running BLOOMZ-176B (an open source LLM about the size of GPT-3)

http://chat.petals.ml

I'm contributing to the project by running a node in my garage with a single RTX 3060 Ti in it, and you can too: https://github.com/bigscience-workshop/petals

It's early days, but the tech is super promising.

Actually, there is the Petals [0] project, which does that (I think it's more for inference and fine-tuning, not full training) and might be the best current approach for a "self-hosted LLM", though it defeats the point of privacy/anonymity, since everyone in the distributed swarm has access to your data.

[0] https://github.com/bigscience-workshop/petals

Would it be possible to integrate RLHF with BLOOM Petals [0]? Petals lets you run the huge BLOOM model (175B params) with a distributed swarm of machines. IIRC it already supports fine-tuning BLOOM, so... maybe?

[0] https://github.com/bigscience-workshop/petals
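
For context on the fine-tuning part: as far as I know it's parameter-efficient tuning, where only small, locally-held trainable prompts get gradient updates while the frozen remote blocks just run forward/backward passes. A rough sketch is below; the tuning_mode/pre_seq_len kwargs and the model name come from the BLOOM-era README and may have changed, and full RLHF would still need a reward model and a policy-optimization loop on top of this:

```python
# Rough sketch of parameter-efficient fine-tuning (prompt tuning) over Petals.
# Only a small set of local prompt embeddings is trained; remote blocks stay frozen.
# kwargs below follow the BLOOM-era README and should be verified against the repo.
import torch
from transformers import AutoTokenizer
from petals import AutoDistributedModelForCausalLM

model_name = "bigscience/bloom-petals"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoDistributedModelForCausalLM.from_pretrained(
    model_name, tuning_mode="ptune", pre_seq_len=16)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)  # only the prompts require grad

batch = tokenizer("Example fine-tuning text", return_tensors="pt")
outputs = model(input_ids=batch["input_ids"], labels=batch["input_ids"])
outputs.loss.backward()  # the backward pass is routed through the remote blocks
optimizer.step()
optimizer.zero_grad()
```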

https://github.com/bigscience-workshop/petals

Since my other account is shadow-banned for some unexplained reason, I just wanted to mention the Petals project here. It's an attempt to distribute the load of running these large models, BitTorrent-style. Good luck!