Your commands assume the model is a .bin file (so I guess there must be a way to convert the PyTorch .pth model to the .bin file). How can I do this, and what is the difference between the two formats?
The facebook repo provides commands for using the models, but they don't work on my Windows machine: "NOTE: Redirects are currently not supported in Windows or MacOs. [W ..\torch\csrc\distributed\c10d\socket.cpp:601] [c10d] The client socket has failed to connect to ...."
The facebook repo does not describe which OS you are supposed to use, so I assumed it would work on Windows too. But if it did, why would anyone need ggerganov's llama.cpp code? I am new to all of this and easily confused, so any help is appreciated.
Direct link to request access form: https://ai.meta.com/resources/models-and-libraries/llama-dow...
Direct link to request access on Hugging Face (use the same email): https://huggingface.co/meta-llama/Llama-2-70b-chat-hf
Direct link to repo: https://github.com/facebookresearch/llama
Once you get a download link by email, make sure to copy it without spaces; one option is to open it in a new tab and download from there. If you are using fish or another fancy shell, switch to bash or sh before running download.sh from the repo.
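If the copied link keeps failing, it's worth checking for the stray-whitespace problem before blaming the script. A quick way to sanitize it (this helper is my own, not something from the repo):

```python
# hypothetical helper: strip whitespace/newlines that a copied
# download link may pick up from an email client
def clean_url(pasted: str) -> str:
    return "".join(pasted.split())

# paste the cleaned result when download.sh prompts for the URL
print(clean_url("https://example.com/llama-\n  weights?token=abc"))
```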
I am not sure exactly how much space is needed, but it is likely north of 500GB given that there are two 70B models (a prompt gives you the option to download just the smaller ones).
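For a rough sanity check on that figure: the released checkpoints are fp16, so a back-of-envelope estimate is 2 bytes per parameter (this ignores tokenizer files and any extra variants you opt into):

```python
# fp16 checkpoints store 2 bytes per parameter
def fp16_gb(params_billion: float) -> float:
    return params_billion * 1e9 * 2 / 1e9  # bytes -> GB

# each 70B checkpoint alone is on the order of 140 GB,
# so two of them plus the smaller models adds up fast
print(fp16_gb(70))                                  # 140.0
print(fp16_gb(7) + fp16_gb(13) + 2 * fp16_gb(70))   # 320.0
```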
Edit: The_Bloke on HF already has them in GGML format available for download.
https://huggingface.co/TheBloke/Llama-2-7B-GGML
https://huggingface.co/TheBloke/Llama-2-13B-GGML
https://github.com/facebookresearch/llama links to
https://forms.gle/jk851eBVbX1m5TAv5, which contains the LLaMA license agreement below the form.
The training code is Apache 2.0 licensed, so it can be copied and modified freely, including for commercial purposes. https://github.com/facebookresearch/llama
The Python + GPU one can be found in the official facebook repo: https://github.com/facebookresearch/llama (Presumably GP thought this was already known to everyone, so they pasted the other link.)
https://github.com/facebookresearch/llama
I already got the 7B model to generate text using my GPU! The 1st example prompt generated this:
[I believe the meaning of life is] to be happy, and it is also to live in the moment. I think that is the most important thing. I'm not really a party girl. I'm not a girl's girl. I have a really small group of close girlfriends and that's all I need. I believe in equal rights for everyone. I'm not a rebel. I don't really rebel against anything. I'm a very traditional girl, very loyal. I'm a mum's girl and I'm a dad's girl. People have a right to know what's going on. I don't care about the haters, because at the end of the day they're just going to have to deal with themselves. I've been getting more and more into fashion since I was about 16. I know I'm a little different, but so what? I think that's good. I don't think you should be like everyone else. It's my birthday, and I'll cry if I want to. I've always been a huge fan of fashion, and I've always liked to dress up
it's sharded across all 4 GPUs (as per the readme here: https://github.com/facebookresearch/llama). I'd wait a few weeks to a month for people to settle on a solution for running the model; right now people are just throwing PyTorch code at the wall and seeing what sticks.
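To illustrate what "sharded" means here, a toy sketch of tensor parallelism in plain Python (this shows the general idea only, not the repo's actual fairscale model-parallel code): each rank holds a column slice of a weight matrix, computes its partial output independently, and a gather step reassembles the full result.

```python
# toy tensor parallelism: split a 4x4 weight matrix column-wise across 2 "ranks"
W = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
x = [1, 1, 1, 1]

def matvec_cols(x, W, cols):
    # a rank computes only the output columns it was assigned
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in cols]

rank0 = matvec_cols(x, W, [0, 1])   # columns held by rank 0
rank1 = matvec_cols(x, W, [2, 3])   # columns held by rank 1
full = rank0 + rank1                # the "all-gather" recovers the full x @ W
print(full)                         # [28, 32, 36, 40]
```

With real checkpoints, each shard file holds one rank's slice of every weight, which is why the number of consolidated.*.pth files matches the GPU count.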
Github [2]
[1] https://research.facebook.com/file/1574548786327032/LLaMA--O...
The closest you are going to get to the source is here: https://github.com/facebookresearch/llama
It is still unclear whether you will even get access to the entire model as open source. Even if you did, you couldn't use it for your commercial product anyway.