What does HackerNews think of open_llama?
OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
> For current version of OpenLLaMA models, our tokenizer is trained to merge multiple empty spaces into one before tokenization, similar to T5 tokenizer. Because of this, our tokenizer will not work with code generation tasks (e.g. HumanEval) since code involves many empty spaces. We are planning to open source long context models trained on more code data. Stay tuned.
Sounds like they're working on it.
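To make the whitespace caveat concrete, here's a rough sketch of what goes wrong for code. It assumes the `openlm-research/open_llama_7b` checkpoint id and the standard `transformers` LlamaTokenizer; swap in whichever OpenLLaMA preview checkpoint you actually pull.

```python
# Sketch of the whitespace issue quoted above. The Hugging Face repo id is an
# assumption -- substitute the OpenLLaMA checkpoint you are actually using.
from transformers import LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("openlm-research/open_llama_7b")

code_snippet = "def add(a, b):\n    return a + b"  # indentation = 4 spaces

token_ids = tokenizer(code_snippet)["input_ids"]
round_tripped = tokenizer.decode(token_ids, skip_special_tokens=True)

print(repr(code_snippet))
print(repr(round_tripped))
# If the tokenizer merges runs of spaces, the 4-space indentation does not
# survive the round trip -- which is why HumanEval-style code tasks break.
```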
> Update 05/22/2023
> We are happy to release our 700B token checkpoint for the OpenLLaMA 7B model and 600B token checkpoint for the 3B model. We’ve also updated the evaluation results. We expect the full 1T token training run to finish at the end of this week.
https://github.com/openlm-research/open_llama
So we could develop on LLaMA for now and switch to OpenLLaMA later.
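Since OpenLLaMA uses the same architecture, the swap is basically a checkpoint path change in `transformers`. A minimal sketch, assuming converted LLaMA weights at a local path and the `openlm-research/open_llama_7b` repo id (both placeholders):

```python
# "Develop now, swap later": both models share the LLaMA architecture, so only
# the checkpoint path changes. The ids below are assumptions/placeholders.
from transformers import LlamaForCausalLM, LlamaTokenizer

MODEL_ID = "path/to/llama-7b"                 # today: converted Meta LLaMA weights
# MODEL_ID = "openlm-research/open_llama_7b"  # later: drop-in OpenLLaMA checkpoint

tokenizer = LlamaTokenizer.from_pretrained(MODEL_ID)
model = LlamaForCausalLM.from_pretrained(MODEL_ID)

inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```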
Feedback for /u/bayes-song - it'd be great to have more info on the model card on HF - right now the parameter count and the total number of tokens you're planning to train on (and how many you've trained on so far) are unclear. An Evaluation section (maybe using lm-evaluation-harness) might be good as well?
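For reference, running lm-evaluation-harness over a checkpoint looks roughly like the sketch below. This assumes the ~2023-era harness API (`evaluator.simple_evaluate` with the `hf-causal` model type); newer releases renamed things, and the checkpoint id and task list are just placeholders.

```python
# Hedged sketch of evaluating an OpenLLaMA checkpoint with lm-evaluation-harness.
# Model id, tasks, and batch size are assumptions -- adjust to your setup.
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="hf-causal",
    model_args="pretrained=openlm-research/open_llama_7b",
    tasks=["hellaswag", "arc_easy"],
    num_fewshot=0,
    batch_size=8,
)
print(results["results"])  # per-task metrics, suitable for a model-card table
```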