Coincidentally, this comes after MosaicML released the best open-source, commercially usable LLMs on Hugging Face: MPT-30B, the first open-source LLM with an 8k context length, which can be extended even further thanks to ALiBi. It was trained on a whopping 1 trillion tokens, versus 300 billion for Pythia and OpenLLaMA, and 800 billion for StableLM.
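
Because ALiBi uses linear attention biases rather than learned positional embeddings, the context window can be stretched at load time. Here is a minimal sketch, assuming the Hugging Face `transformers` API and the `max_seq_len` field exposed by the `mosaicml/mpt-30b` config; the 16384 value is just an illustration, not a recommended setting:

```python
# Sketch: load MPT-30B and raise max_seq_len beyond the 8k it was trained with.
# ALiBi lets the model extrapolate to longer sequences at inference time.
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

name = "mosaicml/mpt-30b"

config = AutoConfig.from_pretrained(name, trust_remote_code=True)
config.max_seq_len = 16384  # extend past the 8k training context (illustrative value)

tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name,
    config=config,
    trust_remote_code=True,  # MPT ships custom modeling code on the Hub
)
```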

Meanwhile, OpenLLaMA models of up to 13B parameters have now been trained on 1T tokens:

https://github.com/openlm-research/open_llama