Coincidentally, this comes just after MosaicML released the best open-source, commercially usable LLM on Hugging Face: MPT-30B. It is the first open-source LLM with an 8k context length, which can be extended even further thanks to ALiBi, and it was trained on a whopping 1 trillion tokens, versus 300 billion for Pythia and OpenLLaMA and 800 billion for StableLM.
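Since ALiBi is what makes the context extension possible, here is a minimal sketch (my own illustration, not MosaicML's implementation; the function name `alibi_bias` is made up for this example) of how ALiBi's linear attention biases work: the penalty depends only on the query-key distance, so nothing in the model is tied to a fixed maximum position, and longer contexts just produce larger penalties.

```python
import torch

def alibi_bias(n_heads: int, seq_len: int) -> torch.Tensor:
    """Build an ALiBi bias tensor of shape (n_heads, seq_len, seq_len).

    Each head h gets a slope m_h = 2**(-8 * (h + 1) / n_heads); the bias added
    to the attention score for query position q attending to key position k is
    -m_h * (q - k). Because the penalty depends only on relative distance,
    the same formula works for any seq_len, which is what lets ALiBi models
    run on contexts longer than the one they were trained on.
    """
    # Slopes follow the geometric sequence from the ALiBi paper
    # (exact for head counts that are powers of two).
    slopes = torch.tensor([2 ** (-8 * (h + 1) / n_heads) for h in range(n_heads)])

    # Relative distance q - k for the causal (lower-triangular) part.
    pos = torch.arange(seq_len)
    distance = (pos[:, None] - pos[None, :]).clamp(min=0)  # (seq_len, seq_len)

    # Broadcast slopes over the distance matrix: (n_heads, seq_len, seq_len).
    return -slopes[:, None, None] * distance


# The bias is simply added to the raw attention scores before softmax, e.g.:
# scores = q @ k.transpose(-1, -2) / head_dim**0.5 + alibi_bias(n_heads, seq_len)
```

There are no learned position embeddings to run out of, so inference at, say, 16k tokens only requires building a larger bias matrix, at the cost of attention quality degrading gradually rather than breaking outright.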
OpenLLaMA models up to 13B parameters have now been trained on 1T tokens: