I have a question for the generative AI experts here.

So, I can use something like GPT-4 to label data and then use that as a training set for my own LLM, right?
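To make the idea concrete, here is a rough sketch of that labeling workflow. Everything in it is an assumption for illustration: `build_training_set`, `to_jsonl`, and `toy_labeler` are made-up names, and `toy_labeler` is only a stand-in for wherever the real call to the labeling model would go. The prompt/completion record layout is just one common choice, not a required format.

```python
import json
from typing import Callable, Iterable


def build_training_set(texts: Iterable[str], label_fn: Callable[[str], str]) -> list[dict]:
    """Label each raw text with label_fn and return prompt/completion records."""
    records = []
    for text in texts:
        label = label_fn(text).strip()
        if not label:
            # Skip examples for which the labeler returned nothing usable.
            continue
        records.append({"prompt": text, "completion": label})
    return records


def to_jsonl(records: list[dict]) -> str:
    """Serialize records as JSON Lines, one record per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


def toy_labeler(text: str) -> str:
    # Hypothetical stand-in for the call to the labeling model (e.g. GPT-4).
    return "positive" if "good" in text.lower() else "negative"


if __name__ == "__main__":
    data = build_training_set(["Good movie", "Terrible plot"], toy_labeler)
    print(to_jsonl(data))
```

Passing the labeler in as a function keeps the dataset-building code independent of whichever model produces the labels.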

EDIT: adding this from the Restrictions section of OpenAI's ToS: "(iii) use output from the Services to develop models that compete with OpenAI;"

That is against their ToS, though, if you use your new LLM commercially.

So what are they going to do about it?

Great question! I don't know what the endgame is there. Maybe if they suspected their model was used, they would sue and then find out in discovery that you used their model for training?

Maybe we don't need to worry: OpenLLaMA is being trained right now. It is a permissively licensed open reproduction of LLaMA, so it will be the version you can use commercially.

> Update 05/22/2023

> We are happy to release our 700B token checkpoint for the OpenLLaMA 7B model and 600B token checkpoint for the 3B model. We’ve also updated the evaluation results. We expect the full 1T token training run to finish at the end of this week.

https://github.com/openlm-research/open_llama

So we could develop on LLaMA for now and switch to OpenLLaMA later.
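One way to keep that switch cheap is to reference the base model in exactly one place. A minimal sketch, assuming the choice is read from a `BASE_MODEL` environment variable; the variable name, `TrainConfig`, `load_config`, and the `path/to/...` values are all placeholders I made up, so substitute your actual checkpoint locations.

```python
import os
from dataclasses import dataclass

# Placeholder identifiers: replace with real local paths or hub names.
BASE_MODELS = {
    "llama": "path/to/llama-7b",
    "open_llama": "path/to/open_llama_7b",
}


@dataclass(frozen=True)
class TrainConfig:
    base_model: str


def load_config(env=None) -> TrainConfig:
    """Resolve the base model from BASE_MODEL, defaulting to LLaMA."""
    env = os.environ if env is None else env
    key = env.get("BASE_MODEL", "llama")
    if key not in BASE_MODELS:
        raise ValueError(
            f"unknown base model {key!r}; expected one of {sorted(BASE_MODELS)}"
        )
    return TrainConfig(base_model=BASE_MODELS[key])
```

With this in place, moving over is a matter of setting `BASE_MODEL=open_llama`. If the two models turn out to use different tokenizers, any cached tokenized data would probably need to be regenerated as well.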