The cog template is just starter code that makes it super simple to deploy llama-v2 on any infrastructure of your choosing!

More about cog: https://github.com/replicate/cog

Our thinking was just that a bunch of folks will want to fine-tune right away and then deploy the fine-tunes, so we're trying to make that easy. Others may just want to deploy the models as-is on their own infra without dealing with CUDA insanity!