Very cool! Could you add to the README how to install Llama 2 on the same machine?
How do you plan on deploying llama 2? Is that via ollama/fastchat/etc.? We're actively building out integrations to different providers, so if you have any preferred tooling - let me know!
No idea yet. Which do you recommend to start with? It will be hosted on an Ubuntu server (DigitalOcean, Linode, etc.).
There are a lot of good model-deployment platforms that would make it easy to call your model behind a hosted endpoint.
If you do want to self-host, there are some great libraries like https://github.com/lm-sys/FastChat and https://github.com/ggerganov/llama.cpp that might be helpful.
If none of these really solve your issue - feel free to email me and I'm happy to help you figure something out - [email protected]
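For reference, one of the simplest self-host paths on an Ubuntu box is Ollama, which the maintainer mentioned above. A minimal sketch (the install script URL and API port are Ollama's documented defaults; the model tag `llama2` pulls the 7B chat variant by default):

```shell
# Install Ollama via its official install script (assumes curl is available)
curl -fsSL https://ollama.com/install.sh | sh

# Pull and run Llama 2 interactively (downloads the model on first run)
ollama run llama2

# Ollama also exposes a local HTTP API on port 11434, so your app can call it:
curl http://localhost:11434/api/generate \
  -d '{"model": "llama2", "prompt": "Hello"}'
```

Building https://github.com/ggerganov/llama.cpp from source and running its CLI against a quantized model file is the lower-level alternative if you want more control over quantization and hardware usage.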