What do you use to host these models (like Vicuna, Dolly, etc.) on your own server and expose them through an HTTP REST API? Is there a Heroku-like service for LLMs?

I am looking for an open-source model to do text summarization. OpenAI is too expensive for my use case because I need to pass lots of tokens.