I don't know the licensing and all that jazz (though if you self-host for personal use it shouldn't matter). But this paper[0], released a week ago, claims "99.3% of the performance level of ChatGPT while only requiring 24 hours of finetuning on a single GPU" (QLoRA).

A quick test of the Hugging Face demo gives reasonable results[1]. The actual model behind the Space is here[2], and it should be self-hostable with reasonable effort.

[0] https://arxiv.org/abs/2305.14314

[1] https://huggingface.co/spaces/uwnlp/guanaco-playground-tgi

[2] https://huggingface.co/timdettmers/guanaco-33b-merged
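
For anyone who wants to try the self-hosting route, here is a rough sketch of loading the merged checkpoint with the transformers library. Treat it as a starting point under assumptions, not a tested recipe: the "### Human / ### Assistant" template in `build_prompt` is my guess at the expected format (check the model card), `build_prompt` and `generate` are names I made up, and loading in fp16 with `device_map="auto"` assumes you have the accelerate package and enough GPU memory for a 33B model.

```python
MODEL_ID = "timdettmers/guanaco-33b-merged"  # the repository linked as [2]


def build_prompt(user_message: str) -> str:
    """Wrap a message in a '### Human / ### Assistant' template.

    Assumption: this template is illustrative; check the model card for
    the exact format the checkpoint expects.
    """
    return f"### Human: {user_message.strip()}\n### Assistant:"


def generate(user_message: str, max_new_tokens: int = 128) -> str:
    """Load the model and return its reply to a single message."""
    # Imported lazily so build_prompt works without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,  # half precision to halve memory vs fp32
        device_map="auto",          # needs the accelerate package installed
    )
    inputs = tokenizer(build_prompt(user_message), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the newly generated reply is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("Explain QLoRA in two sentences."))
```

Reloading the model on every call is wasteful; for anything beyond a one-off test you would load it once and reuse it.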

In my tests, Guanaco is indeed very capable and can replace GPT-3.5 in almost all scenarios.

An easy way to self-host it is to use text-generation-webui[1] with the 33B 4-bit quantized GGML model from TheBloke[2].

[1] https://github.com/oobabooga/text-generation-webui

[2] https://huggingface.co/TheBloke/guanaco-33B-GGML
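
To see why the 4-bit quantization matters for self-hosting, here is a back-of-the-envelope estimate of the memory needed for the weights alone. It assumes a nominal 33 billion parameters and exactly 4 bits per weight; real quantized files are somewhat larger because they also store scaling factors, and the estimate ignores the context cache and runtime overhead. The function name `weight_memory_gb` is just for illustration.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in gigabytes (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 33e9  # nominal parameter count for a "33B" model (assumption)

fp16_gb = weight_memory_gb(N_PARAMS, 16)  # 16-bit weights
q4_gb = weight_memory_gb(N_PARAMS, 4)     # 4-bit quantized weights

print(f"fp16:  ~{fp16_gb:.1f} GB")
print(f"4-bit: ~{q4_gb:.1f} GB")
```

Under these assumptions the 4-bit weights take about a quarter of the fp16 footprint, roughly 16.5 GB versus 66 GB.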